Enormous data stores have become a fact of life for organizations of all types and
sizes. The ability to manipulate, transform, and derive benefit from that data, which
is often unstructured big data, is quickly becoming the norm, and the tools and techniques
for doing so are becoming more commonplace.
Hadoop is an open-source software framework written in Java* and based on Google’s
MapReduce* and distributed file system work. It supports distributed applications
by using clusters of servers to analyze very large bodies of data and transform
them into a form that those applications can use more readily. Hadoop is designed to
be deployed on commonly available, general-purpose infrastructure. Tasks for which the
framework is particularly well suited include indexing and sorting large data
sets, data mining, log analytics, and image manipulation.
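To make the MapReduce model concrete, the following is a minimal single-process
sketch of the map, shuffle, and reduce steps applied to a small log-analytics task.
It is plain Python and does not use Hadoop's API; the log format, the sample lines,
and the function names (map_phase, shuffle, reduce_phase, run_job) are all
assumptions made for illustration. On a real cluster, the framework would run the
map and reduce steps in parallel across many servers.

```python
from collections import defaultdict

# Hypothetical log format, assumed for this sketch: "METHOD PATH STATUS"
SAMPLE_LOG = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
]

def map_phase(line):
    """Map step: emit a (status_code, 1) pair for each well-formed log line."""
    parts = line.split()
    if len(parts) == 3:
        yield parts[2], 1

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)

def run_job(lines):
    """Run map, shuffle, and reduce in sequence within one process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = run_job(SAMPLE_LOG)
print(counts)
```

Because each map call looks at one line in isolation and each reduce call looks at
one key in isolation, the work can be split across machines without coordination
beyond the shuffle, which is what lets this style of job scale on general-purpose
servers.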
Colfax's ClusterEdge H2200x offers a complete turnkey solution for getting
started with Apache Hadoop. The ClusterEdge H2200x is a highly scalable, balanced,
and easy-to-deploy platform bundled with Intel® Distribution for Apache Hadoop*
software.