Investigating Hadoop distributions: Which is right for you?

Last updated: May 2016

Editor's note

Companies of all sizes can use Apache Hadoop to manage the massive volumes of structured, semi-structured and unstructured data being generated by such sources as social media, Internet of Things devices and mobile sensors. The Hadoop framework comprises several open source software components with a set of core modules that capture, process, manage and analyze big data.

Although developers can download Hadoop directly from the Apache website and build an environment on their own, the open source Hadoop framework is limited. Organizations that need more robust features, maintenance and support are turning to commercial Hadoop software distributions.

Vendors package their enterprise Hadoop distributions with different levels of support and with enhancements beyond the open source code base. Because the software is open source, you don't purchase a Hadoop distribution as a product, but rather as an annual support subscription.

In this buyer's guide, we outline the ways commercial Hadoop distributions can benefit your organization as well as the features these offerings provide. We also analyze the top Hadoop distributions, examining key characteristics of each, including the deployment model, data protection, security and support. To help you further narrow your search, we provide in-depth descriptions of the six leading subscriptions. 

1. Making a case for a Hadoop software distribution

To determine whether one of the commercial Hadoop distributions is right for your organization, you must first identify the applications you need to support.

2. Hadoop distributions offer value-added functionality

Expert David Loshin explores some value-added supplements to the code base and key features offered by commercial Hadoop distributions, including performance and functionality capabilities, maintenance, and support.

3. Which Hadoop distribution is right for my organization?

Learn what key characteristics must be considered as you evaluate the top Hadoop distributions.

4. The top Hadoop distributions

Here we provide an in-depth look at each of the six Hadoop distributions analyzed in the final article in this series. For each one, we examine its specific components, the platforms it is supported on, the vendor's service and support model, and the cost of the subscription.