Buyer's Guide

Evaluate: Weigh the pros and cons of technologies, products and projects you are considering.

Investigating Hadoop distributions: Which is right for you?

Here we examine six commercial Hadoop distributions that provide the core components of the Hadoop ecosystem stack, as well as additional functionality, including enhanced features and multiple levels of support.


Companies of all sizes can use Apache Hadoop to manage the massive volumes of structured, semi-structured and unstructured data being generated by such sources as social media, Internet of Things devices and mobile sensors. The Hadoop framework comprises several open source software components with a set of core modules that capture, process, manage and analyze big data.

Although developers can download Hadoop directly from the Apache website and build an environment on their own, the open source Hadoop framework is limited. Organizations that need more robust features, maintenance and support are turning to commercial Hadoop software distributions.

Vendors sell packages that bundle their enterprise Hadoop distributions with different levels of support, and they also offer enhanced commercial editions. Because the software is open source, you don't purchase a Hadoop distribution as a product; you pay for an annual support subscription.

In this buyer's guide, we outline the ways commercial Hadoop distributions can benefit your organization as well as the features these offerings provide. We also analyze the top Hadoop distributions, examining key characteristics of each, including the deployment model, data protection, security and support. To help you further narrow your search, we provide in-depth descriptions of the six leading subscriptions. 

1. What is?

What do Hadoop distributions offer?

Commercial Hadoop distributions that provide performance and functionality enhancements over the core open source framework are making Hadoop more attainable than ever for organizations.


Exploring Hadoop distributions for managing big data

Companies of all sizes can use Hadoop, as vendors sell packages that bundle Hadoop distributions with different levels of support, as well as enhanced commercial distributions.

2. Do I need?

Making a case for a Hadoop software distribution

To determine whether one of the commercial Hadoop distributions is right for your organization, first identify the applications you need to support.


How a Hadoop distribution can help you manage big data

To help you determine if a commercial Hadoop distribution could benefit your organization, consultant David Loshin examines big data use cases and applications that Hadoop can support.

3. How to buy

Hadoop distributions offer value-added functionality

Expert David Loshin explores some value-added supplements to the code base and key features offered by commercial Hadoop distributions, including performance and functionality capabilities, maintenance, and support.


What to consider when evaluating Hadoop vendors

Before you evaluate specific Hadoop software or subscriptions, examine what features the vendor distributions provide and how they match your big data management needs.

4. Which should I buy?

Which Hadoop distribution is right for my organization?

Learn what key characteristics must be considered as you evaluate the top Hadoop distributions.


Four factors for comparing the top Hadoop distributions

By examining the key characteristics presented here -- along with the top Hadoop distributions -- you can determine which subscription is right for your organization.

5. Top product overviews

The top Hadoop distributions

Here we provide an in-depth look at each of the six Hadoop distributions analyzed in the final article in this series. We examine the specific components, the platforms on which each distribution is supported, each vendor's service and support model, and the cost of these subscriptions.


A look at Amazon Elastic MapReduce cloud-based Hadoop

The Amazon Elastic MapReduce Web service offers a managed Hadoop framework that enables users to distribute and process big data across dynamically scalable Amazon EC2 instances.


Learn more about the Cloudera Hadoop distribution

Cloudera distribution including Apache Hadoop provides an analytics platform and the latest open source technologies to store, process, discover, model and serve large amounts of data.


Inside the Hortonworks open enterprise Hadoop distribution

The Hortonworks Data Platform consists entirely of projects built through the Apache Software Foundation and provides an open source environment for data collection, processing and analysis.


Inside the IBM BigInsights platform for big data management

The latest version of IBM BigInsights offers several value-add services that can be used with its core distribution of open source Hadoop for managing big data.


Inside the Microsoft Azure HDInsight cloud infrastructure

Azure HDInsight is a cloud implementation of Apache Hadoop that provides a software framework designed for processing, analyzing and reporting on big data.


Inside the MapR Hadoop distribution for managing big data

The MapR Hadoop distribution replaces HDFS with its proprietary file system, MapR-FS, which is designed to provide more efficient management of data, reliability and ease of use.
