Essential Guide

Using big data and Hadoop 2: New version enables new applications

Hadoop 2 increases the distributed processing framework's application flexibility and enhances its availability and scalability. Learn about its new features and what they mean for Hadoop users in this guide.


Early adopters of Apache Hadoop, including high-profile users such as Yahoo, Facebook and Google, had to rely on the partnership of the Hadoop Distributed File System (HDFS) and the MapReduce programming and resource management environment. Together, those technologies enabled users to process, manage and store large amounts of structured, unstructured and semi-structured data in Hadoop clusters.

But there were limitations inherent in the Hadoop-MapReduce pairing. For example, Yahoo and other users have said that the first generation of Hadoop technology could not keep pace with the deluge of information they're collecting online, because MapReduce processes data only in batches.

Hadoop 2, an upgrade released by the Apache Software Foundation in October 2013, offers performance improvements that can benefit related technologies in the Hadoop ecosystem, including the HBase database and Hive data warehouse. But the most notable addition in Hadoop 2 -- which originally was referred to as Hadoop 2.0 -- is YARN, a new component that takes over MapReduce's resource management and job scheduling duties.

YARN (short for Yet Another Resource Negotiator) enables users to deploy Hadoop systems without MapReduce. Running MapReduce applications is still an option, but other kinds of programs can now be run natively as well -- for example, real-time querying and streaming data applications.

The enhanced flexibility opens the door to broader uses for big data and Hadoop 2 implementations; in addition, YARN allows users to consolidate multiple Hadoop clusters into one system to lower costs and streamline management tasks. The upgrades in Hadoop 2 also boost cluster availability and scalability, two other issues that held back the first version of Hadoop.
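To make the division of labor concrete, here is a minimal Python sketch of the idea behind YARN: one cluster-wide component hands out capacity, and batch, streaming and query applications all ask it for a share instead of each running on a separate cluster. This is a toy model only, not YARN's actual API; the `ToyResourceManager` class, the "slots" unit and the application names are all hypothetical and were chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for a cluster-wide resource manager (not the YARN API)."""
    total_slots: int
    granted: Dict[str, int] = field(default_factory=dict)

    def free_slots(self) -> int:
        # Capacity not yet handed out to any application.
        return self.total_slots - sum(self.granted.values())

    def request(self, app: str, slots: int) -> int:
        # Grant as much of the request as the cluster can still supply.
        give = min(slots, self.free_slots())
        if give > 0:
            self.granted[app] = self.granted.get(app, 0) + give
        return give

    def release(self, app: str) -> int:
        # Return an application's capacity to the shared pool.
        return self.granted.pop(app, 0)


# Three different kinds of workloads share one ten-slot cluster.
rm = ToyResourceManager(total_slots=10)
batch = rm.request("mapreduce-batch", 6)
stream = rm.request("stream-processing", 3)
query = rm.request("interactive-query", 4)  # only one slot is left by now

print(batch, stream, query, rm.free_slots())  # prints "6 3 1 0"
```

The point of the sketch is that the manager knows nothing about what each application does; MapReduce is simply one tenant among several. A real scheduler would also queue or prioritize the request that could not be fully satisfied, which this toy version does not attempt.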

Even with the added capabilities, Hadoop 2 still has a long way to go in moving beyond the early adopter stage, particularly in mainstream IT shops. But the new version heralds a maturing technology and a revamped concept for developing and implementing big data applications. This guide explores the features of Hadoop 2 and potential new uses for Hadoop tools and systems with insight and advice from experienced Hadoop users as well as industry analysts and consultants.

1. Expanding Hadoop's uses

Hadoop 2 breaks away from MapReduce, invites broader uses

One important change that comes with the Hadoop 2 upgrade is that the Hadoop Distributed File System (HDFS) is no longer tied to MapReduce as its only processing engine. The articles in this section explore how that breakup is changing the dynamics of big data and Hadoop 2 applications, as well as the role of the new YARN resource manager and Hadoop 2's other new features.


Big data users find more to do with Hadoop 2's YARN resource manager

With Hadoop 2's YARN resource manager, the distributed processing framework is no longer bound to MapReduce, which enables Hadoop systems to do more.


Will Hadoop 2 have wider appeal than the original version?

The release of Hadoop 2 -- and the wider range of options that come with it -- may give prospective users more reasons to move forward with adoption.


Examining Hadoop 2's new features: Key questions answered

Learn about the most important features of Hadoop 2, including YARN and new federation and high-availability components for the Hadoop Distributed File System.


Unwrapping the layers of big data: YARN, MapReduce, Hadoop clusters

MapReduce and Hadoop have matured as technologies, and together with YARN they are capable of performing faster, more flexible big data functions.


Storm-on-YARN signals widening gap between Hadoop applications

Yahoo's Storm-on-YARN technology makes the gap between enterprise Hadoop needs and those of top-rung Web companies more apparent. Can that gap be bridged?

2. Hadoop trends and issues

Maximizing the potential of Hadoop 2: Opportunities and challenges

Hadoop 2 can support applications in a wider range of programming modes and data-crunching capacities. In addition, the Hadoop framework is being tapped for use in areas such as mainframe modernization and mobile app development. In this section, learn about new trends in the use of Hadoop and hurdles that could get in the way of the technology -- and Hadoop users.


New applications for big data and Hadoop 2 -- new challenges, too

Hadoop 2 extends the framework's capabilities beyond batch processing and serving as a big data landing pad, but there are new issues to consider as well.


Evolving ecosystem around Hadoop enables new programming approaches

Hadoop 2's features include new development options that expand the potential for innovation in building big data applications.


Hadoop 2 gives U.K. companies new options, but usage lags U.S. levels

Many organizations in the U.K. and Europe may not be ready to take full advantage of the broader data management capabilities that Hadoop 2 has to offer.


A broadening concept of Hadoop: News from Hadoop Summit 2013

Learn what industry insiders and users had to say at Hadoop Summit 2013 about the increased flexibility in Hadoop 2 and potential new enterprise uses for the technology.


Does Hadoop have a role to play in mainframe modernization?

Hadoop clusters are threatening the data warehousing status quo. Could they also make incursions into the world of mainframe modernization and migration?


Integrating Hadoop for mobile application development

Get tips on how to use existing Hadoop applications to help meet the data needs of mobile application users.

3. Hadoop and business uses

Weighing Hadoop 2's place in business analytics and operations

In this section, discover how Hadoop 2 supports business analytics and enterprise operations -- and get advice on what's needed to make the potential uses a reality in companies wanting to take advantage of its added functions. Consultants and experienced users discuss what Hadoop 2 has to offer and what challenges stand in the way of getting valuable business benefits from the upgraded Hadoop framework.


Analytics applications on firmer ground with Hadoop 2

Consultant David Loshin explains how the separation of resource and application management in Hadoop 2 boosts its ability to support data analytics.


Hadoop 2 demands highly sought-after big data skills

While organizations can now deploy Hadoop 2, many don't have employees in-house with the skills needed to support the platform.


Can Hadoop keep up with real-time business analytics needs?

While Hadoop vendors are touting its ability to function as a real-time analytics tool, there are still questions about whether it's really suited for real-time uses.


Exploring best practices for using Hadoop in enterprise operations

Strata Conference panelists discuss Hadoop best practices for enterprise deployments and the ongoing evolution of Hadoop from proof of concept to operations workhorse.


Charting new frontiers in geospatial data with Hadoop

Learn how satellite operator Skybox Imaging is now able to find new meaning in satellite data with analytics help from Hadoop.


Brush up on your Hadoop 2 vocabulary

Terminology abounds in the world of Hadoop, particularly because there are so many add-on tools in the Hadoop ecosystem. In this section, you'll find definitions of various terms associated with big data and Hadoop 2.


How well do you know the Hadoop ecosystem?

Take this brief quiz to test what you know about the Hadoop framework.
