Early adopters of Apache Hadoop, including high-profile users such as Yahoo, Facebook and Google, had to rely on the partnership of the Hadoop Distributed File System (HDFS) and the MapReduce programming and resource management environment. Together, those technologies enabled users to process, manage and store large amounts of structured, unstructured and semi-structured data in Hadoop clusters.
But there were limitations inherent in the Hadoop-MapReduce pairing. For example, Yahoo and other users have said that the first generation of Hadoop technology couldn't keep pace with the deluge of information they're collecting online because MapReduce is limited to batch processing.
Hadoop 2, an upgrade released by the Apache Software Foundation in October 2013, offers performance improvements that can benefit related technologies in the Hadoop ecosystem, including the HBase database and Hive data warehouse. But the most notable addition in Hadoop 2 -- which originally was referred to as Hadoop 2.0 -- is YARN, a new component that takes over MapReduce's resource management and job scheduling duties. YARN (short for Yet Another Resource Negotiator) enables users to deploy Hadoop systems without MapReduce. Running MapReduce applications is still an option, but other kinds of programs can now be run natively as well -- for example, real-time querying and streaming data applications.
The enhanced flexibility opens the door to broader uses for big data and Hadoop 2 implementations; in addition, YARN allows users to consolidate multiple Hadoop clusters into one system to lower costs and streamline management tasks. The upgrades in Hadoop 2 also boost cluster availability and scalability, two other issues that held back the first version of Hadoop.
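To make the division of labor concrete, here is a minimal conceptual sketch in Python of a cluster-wide scheduler that hands out containers without caring which processing framework asks for them. It is a toy model only: `ToyResourceManager`, `Application`, the container counts and the application names are all hypothetical and do not correspond to the actual YARN API.

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    """A hypothetical application submitted to the shared cluster."""
    name: str
    framework: str            # e.g. "mapreduce", "streaming", "interactive-query"
    containers_wanted: int
    containers_granted: int = 0


@dataclass
class ToyResourceManager:
    """Toy stand-in for a cluster-wide scheduler; NOT the real YARN API."""
    total_containers: int
    apps: list = field(default_factory=list)

    @property
    def free_containers(self) -> int:
        # Capacity left after subtracting everything already granted.
        return self.total_containers - sum(a.containers_granted for a in self.apps)

    def submit(self, app: Application) -> int:
        """Grant as many containers as are free, regardless of framework."""
        granted = min(app.containers_wanted, self.free_containers)
        app.containers_granted = granted
        self.apps.append(app)
        return granted


# One shared pool serves a batch job, a streaming job and an ad hoc query.
rm = ToyResourceManager(total_containers=10)
batch = rm.submit(Application("nightly-etl", "mapreduce", 6))
stream = rm.submit(Application("clickstream", "streaming", 3))
query = rm.submit(Application("ad-hoc-sql", "interactive-query", 4))
print(batch, stream, query, rm.free_containers)
```

The point of the sketch is that scheduling lives in one place and MapReduce is just one tenant among several, which is also why several single-purpose clusters can, in principle, be folded into one shared pool.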
Even with the added capabilities, Hadoop 2 still has a long way to go in moving beyond the early adopter stage, particularly in mainstream IT shops. But the new version heralds a maturing technology and a revamped concept for developing and implementing big data applications. This guide explores the features of Hadoop 2 and potential new uses for Hadoop tools and systems with insight and advice from experienced Hadoop users as well as industry analysts and consultants.
1. Expanding Hadoop's uses
Hadoop 2 breaks away from MapReduce, invites broader uses
One important change that comes with the Hadoop 2 upgrade is the separation of the Hadoop Distributed File System from MapReduce. The articles in this section explore the changing dynamics of big data and Hadoop 2 applications triggered by that breakup, as well as the role of the new YARN resource manager and Hadoop 2's other new features.
With Hadoop 2's YARN resource manager, the distributed processing framework is no longer bound to MapReduce, which enables Hadoop systems to do more.
The release of Hadoop 2 -- and the wider range of options that come with it -- may give prospective users more reasons to move forward with adoption.
Learn about the most important features of Hadoop 2, including YARN and new federation and high-availability components for the Hadoop Distributed File System.
MapReduce and Hadoop have matured as technologies, and together with YARN they are capable of performing faster, more flexible big data functions.
Yahoo's Storm-on-YARN technology makes the gap between enterprise Hadoop needs and those of top-rung Web companies more apparent. Can that gap be bridged?
2. Hadoop trends and issues
Maximizing the potential of Hadoop 2: Opportunities and challenges
Hadoop 2 can support applications in a wider range of programming modes and data-crunching capacities. In addition, the Hadoop framework is being tapped for involvement in areas such as mainframe modernization and mobile app development. In this section, learn about new trends in the use of Hadoop and hurdles that could get in the way of the technology -- and Hadoop users.
Hadoop 2 extends the framework's capabilities beyond batch processing and serving as a big data landing pad, but there are new issues to consider as well.
Hadoop 2's features include new development options that expand the potential for innovation in building big data applications.
Many organizations in the U.K. and Europe may not be ready to take full advantage of the broader data management capabilities that Hadoop 2 has to offer.
Learn what industry insiders and users had to say at Hadoop Summit 2013 about the increased flexibility in Hadoop 2 and potential new enterprise uses for the technology.
Hadoop clusters are threatening the data warehousing status quo. Could they also make incursions into the world of mainframe modernization and migration?
Get tips on how to use existing Hadoop applications to help meet the data needs of mobile application users.
3. Hadoop and business uses
Weighing Hadoop 2's place in business analytics and operations
In this section, discover how Hadoop 2 supports business analytics and enterprise operations -- and get advice on what's needed to make the potential uses a reality in companies wanting to take advantage of its added functions. Consultants and experienced users discuss what Hadoop 2 has to offer and what challenges stand in the way of getting valuable business benefits from the upgraded Hadoop framework.
Consultant David Loshin explains how the separation of resource and application management in Hadoop 2 boosts its ability to support data analytics.
While organizations can now deploy Hadoop 2, many don't have employees in-house with the skills needed to support the platform.
While Hadoop vendors are touting its ability to function as a real-time analytics tool, there are still questions about whether it's really suited for real-time uses.
Strata Conference panelists discuss Hadoop best practices for enterprise deployments and the ongoing evolution of Hadoop from proof of concept to operations workhorse.
Learn how satellite operator Skybox Imaging is now able to find new meaning in satellite data with analytics help from Hadoop.
Brush up on your Hadoop 2 vocabulary
Terminology abounds in the world of Hadoop, particularly because there are so many add-on tools in the Hadoop ecosystem. In this section, you'll find definitions of various terms associated with big data and Hadoop 2.
How well do you know the Hadoop ecosystem?
Take this brief quiz to test what you know about the Hadoop framework.