Early adopters of Apache Hadoop, including high-profile users such as Yahoo, Facebook and Google, had to rely on the partnership of the Hadoop Distributed File System (HDFS) and the MapReduce programming and resource management environment. Together, those technologies enabled users to process, manage and store large amounts of structured, unstructured and semi-structured data in Hadoop clusters.
But there were limitations inherent in the Hadoop-MapReduce pairing. For example, Yahoo and other users have said that the first generation of Hadoop technology could not keep pace with the deluge of information they're collecting online, because MapReduce processes data in batches.
Hadoop 2, an upgrade released by the Apache Software Foundation in October 2013, offers performance improvements that can benefit related technologies in the Hadoop ecosystem, including the HBase database and Hive data warehouse. But the most notable addition in Hadoop 2 -- which originally was referred to as Hadoop 2.0 -- is YARN, a new component that takes over MapReduce's resource management and job scheduling duties. YARN (short for Yet Another Resource Negotiator) enables users to deploy Hadoop systems without MapReduce. Running MapReduce applications is still an option, but other kinds of programs can now be run natively as well -- for example, real-time querying and streaming data applications. The enhanced flexibility opens the door to broader uses for big data and Hadoop 2 implementations; in addition, YARN allows users to consolidate multiple Hadoop clusters into one system to lower costs and streamline management tasks. The upgrades in Hadoop 2 also boost cluster availability and scalability, two other issues that held back the first version of Hadoop.
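To make the idea of a shared resource manager concrete, here is a minimal toy sketch in Python. It is not YARN's actual API: the class and function names (`Node`, `ToyResourceManager`, `request`, `demo`), the application names and the memory figures are all hypothetical, chosen only to illustrate how one scheduler might hand out containers on a single cluster to both MapReduce and non-MapReduce applications.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical worker node with some free memory, in megabytes."""
    name: str
    free_mb: int


@dataclass
class ToyResourceManager:
    """Illustrative stand-in for a cluster-wide resource manager (not YARN itself)."""
    nodes: list
    grants: list = field(default_factory=list)

    def request(self, app, mb):
        """Grant a container of `mb` megabytes on the first node with room.

        Returns an (app, node_name, mb) tuple, or None if no node can fit it.
        """
        for node in self.nodes:
            if node.free_mb >= mb:
                node.free_mb -= mb
                grant = (app, node.name, mb)
                self.grants.append(grant)
                return grant
        return None


def demo():
    """Run batch, streaming and query workloads against one shared cluster."""
    rm = ToyResourceManager([Node("node1", 4096), Node("node2", 4096)])
    return [
        rm.request("mapreduce-batch", 3072),    # a classic MapReduce job
        rm.request("streaming-app", 2048),      # a non-MapReduce workload
        rm.request("interactive-query", 1024),  # another non-MapReduce workload
        rm.request("mapreduce-batch", 4096),    # no node has this much left
    ]


if __name__ == "__main__":
    for result in demo():
        print(result)
```

In this sketch the scheduler knows nothing about what each application does with its container, which is the separation of duties described above: resource management lives in one place, and MapReduce becomes just one of several kinds of applications that ask it for capacity. A real scheduler would also handle queues, priorities, CPU and container release, all of which are omitted here.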
Even with the added capabilities, Hadoop 2 still has a long way to go in moving beyond the early adopter stage, particularly in mainstream IT shops. But the new version heralds a maturing technology and a revamped concept for developing and implementing big data applications. This guide explores the features of Hadoop 2 and potential new uses for Hadoop tools and systems with insight and advice from experienced Hadoop users as well as industry analysts and consultants.
1. Maximizing the potential of Hadoop 2: Opportunities and challenges
Hadoop 2 can support applications in a wider range of programming modes and data-crunching capacities. In addition, the Hadoop framework is being tapped for involvement in areas such as mainframe modernization and mobile app development. In this section, learn about new trends in the use of Hadoop and hurdles that could get in the way of the technology -- and Hadoop users.
Hadoop 2 gives U.K. companies new options, but usage lags U.S. levels
Many organizations in the U.K. and Europe may not be ready to take full advantage of the broader data management capabilities that Hadoop 2 has to offer.
Integrating Hadoop for mobile application development
Get tips on how to use existing Hadoop applications to help meet the data needs of mobile application users.
2. Weighing Hadoop 2's place in business analytics and operations
In this section, discover how Hadoop 2 supports business analytics and enterprise operations -- and get advice on what's needed to make the potential uses a reality in companies wanting to take advantage of its added functions. Consultants and experienced users discuss what Hadoop 2 has to offer and what challenges stand in the way of getting valuable business benefits from the upgraded Hadoop framework.
Can Hadoop keep up with real-time business analytics needs?
While Hadoop vendors are touting its ability to function as a real-time analytics tool, there are still questions about whether it's really suited for real-time uses.
Charting new frontiers in geospatial data with Hadoop
Learn how satellite operator Skybox Imaging is now able to find new meaning in satellite data with analytics help from Hadoop.
3. How well do you know the Hadoop ecosystem?
Take this brief quiz to test what you know about the Hadoop framework.