Using big data and Hadoop 2: New version enables new applications
Hadoop doesn't lack for attention from prospective users and industry analysts, not to mention those of us in the technology media. But all that attention has yet to translate into a high adoption rate. For example, in a survey conducted last year by consultancies Enterprise Management Associates Inc. and 9sight Consulting, only 16% of the 259 respondents said their organizations were using Hadoop. Other surveys have shown similar adoption rates in the low double digits. The open source distributed processing framework has some marquee users, primarily among the large Internet companies where the first Hadoop version originated, but its use hasn't spread very far beyond them.
That's partly because a lot of companies are still trying to figure out exactly what to do with the technology. In a January 2014 blog post, Gartner Inc. analyst Merv Adrian cited the results of a poll that asked attendees of a Gartner webinar to name the top barriers to Hadoop adoption. First on the list, chosen by just over 50% of the 213 respondents, was an "undefined value proposition."
Of course, big data applications are new territory for many organizations. But the shortage of business cases for deploying Hadoop also reflects the limits of its original incarnation. Only MapReduce applications were supported, and scalability was held down by the need to go through a single namespace server (the NameNode) to locate where the data files required by applications were stored across a Hadoop cluster. In another blog post, Adrian described talk about Hadoop taking on the role of an enterprise data hub as "aspirational marketing" by advocates. That point is "a long way off," he wrote.
A first step down that road was the October 2013 release of Hadoop 2, which removes the dependency on MapReduce and makes Hadoop clusters more scalable and fault-tolerant. We've published a variety of content that explores the capabilities of the new Hadoop release to help IT managers, business executives and other technology decision makers decide if it's something they can work with. One story answers some FAQs about Hadoop 2 and its key features. Another examines the fine details of the YARN resource manager, the new component that opens up Hadoop to non-MapReduce applications. We also look at Hadoop 2 skills requirements and issues to consider in weighing Hadoop 2 deployments. And on our sister site SearchBusinessAnalytics, consultant David Loshin discusses the improved ability to do real-time analytics in Hadoop 2, something that YARN can also take credit for. Hadoop 2 is here, and now is a good time to see if it merits your attention.