In this Talking Data podcast, TechTarget editors discuss Hadoop's future, IBM's decision to resell the Hortonworks distribution of the open source technology and other big data issues.
Big data environments are becoming more common in companies, including mainstream enterprises. As they do, though, big data vendors and users are developing different views of the Hadoop stack's staying power, at least where the name is concerned.
That's one of the discussion topics in this episode of the Talking Data podcast. Craig Stedman, senior executive editor of SearchDataManagement and SearchBusinessAnalytics, covered DataWorks Summit 2017 in San Jose, Calif., a conference that until this year was known as the Hadoop Summit. That name change, along with the rebranding of the rival Strata + Hadoop World as the Strata Data Conference, reflects a changing attitude among vendors about tying themselves to Hadoop, Stedman says in the podcast.
Instead, both Hortonworks Inc. and Cloudera Inc. -- the vendors behind the DataWorks and Strata events, respectively -- now want to put the focus on big data and what users can do with it. That's because of growing uncertainty about whether the technologies that officially constitute Hadoop will survive.
MapReduce, Hadoop's original programming and processing platform, is increasingly being supplanted by the Apache Spark engine. And alternatives to the Hadoop Distributed File System (HDFS) have popped up, enabling organizations to build a big data architecture without either of Hadoop's core components.
On the other hand, users still tend to refer to big data systems as their Hadoop environment, as was the case in many presentations at the DataWorks Summit, Stedman says. For them, the Hadoop stack has come to encompass the entire big data ecosystem surrounding MapReduce and HDFS, as opposed to those two components alone.
The big news at the conference was IBM's agreement to drop its own Hadoop distribution in favor of reselling the Hortonworks Data Platform. The deal leaves just four Hadoop vendors in the market, but Stedman says that isn't necessarily a bad thing for users because there are some clear differences in development strategies between the remaining ones.
Listen to the podcast to hear more from the DataWorks Summit on the future of the Hadoop stack, the IBM-Hortonworks deal, machine learning applications in big data environments and other issues.