

Hadoop and Spark are coming of age together

Hadoop and Spark are open source big data management technologies that both drew interest at this year's Strata + Hadoop World conference in New York, as reported in this Talking Data podcast.

While there is plenty of Hadoop platform chatter, there is also notable buzz around Spark, an emerging analytical data framework.

As open source prodigies, Hadoop and Spark still seek better definition and greater maturity. The 2014 Strata + Hadoop World conference in New York provided a window on both works in progress, according to TechTarget editors who took part in this edition of the Talking Data podcast.

The reporters suggest that understanding the two technologies' roles -- sometimes as alternatives, sometimes as complements -- is useful for data architects building big data architectures on commodity computing clusters.

A strong use case for Spark appears to be machine learning, a highly iterative process in which specialized algorithms repeatedly churn through masses of data, TechTarget writer Jack Vaughan told colleague Ed Burns. Spark's in-memory approach is a step forward in terms of performance, but it may have cost ramifications in some cases, according to Vaughan's sources.
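To make the iteration point concrete, here is a minimal sketch in plain Python. It is not Spark code and uses no Spark API. The names `make_loader`, `fit_rereading` and `fit_cached` are hypothetical, the data set is a toy, and the read counter merely stands in for a full scan of data on disk. The sketch assumes a simple gradient-descent fit, in which every iteration needs one pass over all the data.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]  # (x, y)


def make_loader() -> Tuple[Callable[[], List[Point]], Dict[str, int]]:
    """Return a loader that simulates a full data scan, plus a read counter."""
    stats = {"reads": 0}

    def load() -> List[Point]:
        stats["reads"] += 1  # stands in for one full scan of on-disk data
        return [(float(x), 2.0 * x + 1.0) for x in range(10)]  # toy data: y = 2x + 1

    return load, stats


def gradient_step(data: List[Point], w: float, b: float, lr: float) -> Tuple[float, float]:
    """One pass over the data: a single gradient-descent update for y ~ w*x + b."""
    n = len(data)
    grad_w = sum((w * x + b - y) * x for x, y in data) * 2 / n
    grad_b = sum((w * x + b - y) for x, y in data) * 2 / n
    return w - lr * grad_w, b - lr * grad_b


def fit_rereading(load: Callable[[], List[Point]], iterations: int, lr: float = 0.01) -> Tuple[float, float]:
    """Reload the data on every iteration (one full scan per pass)."""
    w = b = 0.0
    for _ in range(iterations):
        w, b = gradient_step(load(), w, b, lr)
    return w, b


def fit_cached(load: Callable[[], List[Point]], iterations: int, lr: float = 0.01) -> Tuple[float, float]:
    """Load the data once, keep it in memory, and iterate over the cached copy."""
    data = load()
    w = b = 0.0
    for _ in range(iterations):
        w, b = gradient_step(data, w, b, lr)
    return w, b


if __name__ == "__main__":
    for name, fit in (("re-reading", fit_rereading), ("cached", fit_cached)):
        load, stats = make_loader()
        w, b = fit(load, 2000)
        print(f"{name}: w={w:.3f} b={b:.3f} full scans={stats['reads']}")
```

Both strategies arrive at the same model; the difference is that the first scans the data once per iteration while the second scans it once in total. The cached version also has to hold the whole data set in memory, which is where the cost considerations Vaughan's sources raised would come in.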

The editors also discussed a number of Spark-related product announcements made at Strata + Hadoop World, which mark Spark as a technology to watch.

The Strata event also showcased presentations by enterprise users describing their growing experience with Hadoop and Spark.

"Whether it is Spark or Hadoop, there is interest in these new open source technologies and the idea that a somewhat general programming approach can be applied to them for handling massive amounts of data, and for doing it on commodity clusters," Vaughan said. "Spark and Hadoop may be more related than some people think."

In fact, Spark today is often deployed as a component running on the Hadoop 2.0 platform. In this sense, Hadoop plays a real role in enabling Spark, even as Spark has taken some of the spotlight on the analytics stage.

Jack Vaughan is SearchDataManagement's news and site editor. Follow us on Twitter: @sDataManagement.
