Machine learning gets boost from growing big data ecosystem

The parallel processing platforms underpinning big data systems can help machine learning newbies and vets alike. TechTarget editors discuss how that may play out in this Talking Data podcast.

New parallel processing platforms in the growing big data ecosystem are enabling organizations to bring greater compute power to bear on analytical problems. And machine learning applications are likely to be among the leading uses for systems based on big data technologies such as Hadoop and Spark.

The combination of parallelism and machine learning interests data scientists at companies like Allstate Insurance, Cisco Systems and Pandora Media. They want to build complex machine learning models and run them repeatedly to fine-tune the algorithms and improve the results. They also want to do that work as quickly as possible so they can take on a greater number of analytical problems, according to presentations and discussions at the Strata + Hadoop World 2015 conference held recently in San Jose, Calif.

In this Talking Data podcast, SearchDataManagement's Jack Vaughan, who covered the conference, tells colleague Ed Burns that people are coming to machine learning applications from a couple of different points of view. One group includes data analysts and programmers at e-commerce websites who want to serve up recommendations to visitors. Another includes enterprise statisticians who have been immersed in the technology for years but haven't had the processing power needed to move beyond relatively simple models.

Vaughan says the latter group now faces the task of running more jobs that are based on sophisticated statistical models -- ones that can find patterns in data that can be turned into competitive business advantages. With the new parallel processing platforms, they have the compute power to scale up, handle greater amounts of data and, hopefully, be more successful in their predictive analytics efforts. At the same time, new tools in the big data ecosystem seek to streamline the development of machine learning programs, which has so far required programmers with specialized parallel computing skills.

The interest in machine learning shouldn't obscure the fact that SQL queries are also being applied to the growing pools of big data that organizations are accumulating. But Vaughan says much of the buzz at the Strata conference was around analytics applications of the machine learning kind.

Jack Vaughan is SearchDataManagement's news and site editor. Email him at jvaughan@techtarget.com, and follow us on Twitter: @sDataManagement.

This was last published in March 2015

Join the conversation

Which new tools in the big data ecosystem have you used?
We've tested out quite a few tools. At the moment, though, we're focusing on analytics tools like those from Visual.ly and Cirro. The basic idea is to figure out what we're going to measure (and how we're going to measure it), then work on software and business plans that integrate proper analytics from the very beginning. For the future, we're looking at open source tools.