Systems infrastructure for deep learning software in flux

New approaches to hardware infrastructure are being put forward to address the challenges of running deep learning algorithms and other heavy-duty artificial intelligence applications.

Deep learning appeared after a long gestation, or all of a sudden. You can take your pick, depending on where you were when mainstream media discovered a collection of statistical and artificial intelligence techniques that seemed to promise a new era of automated predictive analytics.

The vote here is for a long gestation, although it's fair to say there is some suddenness about the way deep learning software is pushing a new class of analytics in which applications repeatedly churn through large sets of data, learning to predict likely outcomes as they go.

A lengthy birthing process seems in play because, really, newfangled deep learning is an updated take on the machine learning process, which in turn was a new take on neural networks, an early form of artificial intelligence in which simulations mimic the human brain's neuron activity by weighting outputs and building connected sets of meaning.

What marks deep learning software is the use of multiple processing layers. These layers create a hierarchy in which higher layers "learn" progressively more abstract concepts from lower-level ones.
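To make the layering idea concrete, here is a minimal sketch of a stack of fully connected layers in plain Python. It is an illustration only, not code from any of the companies mentioned: the layer widths, the `make_layer` and `forward` helpers, and the choice of `tanh` as the activation are all assumptions, and the weights are random rather than learned. It shows only how each layer consumes the output of the one below it.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer.

    Weights are small random values; a real system would learn them.
    """
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(layers, inputs):
    """Pass inputs through each layer in turn and keep every layer's output."""
    activations = [inputs]
    for weights, biases in layers:
        prev = activations[-1]
        # Each unit computes a weighted sum of the previous layer's outputs.
        out = [math.tanh(sum(w * x for w, x in zip(row, prev)) + b)
               for row, b in zip(weights, biases)]
        activations.append(out)
    return activations

rng = random.Random(0)
sizes = [8, 6, 4, 2]  # hypothetical widths: raw input, two hidden layers, output
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
acts = forward(layers, [rng.random() for _ in range(sizes[0])])

for depth, values in enumerate(acts):
    print(f"layer {depth}: {len(values)} values")
```

Each successive list in `acts` is computed only from the list before it, which is the sense in which higher layers build on lower-level ones.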

In the process, new combinations of algorithms are used, which have been shown to be successful in identifying patterns in images, speech, text and more. Most of these data types have heretofore been on the periphery of corporate computing -- but that is changing.

Learning terms used interchangeably

Distinctions between machine learning and deep learning aren't always that important, even to some people who are pretty steeped in AI. And the terms are often used somewhat interchangeably, as we discovered recently at the 2016 Deep Learning Summit in Boston.

"The terms are so mixed that I wonder if it's worthwhile to make a distinction," said Alejandro Jaimes, chief scientist at software vendor AiCure and a conference organizer. "The practical reality is that deep learning is a family of algorithms that has been found to work on a set of problems."

A factor propelling deep learning software these days is open source technology, Jaimes said. Sets of deep learning algorithms that have been found to work in one area, such as image recognition, are becoming models that are shared within the AI community. The models, in turn, are tested and tweaked on other tasks.

AI researchers from some of the most visible exemplars of massive machine learning today -- including Amazon, Netflix, Twitter and Facebook -- discussed experimental strategies at the conference. The new technology helps these companies discern activity in the massive amounts of data their customers and users generate.

Machine learning has been around at Facebook since almost the beginning, according to Andrew Tulloch, a research engineer at the company who spoke at the summit. Areas of work involved include language technology, ranking, and image and video understanding, he said.

Infrastructure issues for deep learning

By almost any standard, there is some heavy lifting involved. Tulloch said Facebook serves ads for 2 million advertisers a day; trillions of ads are ranked daily, and the company does 414 million daily language translations.

That's impressive -- but we have not entered the era of the free lunch. Doing machine-learning-backed calculations on such large volumes at such a rapid rate creates a huge infrastructure challenge, Tulloch acknowledged.

"A lot of the challenges boil down to making things faster and making things scale," he said. What Tulloch is seeing is faster algorithms and deeper layers, but these are taxing the capabilities of hardware clusters.

In fact, there is plenty of ferment at the chip level for machine learning. Commodity CPUs have been the lifeblood of recent big data advances, but they may not be the best option for the many layers of learning some companies are pursuing.

Like others, Facebook has been working with graphics processing units (GPUs) to address machine learning needs. Others have championed specially designed field-programmable gate arrays (FPGAs) for the job of powering deep learning software.

This month, Google disclosed that it had created full-custom application-specific integrated circuits (ASICs) for its closely watched TensorFlow machine learning technology, which is available as open source software.

Google is calling its devices TPUs, for tensor processing units, and they've already been used in the company's RankBrain search and AlphaGo game-playing programs.

For now, details are sparse, but the chip architecture seems to take a looser approach to mathematical precision than other processing technologies.

It's the opinion of some, including the Google TPU design team, that today's high-precision GPUs and CPUs add unnecessary processing overhead -- that reduced precision works well for deep or machine learning, thank you very much. By the way, Google doesn't seem too concerned about using the term machine learning where others would say deep learning.
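The reduced-precision argument can be illustrated with a small sketch. This is not a description of how Google's TPU works, since details of the chip are sparse; it is a generic example of symmetric 8-bit quantization, and the `quantize` and `dot` helpers, the vector length of 256 and the random test data are all assumptions made for illustration. The weighted sum at the heart of a neural network layer is computed once in full floating point and once with small integers, and the two results are compared.

```python
import random

def quantize(values, bits=8):
    """Map floats to signed integers of the given width using one shared scale."""
    max_int = 2 ** (bits - 1) - 1  # 127 for 8 bits
    scale = max(abs(v) for v in values) / max_int or 1.0  # avoid a zero scale
    return [round(v / scale) for v in values], scale

def dot(a, b):
    """Plain dot product; with integer inputs all the arithmetic stays in integers."""
    return sum(x * y for x, y in zip(a, b))

rng = random.Random(0)
weights = [rng.uniform(-1, 1) for _ in range(256)]  # hypothetical layer weights
inputs = [rng.uniform(-1, 1) for _ in range(256)]   # hypothetical layer inputs

exact = dot(weights, inputs)          # full floating-point result

qw, sw = quantize(weights)
qi, si = quantize(inputs)
approx = dot(qw, qi) * sw * si        # integer dot product, rescaled once at the end

print("full precision:", exact)
print("8-bit version: ", approx)
print("absolute error:", abs(exact - approx))
```

The integer version does its multiply-and-add work on values that fit in a single byte, which is the kind of arithmetic that simpler, lower-precision hardware can handle; whether the resulting error is acceptable depends on the model and the task.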

Whether the path to deep learning applications is paved with CPUs, GPUs, FPGAs or TPUs, it does seem that big data's leveraging of commodity hardware may have reached an impasse that will have to be addressed by big data vendors and users alike. 
