Chip giant Intel Corporation has disclosed pilot projects based around its Trusted Analytics Platform (TAP). At the recent Strata + Hadoop World conference, the company said it has been working with Penn Medicine and other healthcare organizations to test a cloud-based platform for creating big data analytics applications.
Intel's TAP program is intended to improve collaboration between the skilled data scientists who prototype big data analytics platform apps and the equally skilled developers who put those apps into production.
"Applications that run on big data platforms take a while to develop. They have to be created on things like Hadoop, the programming is in Java and they are sometimes highly customized," said Vin Sharma, director of strategy and product for the big data analytics efforts at Intel's Data Center Group.
The map for TAP
TAP looks to take a page out of the cloud architecture playbook, in which loosely coupled APIs and platform as a service (PaaS) are common. These loosely coupled services can be stood up separately and quickly composed into aggregated apps, Sharma said.
Analytics applications that use machine learning, in particular, currently require many iterations on both sides of the wall between data scientists and software engineers, Sharma said.
"We see an impedance mismatch between the development of applications that run on cloud-scale infrastructure and the development of analytics running as big data applications," he said.
To address these issues, Intel has formulated TAP. It's a big data analytics platform that defines three layers:
- A data layer that can include Apache Hadoop, Spark and other components
- An analytics layer supporting data science software toolkits
- An application layer for run times on public clouds, such as Amazon Web Services, Rackspace and OVH.com
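For readers who think in code, the three layers listed above can be pictured as a simple stack. The following is only an illustrative sketch in Python: the layer names and example components come from the list above, while the `Layer` class, the `TAP_LAYERS` tuple and the `describe` function are hypothetical names invented for this example and are not part of TAP itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Layer:
    """One layer of the stack, with the example components named in the article."""
    name: str
    examples: Tuple[str, ...]


# Hypothetical model of the three layers, ordered from the bottom of the stack up.
# The identifiers here are illustrative only and do not come from TAP.
TAP_LAYERS = (
    Layer("data", ("Apache Hadoop", "Spark")),
    Layer("analytics", ("data science software toolkits",)),
    Layer("application", ("Amazon Web Services", "Rackspace", "OVH.com")),
)


def describe(layers: Tuple[Layer, ...]) -> List[str]:
    """Return one summary line per layer, in the order given."""
    return [f"{layer.name}: {', '.join(layer.examples)}" for layer in layers]


if __name__ == "__main__":
    for line in describe(TAP_LAYERS):
        print(line)
```

The point of the sketch is only the separation of concerns: each layer can change its components without the others needing to know.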
Open-source Cloud Foundry components are a part of TAP, as are a range of middleware connectors and Intel-originated (now open sourced) provisioning and orchestration components that help stitch applications together, according to Sharma.
Security features that exploit encryption hooks embedded in Intel silicon are also part of TAP, hence the term "trusted" in its name.
Signals and noise
At Strata + Hadoop World, Michael Draugelis provided a user view on TAP. Draugelis, who is chief data scientist for the University of Pennsylvania's Penn Medicine operations in Philadelphia, described proof-of-concept work to create Penn Signals, an application that takes a variety of data and looks to predict the likelihood that hospital patients will develop sepsis or heart failure.
Draugelis came to such work after his wife went into shock not long after giving birth to their son. Both are well now. But, Draugelis said, he wondered why there were not better available predictors for such reactions. Better predictors are the goal of the sepsis and heart failure research efforts.
He told a Strata + Hadoop World keynote crowd that his team of data scientists needed an environment that could be built quickly, analytic tools that could scale and a platform that could support development. TAP, he said, could "allow our data scientists to do just that."
Take a listen to this podcast interview with Sharma to find out more about the company's proposed open-source platform and how it might cope with mismatches that stymie fast development of big data analytics platform applications.