At IBM's Information on Demand event in Las Vegas this week, the company showed an early-access preview of its BLU Acceleration for Cloud, new capabilities for data discovery in its InfoSphere Data Explorer, and speed improvements for its IBM PureData System for Hadoop. For people looking to move Hadoop projects out of skunk works and into production, it introduced an InfoSphere Data Privacy for Hadoop offering that lets users "anonymize" data in Hadoop and NoSQL systems.
But it is the overall view, not the bits and pieces, that marks Big Blue's big data suite, said Gartner Inc. analyst Merv Adrian, who attended the IBM event. In fact, he said Hadoop itself was somewhat downplayed during the course of the conference.
"IBM focused not on big data, but on the big picture. It's not about components in their narrative, but it is about end-to-end assembly, assurance and integration," he said. "The portfolio was the story, not the parts."
What stays in Vegas, or 'anonymization'
Nationwide Insurance Co. in Columbus, Ohio, has introduced open source Hadoop data handling for some proof-of-concept projects. Tara Paider, Nationwide assistant vice president for IT architecture, is leading the effort. She said experience with new systems that gather high volumes of telematics data from insured users' autos led Nationwide to try Hadoop.
Paider, who spoke at the IBM event, said data protection traits are necessary to move Hadoop to the next stage. Nationwide's proof-of-concept use cases didn't involve sensitive data, and according to Paider, its environments were "very locked down."
Masking data and tracking data access activity weren't particular issues during the proof-of-concept phase, but if Nationwide is to move Hadoop to the next stage, that will change.
Data "anonymization" -- or masking -- and data activity monitoring tools, as supported by IBM's data privacy software, are examples of software targeted at such operations. Hadoop, a parallel framework for processing large volumes of multi-structured data on commodity clusters, has so far been used largely in cases where protecting individual records was not necessarily an issue.
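As a rough illustration of what masking means in practice, the sketch below replaces sensitive fields in a record with keyed hashes while leaving other fields untouched. It is a minimal Python example under stated assumptions, not a depiction of how IBM's data privacy software works: the function names, the key and the sample record are all hypothetical, and keyed hashing is only one of several masking approaches (strictly speaking, it is pseudonymization).

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real deployment would manage keys securely.
SECRET_KEY = b"demo-key-not-for-production"


def mask_value(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable, non-reversible token for a sensitive value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncate for readability; the same input always yields the same token.
    return digest[:16]


def mask_record(record: dict, sensitive_fields: set) -> dict:
    """Mask only the named fields, leaving the rest of the record as-is."""
    return {
        field: mask_value(str(value)) if field in sensitive_fields else value
        for field, value in record.items()
    }


# Made-up sample record, loosely modeled on telematics-style data.
record = {"policy_id": "P-1001", "driver_name": "Jane Doe", "avg_speed_mph": 42.5}
masked = mask_record(record, {"policy_id", "driver_name"})
print(masked)
```

Because the same input always maps to the same token, masked records can still be joined or counted by policy without exposing the underlying identifiers.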
"As we move into production, [data protection] is a very important thing," Paider said. It is one reason why Nationwide looked to IBM and its BigInsights Hadoop platform, according to Paider.
Portfolio as story, or parts greater than the sum of portfolio
IBM will emphasize its strengths in managing big data in operations going forward, according to Steve Mills, senior vice president and group executive for IBM Software and Systems. That is key for the company because, for many users, deploying big data into operations is the next step after several years of experimentation and prototyping with big data technologies like Hadoop and NoSQL. Different sets of tools will be required to make the transition.
"Big data has been a catalyst for exploration of what the art of the possible is," Mills said.
"Our strategic intent has been to build out an ever-growing portfolio of capability," he said. "A lot of capabilities are required depending on the nature of the problem or the challenge."
Meanwhile, Gartner's Adrian said big data's uses go beyond analytics, and many kinds of software tools will eventually come into play. Predictive modeling, in-stream processing of operational events, geographic mapping, mobile delivery, collaborative workflow and other software types can capitalize on infusions of more data of all kinds, he said. And yes, IBM has offerings in each of these areas.