Big data systems shine light on neglected 'dark data'

The processing power of Hadoop and other big data tools is making it more feasible for companies to tap into dark data, information that previously was left untouched in IT systems.

This article can also be found in the Premium Editorial Download "Business Information: Big data technology: Beyond the trendy tools."

The notion of "dark data" lurking in the shadows of IT systems has been around for years. But with the increasing adoption of Hadoop and other highly scalable big data technologies, more of that data is poised to come out into the open.

Consulting company Gartner Inc. defines dark data as "information assets that organizations collect, process and store in the course of their regular business activity, but generally fail to use for other purposes." Now, the ability of Hadoop clusters and NoSQL databases to process large volumes of data makes it more feasible to incorporate such long-neglected information into big data analytics applications -- and unlock its business value.

As a result, archived data that was "just lying around" has become a potential goldmine for organizations, not simply an untapped pool of information they were obliged to keep for regulatory compliance purposes, said Aashish Chandra, divisional vice president of application modernization at Sears Holdings Corp. in Hoffman Estates, Ill.

"This is a different world we're living in," said Chandra, who is also general manager of the big data and legacy systems modernization business in Sears' MetaScale LLC professional services unit. "People were using backup tapes for archiving. Now you can put that data in Hadoop and query the data in real time."

In the past, some data was left dark because it was too old to be useful by the time it was made available to business users for analysis. A Hadoop-based data warehouse put into production in February by Edmunds.com Inc. has accelerated that process and opened up new views of data that are helping the company reduce operating costs, said Paddy Hannon, vice president of architecture at the online publisher of car-shopping information in Santa Monica, Calif.

"We've had some 'Eureka' data moments," Hannon said. For example, the new system lets the workers who manage keyword acquisition for the company's paid-search and online advertising efforts quickly probe incoming data to assess how changes in buying tactics will affect marketing initiatives. "That saved a significant amount of money," Hannon said -- more than $1.7 million as of mid-June, according to a blog post by Philip Potloff, chief information officer at Edmunds.

Gartner analyst Merv Adrian told attendees at the Hadoop Summit 2013, held in June in San Jose, Calif., that he expects more and more companies to begin auditing their archives of data to identify dark bits and try to map them to possible business uses.

"Much of what we're doing with big data is restoration of context," Adrian said. Expanding on Gartner's definition, he described data darkness as a state where "you know the transaction happened, but you don't know what went on around it" -- something that needs to be illuminated in order to turn dark data into business gold.

This was first published in August 2013
