For most organizations, a major investment in “big data” management tools and technologies would be a bad idea right now, according to Thomas Redman, the founder of Rumson, N.J.-based Navesink Consulting Group.
Speaking during a recent Enterprise Efficiency webcast, Redman explained his reasoning.
What is ‘big data’?
“Big data” describes the large volumes of structured, unstructured and semi-structured data a company creates -- data that in many cases would take too much time and cost too much money to load into a conventional relational database for analysis.
For more details, read the Whatis.com definition of big data.
“The hard reality is that for the vast majority of organizations, this is not the time to invest in big data,” Redman said. “Most organizations readily admit that they are data rich and information poor, and [if you] don’t do small data well, how are you going to do big data well? Under those circumstances, investing in big data is just a waste of time and money.”
Redman’s cautious stance on big data contrasts starkly with the marketing message from hardware and software vendors that big data investment is necessary to remain competitive. A representative from one of those vendors, Intel Corp., joined Redman during the webcast.
“When it comes to big data, at Intel, we think of the word value,” said Tony Hamilton, an enterprise marketing manager for Intel’s data center and connected systems groups. “I’m among many folks that look at the overall big data solution train as a way in which corporate Web and sensor information can be overlaid as chosen to provide more confidence in decisions within the business, social and ecological realms.”
Others in the IT community -- like Ram Chandrasekar, director of the analytics center of excellence and strategy at pharmaceutical giant Bristol-Myers Squibb -- said that the decision to invest in a big data management infrastructure depends largely on the needs of the individual company and the problems it is trying to solve.
“You have to start with the business problems and put them within the context of the industry,” he said, “because there is a tendency to just chase the next big thing without really understanding its applicability to your problems.”
Rather than simply jumping on the big data bandwagon, organizations should take some time to look at the problems they are trying to solve and think carefully about how -- and if -- a big data investment can help solve those problems, Chandrasekar explained.
“Find out what the top three or four problems are and how they map to the data that is within the company, the data outside the company and the need to integrate them,” he said.
Getting started can be affordable
The hardware and skillsets needed to deploy a major Hadoop-based distributed cluster -- or any other type of big data management infrastructure -- may be currently out of reach for many companies. But it’s important to remember that it doesn’t cost much to experiment with big data, said Yves de Montcheuil, the vice president of marketing at Talend, an open source data management software vendor.
Big data software technologies such as Hadoop, Hive and Pig are open source, which means organizations can start small, experiment and carefully determine whether they are gaining value before making a larger investment.
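Starting small can mean running the basic MapReduce pattern on a single machine before committing to a cluster. As a rough illustration -- the function names and sample data here are invented, not from the article or any vendor -- the sketch below mimics the map and reduce phases that a framework like Hadoop would distribute across nodes, using the classic word-count example:

```python
from collections import Counter
from itertools import chain

def map_phase(line):
    """Emit (word, 1) pairs -- the role a Hadoop Streaming mapper plays."""
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(pairs):
    """Sum counts per word -- the role the reducer plays after the shuffle."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Hypothetical log lines standing in for a much larger data set
lines = ["error disk full", "warning disk slow", "error network down"]
counts = reduce_phase(chain.from_iterable(map_phase(l) for l in lines))
print(counts["error"])  # 2
print(counts["disk"])   # 2
```

An experiment like this costs nothing but time; only once the pattern proves valuable on real data does it make sense to scale the same logic out to a distributed cluster.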
When the time does come to make a large investment in a big data infrastructure, organizations may find it difficult to find people with the right skills, de Montcheuil cautioned. Those skills may have to be developed within the ranks of in-house personnel.
“By using big data, companies will be able to derive value at a far lower cost than by using traditional technologies,” he said.
Blazing a trail to big data management
Companies mulling an investment in a big data management and analysis infrastructure should begin by taking steps to improve the way they handle more traditional forms of information, according to Redman. The first step is to get the data quality house in order.
“Almost all organizations need enormous quality improvement,” Redman said. “It is hard work to do because you have to find and eliminate the root causes of error [and get] deeply into the processes that create data. The good news is that the benefits are fast and immediate in terms of lower cost and improved decisions and so forth.”
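Finding root causes starts with measuring where errors actually occur. As a minimal sketch -- this is an invented example, not Redman's method, and the record set and field names are hypothetical -- a first pass might profile a data set for missing and duplicate values before tracing those errors back to the processes that created them:

```python
def profile_quality(records, key):
    """Return simple quality metrics for one field: missing values and duplicates."""
    values = [r.get(key) for r in records]
    missing = sum(1 for v in values if v in (None, ""))
    present = [v for v in values if v not in (None, "")]
    duplicates = len(present) - len(set(present))
    return {"missing": missing, "duplicates": duplicates}

# Hypothetical customer records with typical data-entry problems
customers = [
    {"id": "C1", "email": "a@example.com"},
    {"id": "C2", "email": ""},                # missing value
    {"id": "C3", "email": "a@example.com"},   # duplicate of C1
]
print(profile_quality(customers, "email"))  # {'missing': 1, 'duplicates': 1}
```

Metrics like these only locate the symptoms; in Redman's framing, the hard work is following each symptom upstream to the process that introduced it.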
The next step is to build up a strong organizational capability around data management. That might involve finding people with the right skills, but perhaps more importantly, it means developing a cadre of managers who believe in making decisions based on facts, as opposed to gut feelings, Redman said.
The third step is to make sure that data processing technologies are up to snuff, and that may require an investment in data discovery tools, which are used to analyze disparate data sources in search of potentially valuable business patterns.
“You need to discover something novel in [data] if you’re going to get anywhere, but making money from that discovery is the most demanding part,” Redman said. “Organizations need to build up the data and the organizational and the process capabilities so that when they’re ready they can reap the benefits that big data offers.”