Assuming that plans turn into actual deployments, big data environments are beginning to pick up momentum toward achieving mainstream status, according to the results of TechTarget Inc.'s 2013 Analytics & Data Warehousing Reader Challenges & Priorities Survey. In the survey, conducted earlier this year, only 14% of 540 respondents said their organizations had big data management and analytics programs in place. But another 27% said their companies planned to launch big data projects within the next 12 months, while 38% said they were considering such initiatives.
Despite the amount of attention being devoted to Hadoop, NoSQL databases and other big data technologies, actual adoption rates have so far been in the single or low double digits, with large Internet companies accounting for many of the deployments. But market watchers also see big data adoption starting to accelerate on a broader basis.
"I think big data has certainly crossed the chasm of kind of going from market-speak and market buzz to being viable and a critical part of companies' strategies for analytics and operational workloads," said Shawn Rogers, a business intelligence and data warehousing analyst at Enterprise Management Associates Inc. in Boulder, Colo. A survey done jointly by EMA and 9sight Consulting in 2013 found that the industries making the most use of big data include financial services, manufacturing, healthcare and retail, Rogers said.
One company embarking on a big data project is Phenix Energy Group, a Palm Harbor, Fla.-based company that is building a set of pipelines, tank farms and deep water ports to transport crude oil products from the Atlantic coast of Central America to the Pacific side. The initiative is a long-term proposition, though: It likely will take about two years to fully build out the installation and even longer to start seeing tangible business benefits from the big data program, said Bruce Perrin, chief operating officer and acting chief information officer at Phenix.
"When we get far enough down the road that we have sufficient data to do it, we're going to be looking at operational information over time, and we probably need anywhere from three to five years' worth of accumulated data before we have anything that we can meaningfully extract information from," Perrin said.
The company plans to collect data from about 30,000 sensors along the underwater and terrestrial pipelines; the information will be used to improve operational efficiency and revise business processes. Transmission delays of any kind -- such as equipment maintenance and the time it takes to switch ships at terminals or change work crews -- could cost Phenix as much as $1 million an hour, Perrin said. And since the pipeline system will have a finite capacity, reducing downtime is the only way to increase throughput. "We're like a railroad: You can only pull so many cars with the train," he explained. "So it's how well we pack the train and how quickly we get it out of the station on both ends that determines how effective we are."
Working with big data
According to the TechTarget survey, structured transaction data is far and away the most prevalent information being used in big data applications. When asked what types of data their organizations collect or plan to collect as part of their big data programs, 82% of respondents cited transaction data. That was nearly twice the share of the second most-cited data type: customer emails, letters and survey responses, at 43%. Other types of data that were selected by more than one-third of respondents included social media activity data, Internet clickstream records, and computer and network log files.
Going forward, Rogers expects to see a shift toward more multistructured data sets, including real-time streams of information in addition to the data types in the second tier of the survey responses. "That's the kind of information we always wished we could analyze," he said. "The advent of big data allows us to address these things in a much more economical fashion." And, he added, Hadoop clusters and related technologies can handle processing workloads on such data "that frankly were just impractical to do earlier."
Nonetheless, the survey somewhat surprisingly showed that mainstream relational databases and data warehouses were the most common technologies supporting big data environments. They were cited by 55% of the respondents with active or planned big data projects, followed by analytical databases at 52% and data warehouse appliances at 46%. Hadoop systems and NoSQL databases were lower on the list, coming in at 41% and 21%, respectively.
Pamela Arya, CEO of big data software vendor Optensity Inc., said she thinks most companies will utilize a mix of technologies. "You should use the right technology and not assume that one solution fits all," she said. Rogers agreed, noting that almost 60% of the user organizations EMA has surveyed were using two or three different platforms to support their big data strategies.
Perrin said that of the technologies he has looked at thus far, Hadoop seems to be the most logical choice for Phenix. One reason he gave is that it's an open source environment rather than a proprietary platform that would "saddle us with [a vendor's] bureaucracy, their infrastructure and their egregious costs."
Big data projects need business goals
In the TechTarget survey, gaining competitive advantages over business rivals and improving organizational efficiency and profitability tied as the most-cited goals of big data deployments; both were chosen by 27% of respondents. Whatever the primary goal is, analysts cautioned that companies shouldn't jump into a big data initiative without first identifying a specific business problem or need that the effort could help solve.
"The right question to ask is from the business side: 'What kind of business problem are we having?' To look for technology in search of a business problem is absolutely the wrong approach," said Boris Evelson, an analyst at Forrester Research Inc. in Cambridge, Mass.
Similarly, Perrin emphasized that Phenix took the time up front to really focus on what it's trying to accomplish with its big data program. That process "has helped clarify, at least to a certain degree, what tools are necessary to get it accomplished," he said. "I would say it probably relates perfectly to the old adage, 'When all you have is a hammer, everything begins to look like a nail.' You can't know what tool to use until you know what you're going to use it on."
And for companies just getting started on big data projects, Rogers advised getting experience with a targeted deployment first. "Start small, start simple," he said. "Be successful with your first pilot project and then go on that."