Structuring a big data strategy
While new to many people, big data technologies like Hadoop, NoSQL and in-memory analytic engines have gained greater use and some maturity of late, according to Jeff Kelly, lead big data analyst at Wikibon. Just how big will big data be? Wikibon sized the big data market at $11.4 billion in 2012, and expects it to exceed $47 billion by 2017.
SearchDataManagement talked with Kelly recently about Wikibon's forecast and the thinking behind it. The conversation concerned the spread of big data, its uses, the changing vendor landscape and how the big data poster child known as "NoSQL" is looking a bit more like SQL.
When you look at the big data market, you spread it out beyond just Hadoop. Would you articulate some of the elements that you looked at in your research?
Jeff Kelly: We thought long and hard about this. Certainly Hadoop is a part of it. It is important, but only one part of it. Big data touches pretty much every aspect of the data management stack. Ultimately, we decided the most useful way to structure our market sizing was to focus on those technologies that were needed because traditional data management technologies just couldn't process the new volume, type and velocity of data. Hadoop and NoSQL make sense there.
We also decided to include the hardware that supports those types of platforms, as well as related data management software that isn't necessarily brand-new but is being applied to big data workloads. You need, for example, to use data integration software to get data into Hadoop. There is no single unified big data platform yet.
We don't count traditional [business intelligence] reporting software, but we do count visualization software, which may predate big data but which people are now applying to interesting data sources that they crunched with Hadoop and some other big data platforms. Big data is partially technology, but it is also a mind-set. People are using big data and even some non-big-data tools to really explore data in new ways.
What are the use cases we are seeing for big data? How did we get here?
Kelly: Well, it started out with the Web companies. Google kind of invented what we know today as big data. Then Yahoo took off with it and created Hadoop. Companies in the Web space got it started, but we see it now moving into the 'bio-pharma' field with genome-type research, for example.
Financial services companies are some of the early adopters as well, working, for example, with Hadoop to try and gain any advantage they can over competitors making trades. A little more slowly, we are seeing retailers picking up on this trend as well. But advertising and Web marketing are more than a niche. People may think this is hyperbole, but I really can't think of an industry that big data is not going to impact at some point.
From your point of view, what does the big data market landscape look like for the big vendors such as IBM and Oracle?
Kelly: The big vendors, while they are not making huge sums, are really starting to invest in this, because they understand this is where their customers are going. They are just starting to ramp up, but IBM has been on this trend for a while. They have been buying analytic startups for a long time. They have been focused on their Smarter Planet initiative, which is really all about big data and making better use of data. IBM is well ahead of the competition in terms of revenue. [Wikibon estimates Big Blue garnered more than $1 billion from big data applications last year.]
Oracle is using Cloudera in their big data appliance. They have their own NoSQL database. And they have the whole Exadata line. We debated a lot inside about whether we should consider that big data. We decided to give them a slice of that revenue related to big data, based on the types of workloads we were seeing.
But Oracle is one of the vendors most at risk here. The whole 'NoSQL-slash-Hadoop' model is about scale-out, clusters and commodity hardware. And Oracle is all about scale-up and proprietary hardware -- a really expensive big box. I am really interested to see how they will respond. So far, they have made token efforts -- that's what I would call them. Oracle is well entrenched, but I do see their model conflicting with the open source commodity model that we really consider as big data.
Doesn't it seem that the NoSQL, NewSQL and Hadoop people are moving as quickly as they can to 'look like SQL'?
Kelly: The big trend among Hadoop vendors right now is trying to apply SQL capabilities to Hadoop -- bringing 'SQL' to 'NoSQL.' People understand the benefits of NoSQL to scale out applications to deal with unstructured data, but if you want to bring this to the enterprise, you are going to have to meet a threshold of enterprise-level uptime, security and more. We are moving to that phase now where NoSQL will grow up. We are starting to see that happen.