Are traditional relational database vendors like Oracle and IBM too steeped in old ways to provide real value in 2012 and beyond? Michael Stonebraker thinks so -- and he’s backing up that sentiment with cold, hard cash.
A longtime staple of the database community, Stonebraker is a computer scientist perhaps best known as one of the creators of the Ingres relational database. He has since gone on to help found and invest in data management companies like Illustra, Cohera and Vertica. He says his newest venture, high-performance database provider VoltDB, where he is co-founder and CTO, is in part a response to several trends shaping the world of data management.
SearchDataManagement.com got on the phone with Stonebraker to learn more about those trends and get his predictions for the data management market in 2012 and beyond. Here are some excerpts from that conversation:
What do you think are some of the key trends shaping the world of data management in 2012?
Michael Stonebraker: The first one is one size no longer fits all, and that is going to give heartburn to the current [relational database] elephants. The second one is “big data” means three different things and you’ve got to remember which one you’re talking about. The third one is that ACID [atomicity, consistency, isolation, durability] is a really good idea, so don’t throw the baby out with the bathwater. The fourth thing is: think memory. It’s the new disk. The last thing is that -- in stages -- the cloud is really the answer to save money.
Let’s delve into each of these topics a bit further. What do you mean by “one size no longer fits all” and why should relational database vendors be concerned?
Stonebraker: The general idea is that if you go talk to your Oracle salesman and say that you have a database problem, he’ll say that [Oracle 11g] or [Oracle 9i] is the answer. The major relational vendors are selling a single solution. They have a hammer and therefore everything looks like a nail. I think that is simply not going to hold true in the future. In every vertical market I can think of, there is a way to beat the major legacy RDBMS [relational database management system] vendors’ code by [up to] two orders of magnitude. How to do that varies from market to market. But I see that there will be perhaps half a dozen database architectures off into the future that are specialized for specific vertical markets and that the one-size-fits-all vendors are just not going to be competitive in that world.
What are the three meanings of “big data” and what advice do you have for organizations that want to gain insight from enormous data stores?
Stonebraker: There is a huge amount of yak these days about big data, and big data means lots of different things to different people. To me the best way to think about big data is to talk about it in terms of three Vs. Big data could mean big volume. You’ve got terabytes going on petabytes of data. It could mean that you drink from a fire hose and are getting too much velocity thrown at you. And the third one is that you’ve got too much variety of data sources. You’re trying to integrate one or two thousand separate data sources and you’re dying trying to do data integration. [The point is that] big data means three different things and any given vendor is in only one of these three worlds. It’s very helpful to think about which one they’re in so you can ask them the right kinds of questions.
What are your thoughts on ACID?
Stonebraker: I’m a huge fan of ACID and I think ACID will endure a long time in spite of what the NoSQL guys are saying. There is no question that in the legacy relational database systems, ACID is a costly thing to provide and so the NoSQL guys say, “Well, we’ll go fast by getting rid of it.” My feeling is that there are a huge number of problems which require ACID. My point of view and Volt’s point of view is that you don’t have to give up ACID to go fast. What you need to do is engineer a completely different database system from what the elephants have done a long time ago. ACID is good. Don’t throw out the baby with the bathwater. Throw out the bathwater.
Why do you believe that memory is the new disk?
Stonebraker: What this basically means is that -- outside of the big-volume market where people want terabytes going on petabytes -- in most database applications you want to start thinking main memory just because the plunging cost of main memory makes that very practical. You might also ask if [flash memory] is going to make it. I think that if you have something that almost fits in main memory, but doesn’t quite, you might look at flash as another technology. You could say that flash is the new disk in certain worlds. But the point is that you don’t have to universally say that disk is the answer.
Why do you think organizations should consider building their own private clouds in 2012?
Stonebraker: [Organizations today] tend to have a lot of silos running at fairly low utilization. It makes absolutely perfect sense right now for any such enterprise to buy a private cloud -- meaning a collection of machines in a big server farm inside the firewall -- and put virtualization on that server farm and have your data silos share that infrastructure. As long as you’re inside the firewall, all of the enterprise objections to running on the public cloud go away. There is no issue on the data being somewhere else. [But sooner or later] prices are going to reflect costs and sooner or later any enterprise is going to be able to save a boatload of money by moving to a public cloud. Sooner or later we’re all going to compute on big public clouds to the extent that we are legally able to do that. So I see us moving from data silos to clouds inside the firewall and over time to public clouds.