In-memory databases have a long history, but they've gained increasing prominence in recent years thanks to falling memory prices and the introduction of new technologies. SearchDataManagement spoke with data management and business intelligence consultant William McKnight to get his view of in-memory technology and processing methods as they relate to data in general and relational databases in particular.
McKnight, president of McKnight Consulting Group, said in-memory database software that stores data in main memory, instead of on disks, is geared toward applications where high performance is especially valued. Putting a database in memory can speed up response times for business users and enable deeper data analysis, he suggested. He spoke as well about how established relational databases are now being touched by the addition of in-memory processing capabilities. Excerpts from the interview follow.
What kinds of applications are a good fit for in-memory database technology?
William McKnight: I'd say it's your high-performance applications, and they can be in the operational area or they can be in the analytical area. These are applications that hit that sort of sweet spot of 2 terabytes [of data] to some other single-digit terabyte number.
Some people think that disk storage eventually will hold only the coldest of the coldest data. What do you think? Will memory absorb more and more of the data load?
McKnight: Yes, I agree with that. Memory gives you access to data tens, hundreds, thousands -- even tens of thousands -- of times faster than [hard disk drives], and it holds a similar order of advantage over [solid-state drives]. As costs decline, memory in a lot of ways is becoming the new disk -- and HDD is becoming more or less the new tape. But spending more on the data layer can be a tough decision.
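The performance gap McKnight describes is easy to demonstrate on a small scale. The sketch below uses Python's built-in sqlite3 module purely as an illustration -- it is not one of the products discussed -- and the table name and row count are invented for the example. SQLite's ":memory:" mode keeps the entire database in RAM, so reads never touch disk.

```python
import sqlite3
import time

def build_and_query(conn, n=100_000):
    """Create a table, load n rows, and time an aggregate query."""
    conn.execute("CREATE TABLE trades (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany(
        "INSERT INTO trades (amount) VALUES (?)",
        ((float(i % 1000),) for i in range(n)),
    )
    conn.commit()
    start = time.perf_counter()
    total = conn.execute("SELECT SUM(amount) FROM trades").fetchone()[0]
    return total, time.perf_counter() - start

# ":memory:" keeps the whole database in RAM -- no disk I/O on reads.
mem = sqlite3.connect(":memory:")
total, elapsed = build_and_query(mem)
print(f"sum={total:.0f} in {elapsed * 1000:.2f} ms")
```

Running the same function against a file-backed connection (`sqlite3.connect("trades.db")`) and comparing the timings gives a rough, single-machine feel for the in-memory advantage, though real products add far more than a faster storage layer.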
As with many technologies, Wall Street has been one of the industries leading the way in deploying in-memory databases. You've also pointed to in-memory applications at telcos and for fraud detection. What about other uses in what might be described as the "classic enterprise"?
McKnight: Enterprises can use in-memory to speed response times from minutes down to seconds and improve data reliability -- to optimize their business and to let analysts do deeper integration in the limited time window they have to work with the data. They can actually drill to deeper levels -- they can get closer to root cause analyses, and the like. When you can do that, you're hopefully producing a better business. Data is power -- as long as it can be utilized. But if analysts have to slog through slow performance or unarchitected environments where they have to do the gluing together of data themselves, that's what they'll spend their time doing. It won't be on analysis.
So you have to have good architecture no matter what the storage platform is. But if that storage platform is in-memory, you may be able to get to a deeper level of [user] need. You can perform simulations that present business forecasts much more quickly. And if your reports are running slowly and you're closing the books late, or just later than you'd like, maybe in-memory systems can speed that up.
What things change for IT decision makers when in-memory technology comes into play?
McKnight: The difference would be a bigger budget for storage as well as different configurations in the server room -- if you still have one. And more importantly, on the user side of things, you have to improve your data science to take advantage of the better performance. It's important to note that it isn't good enough to say "my business isn't asking for that kind of performance." As IT, you need to be ahead of the business -- showing them the possibilities.
Oracle, Microsoft and IBM have announced in-memory options for their flagship relational databases. What do those features bring with them?
McKnight: They bring more performance with a faster storage layer. Still, the in-memory options are at a different point on the "functionality maturity curve" than their legacy brothers. With the mainstream relational databases, both the database engine and the storage layer are well developed. With the in-memory options, there are different levels of integration today between the database management systems and the storage layer.
The approaches vary. SAP HANA is all in memory, though, of course, it's backed by disk for failover. With Oracle and others, you can identify columns that will be pushed into separate partitions that will be redundant to the core database, but the columns will be in memory. For its part, IBM has BLU Acceleration functionality that can be added onto the core [DB2] database.
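The "redundant in-memory copy" idea can be sketched with SQLite -- again a stand-in for illustration, not Oracle's or IBM's actual mechanism, and all names here (core.db, orders, hot) are invented. The pattern: attach an in-memory database alongside the on-disk one and mirror a hot table into it, so analytical queries against the copy never touch disk while the core database remains the system of record.

```python
import sqlite3

# Core database on disk (the path is invented for the example).
core = sqlite3.connect("core.db")
core.execute("DROP TABLE IF EXISTS orders")
core.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
core.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("east", 10.0), ("west", 20.0), ("east", 30.0)],
)
core.commit()

# Attach an in-memory database and mirror the hot table into it,
# redundant to the on-disk copy. Queries against hot.orders stay in RAM.
core.execute("ATTACH DATABASE ':memory:' AS hot")
core.execute("CREATE TABLE hot.orders AS SELECT id, region, amount FROM orders")

rows = core.execute(
    "SELECT region, SUM(amount) FROM hot.orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 40.0), ('west', 20.0)]
```

In the commercial products this redundancy is managed automatically and kept transactionally consistent with the base tables; the sketch only shows the shape of the idea.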
In-memory is also appearing in the so-called NewSQL databases. You can do an in-memory database that is scale-up and scale-out, or total scale-out as with NewSQL. That is for high-performance applications where you have to rapidly ingest and work on the data. The NewSQL databases are purpose-built for [fast] throughput.
It certainly seems like the growth of in-memory computing is part of bigger IT trends -- ones that see a host of new technologies becoming part of the data management tool chest.
McKnight: The requirements lately haven't been of the vanilla variety that can be met by traditional methods. With in-memory as with other data tools, you still want good design, good architecture, data quality, data governance and the like. But in-memory can cover up minor flaws in these things. It allows you to get good performance. If you measure it in cost of processing, instead of just cost of storage, you're going to come out ahead.
The thing I try to emphasize is that you have to project onto the application where it is going to go over time. That's the leadership responsibility I want to challenge everybody with.