As SQL-based relational databases became a mainstay in organizations over a period of several decades, general processes arose for tuning, monitoring and managing their performance, along with a variety of software tools that automate the work. But things are different with NoSQL databases.
NoSQL technology is still in its relative infancy, and the wide range of NoSQL software types and product offerings makes it challenging to implement broad performance management approaches. As a result, managing NoSQL performance is still an emerging art, and likely will remain so for some time to come.
"People have known the patterns for how to tune relational database systems for a long time," said Pramod Sadalage, a principal consultant at software development and advisory services provider ThoughtWorks. "But those kinds of patterns have not yet matured in NoSQL. What patterns there are, are very specific to the individual database."
That means the quest for solid NoSQL performance should start at the beginning of a project, according to Sadalage and other database experts. They said IT managers, architects and developers evaluating NoSQL options have to choose with care and select the right database for the particular job at hand.
NoSQL databases are often described as "fit for purpose," with different technologies best suited for specific kinds of applications. And there's plenty of ground to cover when trying to get the lay of the NoSQL land. The primary types of NoSQL systems include key-value stores, document databases, wide column (or column-family) stores and graph databases.
Some consensus is forming around which type to apply when. For example, wide column databases are a match for write-heavy applications where brief inconsistency in database replicas isn't a problem. Document databases can serve in web applications where JSON data structures are prominent and flexible updates are important. Key-value databases enable very fast data access via simple keys for uses such as caching data. Graph databases are a fit where relationships between data elements form networks that can be shown in graph form, such as in business process management or social networking. (For more on apt NoSQL use cases, read an excerpt from author Dan Sullivan's NoSQL book.)
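As a rough illustration of how the four categories differ, the sketch below models each one with plain Python structures. It is a simplified picture rather than any product's actual storage format, and every key, field name and value in it is made up for the example.

```python
# Illustrative shapes only; all names and values here are hypothetical.

# Key-value: an opaque value stored behind a simple key (e.g. a cache entry).
key_value = {"session:42": b"serialized-session-bytes"}

# Document: a nested, JSON-like record whose fields can vary per document.
document = {
    "_id": "user-42",
    "name": "Ada",
    "tags": ["admin"],
    "address": {"city": "Paris"},
}

# Wide column: row key -> column family -> individual columns.
wide_column = {
    "user-42": {
        "profile": {"name": "Ada"},
        "events": {"2016-01-01T10:00": "login"},
    }
}

# Graph: relationships stored as (source, relation, target) edges.
edges = [("ada", "follows", "ben"), ("ben", "follows", "cy")]


def followers_of(person):
    """Return everyone with a 'follows' edge pointing at the given person."""
    return [src for src, rel, dst in edges if rel == "follows" and dst == person]
```

The point of the comparison is that each shape makes one access pattern cheap: a direct key lookup, a nested-field read, a per-row column read or a relationship traversal.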
Within each category, though, there is a variety of individual products and open source technologies that prospective users need to choose among. While a handful have pulled ahead of the pack on adoption -- MongoDB and Cassandra, for example -- there's no universal answer to NoSQL application needs.
Use NoSQL databases wisely
"People are used to a relational world where the database can be used for most anything. But you better be sure to go NoSQL for a specific use case and to make sure the NoSQL database you choose is designed to provide reasonable performance for that use case," said Craig Mullins, president and principal consultant at Mullins Consulting.
To get the kind of database performance they're looking for, IT teams "really need to come to understand how these things work" before deploying one, Mullins said. He pointed to the Cassandra wide-column database as an example of how application fit relates to processing performance in NoSQL environments.
"Cassandra is designed so a row, or record, holds everything," Mullins said. "That's a case where, in the relational world, a dozen tables might be used. But it all gets stored in one wide record in a Cassandra database." That's good for performance if you need to access all the data in a particular record, he noted, citing a customer credit rating application as a case in point. But he added that when business users require something else, such as a report on all customers, the one-record-holds-everything approach can cause performance to take a hit.
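The tradeoff Mullins describes can be sketched with an in-memory stand-in for a wide-record store. This is a toy model under stated assumptions, not Cassandra's API: the `wide_rows` dictionary, the customer fields and both function names are hypothetical. It shows only that a single-customer lookup touches one record, while a report across all customers has to visit every record.

```python
from collections import defaultdict

# Hypothetical wide-record store: one key maps to one record that holds
# everything about a customer (data a relational design might split up).
wide_rows = {
    "cust-001": {"name": "Ada", "region": "EU", "credit_score": 710, "open_loans": 2},
    "cust-002": {"name": "Ben", "region": "US", "credit_score": 640, "open_loans": 0},
    "cust-003": {"name": "Cy", "region": "EU", "credit_score": 580, "open_loans": 1},
}


def credit_rating(customer_id):
    """Single-key read: touches exactly one record."""
    return wide_rows[customer_id]["credit_score"]


def average_score_by_region():
    """Report across all customers: has to visit every record in the store."""
    totals = defaultdict(lambda: [0, 0])  # region -> [score sum, customer count]
    rows_read = 0
    for row in wide_rows.values():
        rows_read += 1
        totals[row["region"]][0] += row["credit_score"]
        totals[row["region"]][1] += 1
    report = {region: total / count for region, (total, count) in totals.items()}
    return report, rows_read
```

In this toy version the report reads all three records to answer one question; in a real deployment, the cost of that full scan grows with the number of customers, which is the performance hit the one-record-holds-everything design can incur.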
In addition, the application elements that come standard with SQL databases often need to be hand-coded in NoSQL systems. "One of the things about obtaining performance with NoSQL is that it's a lot more time-consuming than some people assume. A lot of custom code has to be written," said Rick Sherman, founder of consultancy Athena IT Solutions.
And in a lot of cases, that work is done by application developers who may not be versed in the inner workings of the database they're utilizing. "Developers aren't database designers," Mullins said. "They design based on their program needs, and not for general data uses."
NoSQL skills not standard issue
Even among database administrators (DBAs), the limited availability of NoSQL skills is an issue that data management teams face in trying to ensure that NoSQL performance is up to the application task. The richness of NoSQL technology options hasn't been immediately converted into a corresponding richness of resources, Sadalage said. "You can find an Oracle DBA with 15 years of experience pretty easily. That's not possible with Cassandra. The database itself is only about 8 years old."
The first push on NoSQL usage involved operational applications, but more attention is now being directed at analytics uses as well. And when you add in analytics and reporting requirements, gaining a speed edge with a NoSQL database becomes even more difficult, according to Mike Bowers, a principal enterprise information architect at The Church of Jesus Christ of Latter-day Saints.
Bowers agreed that the first step in a deployment -- studying the available options and learning what works where -- remains the most important one in enabling good NoSQL performance. "Most NoSQL databases are not true databases," he said. "They're data engines that allow developers to build their own custom database that's optimized specifically for one application."
Sadalage offered a more trenchant take on the current -- and likely ongoing -- situation. "Pick the right thing for the right job," he said. "Don't pick the wrong thing and then complain about it."