How to select the best DBMS software: A buyer's guide
A collection of articles that takes you from defining technology needs to purchasing options
The database management system (DBMS) is the heart of today's operational and analytical business systems. Data is the lifeblood of the organization and the DBMS is the conduit by which data is stored, managed, secured and served to applications and users. But there are many different forms and types of DBMS products on the market, and each offers its own strengths and weaknesses.
Relational databases, or RDBMSes, became the norm in IT more than 30 years ago as low-cost servers became powerful enough to make them widely practical and relatively affordable. But some shortcomings became more apparent in the Web era and with the full computerization of business and much of daily life. Today, IT departments trying to process unstructured data or data sets with a highly variable structure may also want to consider NoSQL technologies. Applications that require high-speed transactions and rapid response rates, or that perform complex analytics on data in real time or near real time, can benefit from in-memory databases. And some IT departments will want to consider combining multiple database technologies for some processing needs.
The DBMS is central to modern applications, and choosing the proper database technology can affect the success or failure of your IT projects and systems. Today's database landscape can be complex and confusing, so it is important to understand the types and categories of DBMSes, along with when and why to use them. Let this document serve as your roadmap.
DBMS categories and models
Until relatively recently, the RDBMS was the only category of DBMS worth considering. But the big data trend has brought new types of worthy DBMS products that compete well with relational software for certain use cases. Additionally, an onslaught of new technologies and capabilities is being added to DBMS products of all types, further complicating the database landscape.
The RDBMS: Despite the newcomers, the undisputed leader in terms of revenue and installed base continues to be the RDBMS. Based on the sound mathematics of set theory, relational databases provide data storage, access and protection with reasonable performance for most applications, whether operational or analytical in nature. For more than three decades, the primary operational DBMS has been relational, led by industry giants such as Oracle, Microsoft (SQL Server) and IBM (DB2). The RDBMS is reliable and adaptable to most use cases; it also has been bolstered by years of use in industry applications at Fortune 500 (and smaller) companies. Of course, such stability comes at a cost: RDBMS products are not cheap.
Support for ensuring transactional atomicity, consistency, isolation and durability -- collectively known as the ACID properties -- is a compelling feature of the RDBMS. ACID compliance guarantees that all transactions are completed correctly or that a database is returned to its previous state if a transaction fails to go through.
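As a rough illustration of atomicity, the sketch below uses Python's built-in sqlite3 module. The accounts table, the account names and the transfer helper are hypothetical and chosen only for this example; other RDBMS products expose the same idea through their own drivers. A transfer that would overdraw an account fails partway through, and the database is returned to its previous state.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the credit above has been rolled back as well

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical schema and data, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

print(transfer(conn, "alice", "bob", 30))   # a valid transfer is committed
print(transfer(conn, "alice", "bob", 500))  # an overdraft is rejected and rolled back
print(balances(conn))                       # only the first transfer is reflected
```

The failed transfer leaves no trace: the credit to the receiving account is undone along with the rejected debit.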
Given the robust nature of the RDBMS, why are other types of database systems gaining popularity? Web-scale data processing and big data requirements challenge the capabilities of the RDBMS. Although RDBMSes can be used in these realms, DBMS offerings with more flexible schemas, less rigid consistency models and reduced processing overhead can be advantageous in a rapidly changing and dynamic environment. Enter the NoSQL DBMS.
The NoSQL DBMS: Where the RDBMS requires a rigidly defined schema, a NoSQL database permits a flexible schema, in which every data element need not exist for every entity. For loosely defined data structures that may also evolve over time, a NoSQL DBMS can be a more practical solution.
Another difference between NoSQL and relational DBMSes is how data consistency is provided. The RDBMS can ensure the data it stores is always consistent. Most NoSQL DBMS products offer a more relaxed, eventually consistent approach (though some provide varying consistency models that can enable full ACID support). To be fair, most RDBMS products also offer varying levels of locking, consistency and isolation that can be used to implement eventual consistency, and many NoSQL DBMS products are adding options to support full ACID compliance.
So NoSQL addresses some of the problems encountered by RDBMS technologies, making it simpler to work with large amounts of sparse data. Data is considered to be sparse when not every element is populated and there is a lot of "empty space" between actual values. For example, think of a matrix with many zeroes and only a few actual values.
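To make the matrix example concrete, here is a minimal Python sketch that stores only the populated cells of a mostly empty matrix. The function names and sample values are assumptions made for illustration; they do not represent how any particular NoSQL product stores data internally.

```python
def to_sparse(dense):
    """Keep only the non-zero cells, keyed by (row, column)."""
    return {
        (r, c): value
        for r, row in enumerate(dense)
        for c, value in enumerate(row)
        if value != 0
    }

def get(sparse, r, c):
    """Read a cell; anything not stored is treated as zero."""
    return sparse.get((r, c), 0)

# A matrix with many zeroes and only a few actual values.
dense = [
    [0, 0, 0, 7],
    [0, 0, 0, 0],
    [0, 3, 0, 0],
]

sparse = to_sparse(dense)
total_cells = sum(len(row) for row in dense)
print(f"stored {len(sparse)} of {total_cells} cells")
print(get(sparse, 0, 3), get(sparse, 1, 1))
```

Only the populated cells take up space, which is the same reason a flexible schema suits entities for which many attributes are simply absent.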
But while certain types of data and use cases can benefit from the NoSQL approach, using NoSQL databases can come at the price of eliminating transactional integrity, flexible indexing and ease of querying. Further complicating the issue is that NoSQL is not a specific type of DBMS, but a broad descriptor of four primary categories of different DBMS offerings:
- Document store
- Graph database
- Key-value store
- Wide column store
Each of these types of NoSQL DBMS uses a different data model with different strengths, weaknesses and use cases to consider. A thorough evaluation of NoSQL DBMS technology requires more in-depth knowledge of each NoSQL category, along with the data and application needs that must be supported by the DBMS.
The in-memory DBMS: One last major category of DBMS to consider is the in-memory DBMS (IMDBMS), sometimes referred to as a main memory DBMS. An IMDBMS relies mostly on memory to store data, as opposed to disk-based storage.
The primary use case for the IMDBMS is to improve performance. Because the data is maintained in memory, as opposed to on a disk storage device, I/O latency is greatly reduced. Mechanical disk movement, seek time and transfer to a buffer can be eliminated because the data is immediately accessible in memory.
An IMDBMS can also be optimized to access data in memory, as opposed to a traditional DBMS that is optimized to access data from disk. IMDBMS products can reduce overhead because the internal algorithms usually are simpler, with fewer CPU instructions.
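The basic idea can be tried with SQLite's in-memory mode through Python's sqlite3 module. This is only a sketch: SQLite is not a commercial IMDBMS product, the readings table and its values are hypothetical, and the snapshot to disk at the end is an assumption about one way memory-resident data might be persisted.

```python
import os
import sqlite3
import tempfile

# ":memory:" keeps the whole database in RAM; no disk I/O occurs on queries.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
mem.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 1.5), ("b", 2.5), ("a", 3.0)],
)
mem.commit()

# The query is answered entirely from memory.
total = mem.execute(
    "SELECT SUM(value) FROM readings WHERE sensor = 'a'"
).fetchone()[0]

# Assumed durability step: copy the in-memory database to a file on disk.
path = os.path.join(tempfile.mkdtemp(), "snapshot.db")
disk = sqlite3.connect(path)
mem.backup(disk)
disk.close()

# Reopen the snapshot to confirm the data survived outside of memory.
restored = sqlite3.connect(path)
count = restored.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
restored.close()

print(total, count)
```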
A growing category of DBMS is the multi-model DBMS, which supports more than one type of storage engine. Many NoSQL offerings support more than one data model -- for example, document and key-value. RDBMS products are evolving to support NoSQL capabilities, such as adding a column store engine to their relational core.
Other DBMS categories exist, but are not as prevalent as relational, NoSQL and in-memory:
- XML DBMSes are architected to support XML data, similar to NoSQL document stores. However, most RDBMS products today provide XML support.
- A columnar database is a SQL database system popular for business intelligence and data warehousing because it is optimized for reading a few columns of many rows at once (and is not optimized for writing data).
- Popular in the 1990s, object-oriented (OO) DBMSes were designed to work with OO programming languages, similar to NoSQL document stores.
- Pre-relational DBMSes include hierarchical systems -- such as IBM IMS -- and network systems -- such as CA IDMS -- running on large mainframes. Both still exist and support legacy applications.
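The difference between row-oriented and columnar storage noted in the list above can be sketched in a few lines of Python. The order data and the to_columns helper are hypothetical, and real columnar products add compression and other optimizations that are not shown here.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 80.0},
    {"order_id": 3, "region": "east", "amount": 50.0},
]

def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

# Column-oriented layout: each column is stored together.
columns = to_columns(rows)

# An analytical query such as "total order amount" needs one column only.
total_from_rows = sum(row["amount"] for row in rows)  # walks every full record
total_from_columns = sum(columns["amount"])           # reads a single list

print(total_from_rows, total_from_columns)
```

Both layouts give the same answer, but the columnar one touches only the values the query needs. Adding a new order, by contrast, means appending to every column list, which hints at why such systems are not optimized for writing data.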
As you examine the DBMS landscape, you will inevitably encounter many additional issues that require consideration. At the top of that list is platform support. The predominant computing environments today are Linux, Unix, Windows and the mainframe. Not every DBMS is supported on each of these platforms.
Another consideration is vendor support. Many DBMS offerings are open source, particularly in the NoSQL world. The open source approach increases flexibility and reduces initial cost of ownership. However, open source software typically comes without vendor support unless you purchase a commercial distribution. Total cost of ownership can also be higher when you factor in the related administration, support and ongoing costs.
You might also choose to reduce the pain involved in acquisition and support by using a database appliance or deploying in the cloud. A database appliance is a preinstalled DBMS sold on hardware that is configured and optimized for database applications. Using an appliance can dramatically reduce the cost of implementation and support because the software and hardware are designed to work together.
Implementing your databases in the cloud goes one step further. Instead of implementing a DBMS at your shop, you can contract with a cloud database service provider to implement your databases using the provider's service. This is referred to as DBaaS, or database as a service.
If your site is considering a DBMS, it's important to determine your specific needs as well as examine the leading DBMS products in each category discussed here. Doing so will require additional details on each of the different types of DBMS, as well as a better understanding of the specific use cases for which each database technology is optimized. Indeed, there are many variables that need to be evaluated to ensure you make a wise decision when procuring database management system software.