How to select the best DBMS software: A buyer's guide
A collection of articles that takes you from defining technology needs to purchasing options
Once you've evaluated the different types of database management system products, the next step is deciding which best fits your needs: a relational database management system (RDBMS), NoSQL or in-memory. Here we examine the relational DBMS, which continues to be the leading DBMS category; separate articles will examine NoSQL and in-memory DBMSes.
Relational DBMS strengths
When purchasing a DBMS, it's wise to evaluate RDBMS options first, as they're practicable across a wide variety of use cases. Their sound theoretical foundation helps protect and ensure ongoing access to data for multiple types of applications. Relational theory, developed by Ted Codd at IBM in the early 1970s, is based on the mathematics of set theory, which delivers rigor and accuracy to data access and manipulation. The mathematical basis and foundational theory of relational technology is unique within the world of database systems.
Most types of middleware, software integration products and management tools are available for the RDBMS; however, this isn't always the case for other, emergent forms of DBMS products.
Also, there are large numbers of SQL programmers available to support development efforts for RDBMSes. Although not required, all RDBMSes use SQL as a standard data access language. Despite variations in implementation of the SQL language from DBMS to DBMS, most of the language elements are the same, regardless of the RDBMS being used.
Perhaps the strongest capability of the RDBMS is its robust implementation of transactional atomicity, consistency, isolation and durability (ACID). The ACID properties guarantee that database transactions are processed reliably. Therefore, a transaction, when executed on a relational database, will either complete, producing correct and up-to-date results, or terminate, with no effect. In either case, the resulting condition of the database will be a consistent state.
Delivering ACID support requires processing overhead and can reduce data availability because subsequent transactions must wait until changed data is committed to the database. This is a fair tradeoff when dealing with business-critical data such as financial transactions. Banking data must be up-to-date and accurate at all times, as is the case with other types of critical production data such as certain types of healthcare, stock trading, insurance and other regulatory controlled data.
To be fair, even though rigid consistency is the default for RDBMS products, most allow database managers to control the level of consistency using parameters and code to adjust isolation and locking levels.
Potential relational DBMS shortcomings
As more types of applications are built that require different types and amounts of data, an RDBMS can struggle to support such requirements. Social media data, streaming audio and video, and information from the Internet of Things (IoT) comprise different data content that requires more flexibility than the traditional RDBMS can comfortably support.
Of course, there are additional areas of concern when evaluating whether to acquire an RDBMS. One of the biggest hurdles is the high cost of acquisition. The purchase price of an RDBMS ranges from several thousand dollars to more than $1 million, depending on the amount of data or size of the computer on which you intend to run the database.
Feature bloat is another issue with relational database offerings. Although their longevity brings robust, time-tested functionality, RDBMSes also come with built-in features that may not be needed. Software with feature bloat can be difficult to learn and support. These additional features can also bog down performance (as opposed to using a DBMS without all of those features that's designed and optimized for a specific use case).
A final weakness that can also be viewed as a strength -- depending on the particular use case -- is the rigid schema required by relational databases. All of the columns must be predefined with a specific data type and length before a table in a relational database can be used. This improves data integrity because only data with the proper type and length is permitted to be stored in the database. But newer, NoSQL database systems don't have such a requirement, allowing developers to adapt schemas based on changing data needs (while also potentially causing data integrity issues).
Application support issues
When evaluating a DBMS, it's important to consider how databases will be used to support your applications. Do you have many applications with traditional transactions or a lot of batch processing? Or are you developing Web 2.0 applications and supporting the IoT?
Most traditional data management and analysis applications, including online transaction processing, or OLTP; batch processing; mixed workloads; and business intelligence, are valid use cases for relational databases. Other specific criteria that indicate relational should be the preferred data model include when there are consistent data structures and definitions; when data integrity and accuracy must be immediate; and for processing traditional data types such as numbers, dates and alphanumeric values.
It's wise to default to the RDBMS as the standard DBMS type and veer from relational only for projects that aren't ideal for current relational capabilities, such as supporting Web 2.0 projects, streaming data and big data analytics with varying schema requirements.
Relational DBMS purchasing criteria
When purchasing any type of DBMS, there are several standard criteria you should consider during a request for proposal (RFP) and evaluation period.
First is the architecture of the DBMS and its suitability for the project(s). The architecture of the RDBMS is suitable for most data management needs, but can be problematic for projects requiring flexible schemas or complex inter-relationships between data elements.
Also consider the availability and robustness of database administration requirements and facilities. Backup and recovery, change management and performance management are among the most crucial administrative capabilities. RDBMS products offer good built-in administration functionality; there are also many add-on products available.
Deployment -- including installation requirements, prerequisite hardware and software, and virtualization capabilities -- should be considered, too. Most RDBMSes offer strong procedures and capabilities in each of these areas.
Also be sure to factor the availability of skilled personnel into your DBMS selection process. Gauge the availability and skill levels of both database administrators and application developers. Consider years of experience and technical certifications in judging skills. Relational skills are easier to hire for than other, nascent database technologies.
Perhaps one of the most important components of measuring the potential effectiveness of a DBMS for your requirements is performance and benchmarking. However, it's also notoriously difficult to gather appropriate performance details. Standard benchmarks are available from the Transaction Processing Performance Council, or TPC, but these quickly become dated and a benchmark is not usually a reliable indicator of the actual performance of an end-user implementation. Furthermore, some vendors (notably Oracle) contractually prohibit their customers from publishing performance details about their database applications. Therefore, your best course of action is to review the published benchmarks and ask the vendors to provide reference customers. Alternately, you could set up a trial of the RDBMS and develop mock-up applications to determine the performance of the DBMS on your data, but that can be very time-consuming.
Since a DBMS is a long-term investment, its scalability for managing growth in data, users or processes is another important consideration. Ensuring that a DBMS can manage growth typically involves the ability to partition data across nodes in a distributed system. Additional aspects of scalability include understanding how the product adapts to upgraded hardware and knowledge of its architectural limits. RDBMS products offer good scalability, but for very large amounts of data, alternative DBMS offerings may be more suitable.
And finally, be sure to review the fault tolerance of each DBMS under consideration. A DBMS should be able to withstand coding and logic errors without failing. Furthermore, a DBMS requires multiple components to deliver processing services. A fault-tolerant DBMS should be capable of continuing to operate, possibly at a reduced level, rather than failing completely, when one of its components (or a component it uses) fails.
Seven additional relational DBMS factors and trends to consider
Multiple editions. Many RDBMS products are available in multiple editions with varying functionality from edition to edition. For example, there may be a desktop, workgroup and enterprise edition of each RDBMS. After choosing the specific RDBMS, you must choose the edition that supports the features you need. The best approach is to work with the vendor to understand the capabilities of each edition.
Functionality based on platform or operating system (OS). Another consideration is that the same RDBMS version may have different functionality depending on the platform or OS on which it will be run. For example, the Windows version may lack features the Unix version supports, or the z/OS (mainframe) version may contain features not supported on other OSes. If your organization supports multiple operating environments, be sure you understand the features and limitations of the RDBMS you choose for each specific platform.
New capabilities and features. One of the more recent capabilities introduced is hybrid database engine support, or the ability to use multiple data structures and access techniques. For example, the core relational database engine may be able to be replaced -- or augmented -- by a column store or a NoSQL document store. Hybrid capabilities extend the functionality of the RDBMS to more use cases, making a stronger case for their ubiquity.
Open source options. Although the RDBMS market is driven by large commercial technology companies, there are also open source RDBMS options for companies wary of dealing with large vendors or for those that need a more cost-conscious approach. Some of the open source options provide full-functionality RDBMS offerings, whereas others offer scaled-down functionality. Sometimes core functionality can be missing from an open source option, such as referential integrity or check constraints. When choosing an open source RDBMS, be sure to perform the due diligence necessary to determine if the features (or lack thereof) match your specific needs.
In-memory data management. Processing data in system memory is another trend that has recently become more popular in DBMSes. With in-memory processing, data is stored and manipulated in memory instead of on spinning disk. This can significantly improve the performance of data access. Some RDBMSes are built from the ground up for in-memory processing, whereas others have adapted existing capabilities to operate in-memory. There are also in-memory DBMS options that aren't relational (this topic will be covered in more detail in a related article).
The cloud. As storing more and more data becomes a reality at many shops, database systems that store data in the cloud are becoming more pervasive. Many cloud database options are of the NoSQL variety, but many vendors offer relational cloud database capabilities and services, too. One option is to subscribe to a database as a service (DBaaS) offering, where a service provider hosts your data on its cloud database platform. DBaaS can be a lower-cost overall solution for small to medium-sized businesses looking to take advantage of enterprise database capabilities.
The database appliance. Another trend in the RDBMS market is the emergence of appliances, which combine hardware and software prepackaged and installed to function properly. The biggest benefit of the database appliance is its turnkey packaging: You buy it, plug it in and it works. Of course, that's true to varying degrees, depending on the specific database appliance you acquire.
The relational DBMS is the lynchpin of most existing IT systems and applications, and will continue to dominate the database landscape for years to come. Although the RDBMS ecosystem is dominated by some of the largest technology companies in IT, the market is far from simple. Understanding the relational world requires more than a high-level review of what IBM, Microsoft and Oracle have to offer.
But remember, even though the relational DBMS is the leading type of DBMS, it is not the only type worthy of consideration. It's important to examine both NoSQL and in-memory DBMSes as well.
About the author:
Craig S. Mullins is a data management strategist, researcher, consultant and author with more than 30 years of experience in all facets of database systems development. He is president and principal consultant of Mullins Consulting Inc. and publisher/editor of TheDatabaseSite.com. Email him at firstname.lastname@example.org.
This article was updated in September 2016.
Making the decision: Oracle RDBMS vs. NoSQL
When to use a DBMS or data warehouse