How to select the best DBMS software: A buyer's guide
A collection of articles that takes you from defining technology needs to purchasing options
Database management systems are vital components of the modern IT infrastructure. They support the storage, usage and manipulation of data for a wide range of enterprise applications. Operational database management systems are used to support the data persistence layer for modern applications that drive transactions and other types of interactions to support enterprise-level business requirements.
For decades, the market has been dominated by the big three relational database management system (RDBMS) vendors -- Oracle, IBM and Microsoft -- but market dynamics have caused an upheaval, allowing for greater competition.
Relational products still lead, with the big three RDBMS products accounting for the largest share of revenue and installed base in the market. But the leaders are being challenged by both new and established competitors, relational and nonrelational alike.
As organizations embrace digital transformation with mobile and cloud-capable applications, they encounter different types of data requirements that aren't optimally supported by the leading relational products. This has enabled NoSQL and in-memory database management systems (DBMSes) to gain market share. Additionally, it has caused the market leaders to enhance their offerings to support additional data models and engines.
Given the dizzying number of competing operational DBMS products (DB-Engines currently ranks more than 250), it can be confusing to match application requirements to an ideal operational DBMS. Here's an overview of the leading operational database management systems in the market to help enterprises get started.
Aerospike
Aerospike is an open source, in-memory NoSQL DBMS. It's a key-value data store designed to deliver high performance and rapid data access for real-time big data applications.
This NoSQL DBMS provides a simplified environment and setup for developers building and operating modern applications at scale, with minimal upfront administrative work. It can make sense to use Aerospike for caching data, storing session information or when personalizing user experiences on web portals and mobile applications.
Aerospike can be licensed as open source or commercially. The commercial edition adds more features, which are unavailable in the open source edition, as well as technical support. Aerospike runs on Linux.
Support is available for many different Linux distributions, including prebuilt binaries for Red Hat, Ubuntu, CentOS and Debian.
Amazon DynamoDB
The DynamoDB NoSQL cloud database as a service supports both document and key-value store models, providing flexibility for development of web, gaming, internet of things and many other types of applications. Amazon DynamoDB is designed to provide high performance at a large scale with low latency.
Amazon DynamoDB eliminates the need to handle tasks such as hardware and software provisioning, setup and configuration, software patches, upgrades, operating a distributed database cluster or partitioning data over multiple instances as it scales.
DynamoDB is a service; there's no notion of a database server or schema. Its core components are tables, items and attributes. The highest-level object in DynamoDB is a table, but not a traditional relational table; rather, it's a table that's composed of as many items as you want.
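The table, item and attribute model can be pictured with plain Python data structures. The following is a minimal sketch, not the DynamoDB API: the `Table` class, its method names, and the `product_id` key are hypothetical and exist only to show that items in one table need not share the same attributes.

```python
class Table:
    """Toy stand-in for a schemaless table: a set of items identified by one key attribute."""

    def __init__(self, name, key_attribute):
        self.name = name
        self.key_attribute = key_attribute
        self._items = {}

    def put_item(self, item):
        # Only the key attribute is required; every other attribute is free-form.
        if self.key_attribute not in item:
            raise ValueError(f"item is missing key attribute {self.key_attribute!r}")
        self._items[item[self.key_attribute]] = dict(item)

    def get_item(self, key):
        # Returns None when no item has this key.
        return self._items.get(key)


products = Table("Products", key_attribute="product_id")
# Two items in the same table with different attribute sets.
products.put_item({"product_id": "p-100", "title": "Desk lamp", "watts": 40})
products.put_item({"product_id": "p-200", "title": "Novel", "pages": 320})
```

Note that nothing in the sketch declares columns up front; the only structure the table enforces is the presence of the key attribute.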
All the data is stored on solid-state drives and is replicated three ways across different availability zones, thereby delivering redundancy and fault tolerance.
Apache Cassandra
Apache Cassandra is an open source, distributed key-value NoSQL DBMS. It was originally developed at Facebook, and was later released as an open source project. Additionally, a free packaged distribution of Apache Cassandra -- DataStax Community Edition -- and a commercial edition are available from DataStax.
Apache Cassandra was created for online applications that require fast performance and no downtime. It works best when most, if not all, access is to look up data based on a primary key value. It was built to handle very large amounts of data spread out across commodity servers and to deliver high availability without a single point of failure.
Available for Linux, Windows and Mac OS X operating systems, Apache Cassandra is open source and free to download.
EDB Postgres Platform
PostgreSQL, an open source RDBMS, serves as the foundation for the EnterpriseDB (EDB) Postgres Platform. PostgreSQL is made available under the terms of the PostgreSQL License. The EDB Postgres Platform from EnterpriseDB is offered as a subscription service for production and nonproduction systems.
The platform is open source-based and brings together multiple components for managing structured and unstructured data in a federated model. It comprises a DBMS, three fully integrated tool suites, a range of deployment options, and support and services.
Oracle users looking for a less costly option can take advantage of the EDB Postgres Platform's Database Compatibility Technology for Oracle, which allows users to switch existing applications from Oracle to run on EDB Postgres, with minimal to no changes required.
IBM DB2
IBM DB2 is a relational DBMS with strong availability and performance capabilities. Along with its relational/SQL core, DB2 also boasts integrated support for a number of NoSQL capabilities, including XML, graph store and JavaScript Object Notation, or JSON.
Used by organizations of all sizes, DB2 provides a data platform for both transactional and analytical operations, as well as continuous availability of data to keep transactional workflows and analytics operating efficiently.
DB2 is available for Linux, Unix and Windows (LUW) workstations, on IBM iSeries midrange computers and on IBM mainframes running z/OS. It's the only leading operational DBMS with native mainframe support.
DB2 boasts strong hybrid transaction/analytical processing capabilities with BLU acceleration column store support on LUW platforms, as well as tight integration with the IBM DB2 Analytics Accelerator on the mainframe. As such, it is well-suited for organizations that need to use a single DBMS to run mixed transactional and analytical workloads.
Pricing for DB2 LUW is based on the processor value unit (PVU), which is the unit of measure that IBM uses to license its software. IBM applies a PVU count to each core of a processor, and the pricing is based on the total number of PVUs made available to DB2. DB2 for z/OS is licensed using the IBM Monthly License Charge model.
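As a rough illustration of the PVU arithmetic, the sketch below multiplies core count by a per-core PVU rating and then by a price per PVU. All of the numbers are hypothetical placeholders, not IBM's actual ratings or prices, and the function names are invented for this example.

```python
def total_pvus(cores, pvus_per_core):
    """Total PVUs made available to the software: one rating applied to every core."""
    return cores * pvus_per_core


def license_cost(cores, pvus_per_core, price_per_pvu):
    """License cost under a simple per-PVU price (hypothetical pricing model)."""
    return total_pvus(cores, pvus_per_core) * price_per_pvu


# Hypothetical server: 2 processors with 8 cores each, rated at 70 PVUs per core.
cores = 2 * 8
pvus = total_pvus(cores, pvus_per_core=70)

# Hypothetical price of 10 currency units per PVU.
cost = license_cost(cores, pvus_per_core=70, price_per_pvu=10)
```

With these placeholder figures, the server exposes 1,120 PVUs. The point of the exercise is that adding cores, or moving to a processor with a higher per-core rating, raises the license cost even if the workload is unchanged.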
MarkLogic Server
The MarkLogic Server NoSQL DBMS is designed to make heterogeneous data integration easier and faster using a combination of enterprise features.
MarkLogic Server is a document-based DBMS that can perform a complex search and query across multiple types of data, including documents, relationships and metadata. It can handle data such as JSON, XML and Resource Description Framework natively, and offers enterprise features such as ACID (atomicity, consistency, isolation and durability) transactions, automated failover and security.
MarkLogic Server's Search & Query capability lets users search through billions of text documents with subsecond response times. Its search indexes cover the metadata and relationship data inside documents, and they can be used to set up automatic alerting.
There are multiple editions of MarkLogic Server for varying levels of enterprise functionality and support, as well as a free edition for developers.
Microsoft SQL Server
Microsoft SQL Server 2016 is a relational DBMS for Windows platforms that can be used for building, deploying and managing applications located on premises or in the cloud.
Microsoft SQL Server 2016 provides strong analytics, in-memory processing and security capabilities.
Organizations looking to extend their relational databases to the cloud can benefit from the Stretch Database feature of SQL Server, which can be used to store some data on premises and to send infrequently used data to Microsoft's Azure Cloud. Applications using the database can access all of the data regardless of where it's stored.
Although Microsoft SQL Server is well-known for its strong support for, and integration with, the Microsoft Windows operating system, Microsoft announced SQL Server on Linux in 2016, enabling the DBMS to compete on non-Windows platforms.
SQL Server 2016 can be licensed based on the number of users and devices that access SQL Server, or per core, which offers a more precise and consistent measure of computing power, regardless of whether SQL Server is deployed on physical or virtual servers, on premises or in the cloud.
MongoDB
MongoDB is an open source, document store NoSQL DBMS designed for running modern applications that rely on structured and unstructured data with a flexible schema and rapidly changing data requirements. MongoDB is designed to make it easier for organizations to develop and run applications that address performance, availability and scalability, and to support a variety of data types.
MongoDB's document data model lets developers easily store and combine data of any structure, without sacrificing data access or indexing functionality. This enables database administrators to dynamically modify the schema with no downtime.
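To make the flexible-schema idea concrete, here is a minimal sketch that uses plain Python dictionaries in place of a document database. It does not use MongoDB or its drivers; the `customers` list and the `find` helper are hypothetical stand-ins for a collection and a simple equality query.

```python
# Two documents in the same "collection" with different shapes.
customers = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Lin", "phones": ["555-0100", "555-0101"], "address": {"city": "Oslo"}},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# A new attribute is added to one document; the other document is untouched
# and no table-wide schema change is needed.
customers[0]["loyalty_tier"] = "gold"

gold_customers = find(customers, loyalty_tier="gold")
```

Documents that lack the queried field simply do not match, which is why a new attribute can be introduced gradually rather than in a single migration.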
MongoDB is licensed both as open source, under the GNU Affero General Public License, and as a commercial offering. The commercial edition, MongoDB Enterprise Server, is available as part of the MongoDB Enterprise Advanced subscription, which adds advanced security, administrative features, support and on-demand training not available in the open source edition.
MySQL
MySQL is a popular open source RDBMS known for its ability to support web-based and online publishing applications, in addition to a wide range of other applications.
Although MySQL doesn't have the same range and span of features and functionality as the big three RDBMS offerings (Oracle Database, Microsoft SQL Server and IBM DB2), it generally costs less and is easier to deploy.
MySQL runs on most Linux, Unix and Windows platforms. It is the M in the open source LAMP stack (Linux, Apache, MySQL and PHP, Perl or Python).
MySQL is owned by Oracle. Developers can use MySQL under the GNU General Public License; organizations that want to embed or distribute MySQL without complying with the GPL's terms can buy a commercial license from Oracle.
Neo4j
Neo4j is a native graph database system that provides valuable insight based on data relationships built into the fabric of the product, including the data model, query language and storage engine. Much like how RDBMSes are founded on a mathematical basis (set theory), graph database systems are built on the mathematical foundation of graph theory.
The Neo4j graph DBMS delivers high performance and availability, with its native graph capabilities for data storage and access.
It's well-suited for applications and data where the relationship between the data elements is as important as the data itself. Example use cases include social media connections, delivery routing and dispatching, public transportation links, curriculum prerequisites, network topologies and recommendation engines, such as those used by online retail sites.
Neo4j physically stores data and its connections as relationships. The database engine relates data by following pointers from one data point to a related data point, which is faster than processing relational joins or hand-writing joins in other NoSQL databases.
Data relationships are stored and processed as they occur, providing quick responsiveness and flexibility when making database changes, which supports Agile development.
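The idea of relating data by following pointers can be sketched in a few lines of Python. This is an illustration of the general technique only, not Neo4j's storage engine or its query language; the `Node` class and the `two_hops` function are hypothetical names.

```python
class Node:
    """A data point that holds direct references to the nodes it is related to."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []

    def connect(self, other):
        self.neighbors.append(other)


def two_hops(start):
    """Names reachable in exactly two hops, excluding the start node and its direct neighbors."""
    direct = {node.name for node in start.neighbors}
    found = set()
    for first in start.neighbors:
        for second in first.neighbors:  # follow the stored reference; no lookup table is scanned
            if second is not start and second.name not in direct:
                found.add(second.name)
    return found


alice, bob, carol, dave = (Node(n) for n in ("alice", "bob", "carol", "dave"))
alice.connect(bob)
alice.connect(dave)
bob.connect(carol)
bob.connect(dave)

suggestions = two_hops(alice)
```

Each hop costs only as much as the number of relationships on the node being visited, whereas a join-based approach would have to match keys against a relationship table at every step.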
Oracle Database 12c
The overall market leader, Oracle Database 12c, is an RDBMS designed for both on-premises and cloud uses. It can be deployed on a choice of clustered or single servers. A comprehensive feature set enables Oracle Database 12c to be used to support multiple types of applications, including transaction processing, business intelligence and content management applications.
Oracle's multi-tenant architecture simplifies the process of consolidating databases in the cloud, enabling customers to manage many databases as one, without changing their applications. A single multi-tenant container can host and manage hundreds of pluggable databases to dramatically reduce costs and to simplify administration.
Oracle Database 12c provides a Database In-Memory Column Store, which can be used to boost the performance of database queries. Existing applications can automatically and transparently take advantage of in-memory processing without needing any changes or losing any existing Oracle Database capabilities.
Oracle Database 12c is available primarily in three editions. Perpetual and term licenses are based on the number of named users and devices that will have access to the software, or the number of processors on which the database will run.
Redis
Redis is a lightweight, flexible, open source key-value store DBMS. It provides a highly scalable data store that can be shared by multiple processes, applications or servers.
A key-value database is ideal when almost all of the access to data is requested using a key, such as when looking up product details by a product number. The details can be any type of information -- and can even vary from product to product.
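The access pattern described here can be sketched with an in-memory dictionary that treats each value as an opaque blob. This is not Redis or its client API; the `put` and `get` helpers and the `product:<number>` key format are assumptions made for illustration.

```python
import json

# The store knows only keys and opaque string values.
store = {}


def put(key, value):
    """Serialize the value and file it under the key."""
    store[key] = json.dumps(value)


def get(key):
    """Fetch by key and deserialize; returns None if the key is absent."""
    raw = store.get(key)
    return None if raw is None else json.loads(raw)


# The details differ from product to product; the store does not care.
put("product:1001", {"name": "Kettle", "volts": 230})
put("product:1002", {"name": "Paperback", "pages": 288, "language": "en"})

kettle = get("product:1001")
```

Because the store never looks inside a value, every read depends on already knowing the key, which is why this model fits workloads dominated by key-based lookups.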
Redis is frequently used for applications with high-availability and low-latency requirements, such as gaming, retail and mobile. The schema flexibility of key-value databases such as Redis makes them well-suited for session management, serving ad content and managing user or product profiles.
Redis is available as open source via a BSD license or as an enterprise edition, offered as both Redis Labs Enterprise Cluster and Redis Cloud.
Riak
Riak from Basho Technologies is a fault-tolerant, highly available, scalable, distributed, multimodel NoSQL DBMS. It enables application developers to store, manage and secure unstructured data. The DBMS is designed to enable storage of and access to various types of unstructured data that require continuous availability.
Riak is a good choice for supporting highly scalable applications that access large amounts of unstructured data, and that require around-the-clock availability. It's designed to support fast development and ease of operations. The key-value store enables storage and access of various types of unstructured data at a massive scale with high availability.
However, Riak is more accurately referred to as a multimodel platform, supporting key-value, object store and search, all from the same platform.
The Riak DBMS can be deployed to multiple servers, and it provides continuous functionality in the presence of hardware and network failures. Riak is available in three versions: open source, supported enterprise and cloud storage.
SAP HANA
SAP HANA is a column-oriented, in-memory RDBMS. HANA is architected to enable applications to support both transactional and analytical processing on a single system with one copy of the data. It's designed to handle high transaction rates and complex queries.
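To show what column-oriented means in practice, the sketch below pivots a few records from a row layout into a column layout and runs one analytical and one transactional operation against the same copy of the data. It is a conceptual illustration in plain Python, not a description of how SAP HANA is implemented; the sample orders and function names are hypothetical.

```python
# Row layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120},
    {"order_id": 2, "region": "US", "amount": 80},
    {"order_id": 3, "region": "EU", "amount": 50},
]


def to_columns(records):
    """Pivot records into one list per column; position i in every list belongs to record i."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def fetch_row(columns, index):
    """Transactional-style lookup: reassemble a single record from the column lists."""
    return {name: values[index] for name, values in columns.items()}


columns = to_columns(rows)

# Analytical-style query: touches only the two columns it needs.
eu_total = sum(
    amount
    for region, amount in zip(columns["region"], columns["amount"])
    if region == "EU"
)

second_order = fetch_row(columns, 1)
```

The aggregate never reads `order_id`, which hints at why columnar layouts suit analytical queries, while `fetch_row` shows that individual records remain retrievable from the same single copy.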
Running on SUSE Linux and Red Hat Enterprise Linux, SAP HANA enables real-time analytics on transactional systems at a large scale and on a variety of data, including structured, unstructured, spatial, time series and streaming data.
It provides features that support development for SAP and custom-built applications. SAP HANA combines database, advanced analytics, enterprise information management and application server capabilities, all running in-memory, on one data copy and on a single platform.
SAP HANA supports multi-tenancy and data tiering, which enables petabyte-scale deployments by storing warm data (data that's less frequently accessed) on disk. It also offers a choice of deployment models and partners.
The DBMS can be deployed on premises, in the cloud or as a hybrid of both.
Which operational DBMS is right for you?
The products covered in this article all provide core operational DBMS capabilities. Varying application and enterprise requirements will make certain categories of DBMSes and individual products stronger or weaker fits.
Be sure to consider and understand the storage and access mechanisms, the data consistency capabilities (ACID vs. BASE) and the total cost of ownership of the DBMS over its usage lifetime as you evaluate your operational DBMS choices.
Also, be aware that many organizations use multiple operational database management systems, not just one, to fit disparate application requirements (also known as polyglot persistence).
In the end, it's up to each organization to review the products closely and determine which best meets its needs. These overviews provide a good starting point for determining which of the operational database management systems are the best fit for your company's project requirements.