While some rival vendors work toward an all-encompassing multimodel database, AWS is fielding different databases for specific processing tasks. At AWS re:Invent 2018, the cloud giant rolled out two more: Amazon Timestream for time-series data and a transaction ledger database for blockchain-style applications.
These latest AWS database moves arose amid a surge of updates to Aurora, DynamoDB, Redshift and other existing products in its database software arsenal.
Amazon Timestream is designed to handle fast-arriving sequences of data, such as log data and information from IoT sensors and digital controllers that capture telemetry data in manufacturing plants and other industrial environments. Meanwhile, the Amazon Quantum Ledger Database, or QLDB, offers a database alternative to blockchain platforms for users that want a centralized ledger to store business transactions in what AWS called an "immutable" way.
The new technologies expand Amazon's strategy of investing in a set of purpose-built database management system (DBMS) platforms to handle a variety of application needs for AWS users.
In a re:Invent keynote that was streamed live, AWS CEO Andy Jassy said developers -- or builders, as AWS calls them -- should be able to use "the right tool for the job." Using only one type of database makes no more sense than using a hammer to work on every task in every room of a house that's being remodeled, he added.
An industry-wide drive toward microservices-based architectures further compels the move to offer a broad line of databases, Jassy asserted. "Builders are not building in the monolithic ways of the past, but are instead using microservices," he said.
A different database strategy
The approach Jassy described contrasts with the efforts of other vendors that are out to take away some of the cloud database market share AWS has amassed. For example, AWS archnemesis Oracle includes time-series, geospatial and graph database functions as part of its flagship relational database.
For Microsoft, the effort has been toward an omni-functional approach centered on adding graph and geospatial features to Azure Cosmos DB, which began life as a document-oriented database known as DocumentDB and now is in the multimodel database category.
Gartner analyst Merv Adrian said the strategy behind the release of Amazon Timestream and QLDB reflects a broader trend: diversifying database support beyond the common relational model.
"AWS, like other cloud service providers, has assembled a portfolio of DBMS types, both relational and nonrelational," he said. "Now that has come to include specialty types for time-series and ledger uses."
However, enterprises will likely need time to fully absorb the proliferation of database types, according to Adrian. "Customer strategists and architects now want to know what to use when, and the answer is not simple," he said.
Sequels to NoSQLs
Databases have had a central place in Amazon's march to domination in cloud computing.
At re:Invent, held in Las Vegas, Amazon CTO Werner Vogels traced the story back to Amazon DynamoDB, the key-value data store that the company first built for internal use in 2006 and then launched as a cloud service in 2012. Vogels argued that DynamoDB triggered the movement toward NoSQL database software that has increasingly become a cloud architecture staple.
Both Vogels and Jassy asserted that simple cloud-based key-value stores can take on a growing role in handling operational data, a job that once fell solely to the relational database.
To buttress those assertions, AWS also unveiled DynamoDB Transactions, which supports full ACID guarantees -- atomicity, consistency, isolation and durability -- for updates that span multiple items and multiple database tables. An associated DynamoDB On-Demand service is said to automatically manage read/write capacity and help database administrators plan for usage spikes with DynamoDB.
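As a rough illustration of what a multi-item, multi-table transaction can look like, the sketch below builds the request payload for DynamoDB's TransactWriteItems operation. The table names (Inventory, Orders), the attribute names and the helper function are hypothetical and chosen only for this example. The code makes no call to AWS; it just shows the shape of a request in which either both writes succeed or neither does.

```python
def build_order_transaction(order_id, sku, quantity):
    """Return a TransactItems list for one all-or-nothing write across two tables.

    Table and attribute names are hypothetical, for illustration only.
    """
    return [
        {
            # Decrement stock, but only if enough stock remains.
            "Update": {
                "TableName": "Inventory",
                "Key": {"sku": {"S": sku}},
                "UpdateExpression": "SET stock = stock - :qty",
                "ConditionExpression": "stock >= :qty",
                "ExpressionAttributeValues": {":qty": {"N": str(quantity)}},
            }
        },
        {
            # Record the order, but only if this order ID is not already present.
            "Put": {
                "TableName": "Orders",
                "Item": {
                    "order_id": {"S": order_id},
                    "sku": {"S": sku},
                    "quantity": {"N": str(quantity)},
                },
                "ConditionExpression": "attribute_not_exists(order_id)",
            }
        },
    ]


items = build_order_transaction("order-1001", "widget-7", 3)
# With boto3 installed and AWS credentials configured, the payload could be sent as:
#   boto3.client("dynamodb").transact_write_items(TransactItems=items)
```

If either condition fails, the service rejects the whole request, so the inventory count and the order record cannot drift apart.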
In addition, AWS introduced Amazon Redshift Concurrency Scaling, which is intended to improve query performance against the company's cloud-based data warehouse. AWS also detailed a new data API, revealed earlier this month, for its Amazon Aurora relational database and added an Aurora Global Database feature that enables databases to span multiple AWS regions in the MySQL-compatible edition of the software.
Time-series data in the spotlight
Vogels and Jassy gave special attention to Amazon Timestream, which is now entering a limited preview release. They described it as a nonrelational, fully managed service built specifically to collect, store and process time-series data.
Such data consists of a sequence of data points that are arrayed in consecutive order and tend to be continually updated, or appended, in small chunks. Specialized software for handling time-series data in relational systems has been around for many years, with applications focused on industries such as oil and gas, finance and power plant management. In recent years, adventurous developers have also begun to use NoSQL and Hadoop systems for bespoke time-series data processing.
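To make the append-heavy shape of time-series data concrete, here is a minimal Python sketch that is not tied to Timestream or any other product. It assumes readings are (timestamp, value) pairs with integer timestamps in seconds. The function names are invented for this example.

```python
from collections import defaultdict


def append_reading(series, timestamp, value):
    """Append one data point, keeping the series in time order."""
    if series and timestamp < series[-1][0]:
        raise ValueError("readings must arrive in time order")
    series.append((timestamp, value))


def average_by_window(series, window_seconds):
    """Group points into fixed-width time windows and average each window."""
    buckets = defaultdict(list)
    for timestamp, value in series:
        window_start = timestamp - timestamp % window_seconds
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


sensor = []
append_reading(sensor, 0, 10.0)
append_reading(sensor, 30, 20.0)
append_reading(sensor, 60, 30.0)
print(average_by_window(sensor, 60))  # {0: 15.0, 60: 30.0}
```

Points are only ever added at the end, and queries typically roll them up by time window. Purpose-built time-series systems are designed around that access pattern.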
The arrival of masses of IoT data is expected to push time-series technology into wider use, according to AWS and other proponents.
"All data is fundamentally time-series data -- it has a time stamp. But some systems are only beginning to track it," Ajay Kulkarni, co-founder and CEO of Timescale Inc., said in a phone interview after the rollout of Amazon Timestream. Based in New York, Timescale is the maker of TimescaleDB, an open source time-series database that's built on the PostgreSQL engine and natively supports SQL.
"If you don't track it as time-series data, you're throwing out valuable information," Kulkarni added. "Only by tracking it in its raw time-series format can you really understand what this data tells you."
Room for more in the market?
Much as rival vendors did after the Amazon Neptune graph database was introduced at re:Invent 2017, Kulkarni said he sees the potential for Timestream to widen the market for time-series software in general.
Besides Timescale and now AWS, the time-series market is populated by vendors of a number of different database types. Arguably, it could include such vendors as DigitalOcean, Imply Data, InfluxData and Paradigm4, among others. Oracle, Microsoft and IBM also all offer support for time-series operations as part of their relational databases.
Taken separately or in multimodel packages, the database cornucopia is getting more complicated, particularly as data increasingly goes to the cloud. Events like re:Invent give vendors opportunities to clarify their strategies, but the onrushing complexity makes that difficult, according to Gartner's Adrian.
"It will be a formidable challenge to explain this cogently and to help customers make the choices that are cost-effective, reliable and usable with the rest of their data fabric," he said.