While competitors Cloudera Inc. and Hortonworks Inc. have just added products that work outside the conventional Hadoop platform, independent Hadoop distribution provider MapR Technologies Inc. is taking a different tack with its latest efforts.
The company is enhancing its MapR in-Hadoop NoSQL database by bringing native JSON format support inside its Hadoop distribution. That effectively brings a NoSQL document database closer to the company's versions of the Spark analytical engine and Hadoop.
Document databases are finding increasing use as operational stores attached to a variety of applications, according to Jack Norris, chief marketing officer at MapR, based in San Jose, Calif. "Now that we support native JSON, you can run applications side by side, whether they are key-value stores or stores for JSON documents," Norris said.
For MapR, Norris said, the benefits of running JSON natively in its HBase NoSQL database are less data copying and less data movement between Hadoop and the NoSQL database.
The first beneficiaries are "MapR customers that want to do everything they can in the MapR DB," said Doug Henschen, vice president and principal analyst at San Francisco-based Constellation Research Inc. Clearly, "everything" includes JSON, which continues to gain as a data format for Web and Internet of Things data processing.
Teradata Database goes to Amazon cloud
One of data warehousing's traditional leaders is taking its Teradata Database to Amazon Web Services. Teradata will run on multi-terabyte Amazon Elastic Compute Cloud (EC2) instances.
For Teradata Corp., which has long provided the Teradata Database on premises -- and which has more recently offered the database on its own cloud service -- the move is an admission that more and more data is showing up on Amazon's cloud.
"We believe in data gravity. This is about moving analytics to where the data is," said Chris Twogood, vice president of product and services marketing for Teradata, based in Dayton, Ohio. The effect of gravity on data, he explained, is that data can be very expensive and difficult to move around, so users want to avoid moving it whenever possible. So, while Teradata can support hybrid on-cloud/on-premises deployments, it also sees the need to be on the leading cloud -- Amazon.
"Most customers will opt for a hybrid cloud architecture," he said. "But some people will do a full data warehouse in the cloud."
Teradata on Amazon Web Services shows promise, despite headway Amazon has made with its own Redshift data warehouse on the cloud, according to Tony Cosentino, vice president and research director at Ventana Research in Bend, Ore.
"The Teradata Database on Amazon is a path that provides a way [for businesses] to move forward. They are coming in with a much more mature platform than Redshift," he said. Still, he added, "as you move to cloud, you cannot not talk about Redshift."
Cosentino said the big challenge for data in the cloud revolves around integration, and much of that integration challenge centers on connecting cloud and on-premises analytics.