
MapR-DB uses NoSQL to drive real-time analytics

The Strata conference in New York saw big data platform vendor MapR Technologies update MapR-DB, its NoSQL database engine, to better perform in real-time analytics applications.

The NoSQL engine sometimes takes a back seat to Spark and Hadoop analytics in discussions of big data. But independent Hadoop distribution provider MapR Technologies is placing its MapR-DB NoSQL database at the center of updates to its multimodel Converged Data Platform, as disclosed this week at the Strata Data Conference in New York.

New in MapR-DB are solid-state-drive-optimized secondary indexes that natively handle several data types. Indexing is also enhanced for in-place integration with Drill, the interactive SQL query engine that MapR offers for querying Hadoop and NoSQL stores.

These and other new traits -- an updated API supporting JSON grammar and NoSQL integration with a cross-data-center change data capture scheme -- are part of MapR's flagship platform update. They are intended to meet the needs of users who are scaling out varied data stores to handle growing data volumes. They are also meant to speed processing and to move analytics into an organization's real-time workflow.

Xactly speeding querying

For Xactly, a sales performance management company in San Jose, Calif., with software running on the cloud and in data centers, the enhancements to MapR-DB help to move analytics closer to real time and deeper into operational workflow.

At Xactly, the MapR-DB engine processes JSON data that, together with Spark and other Hadoop-oriented components, creates views for analytics of salespeople's activity, according to Ron Rasmussen, CTO, product officer and senior vice president of engineering.

"As fast as MapR-DB is, we'd like to see it go faster," he said, adding that his early view of the secondary indexing now part of MapR-DB shows a significant speed increase of querying capabilities. That is important to developers building interactive querying support into Xactly's services.

A NoSQL engine like MapR-DB is oriented toward handling streaming data and processing in parallel, so it is useful within an enterprise, according to Mike Matchett, an analyst and consultant at Taneja Group in Hopkinton, Mass.

"The point is to get the streaming data in the hands of people that can use it," he said. He also marked features like MapR-DB's new secondary indexing as key in many applications. "Indexing, and speed of indexing, is very important," he said.

Matchett pegged Apache Cassandra and Couchbase among competitors to MapR-DB.

Big data maturation

"What is good about the MapR platform is its scalability and reliability," Rasmussen said. He noted Xactly presently uses MapR-DB as a distributed data store in a 15-node cluster configuration.

MapR's Jack Norris, senior vice president for data and applications, said the use of NoSQL databases in big data workflows is part of the general speed-up of business activity in the age of the web. Artificial intelligence techniques such as machine learning for prediction are part of this, too, he said.


"People are interested in the ability to do in-place machine learning, handling real-time data ingestions and outflows, and getting answers in seconds," he said.

For Xactly's Rasmussen, NoSQL improvements and other steps represent greater maturation of big data technologies, which continue to add features across a spectrum of data types and data-processing approaches. The maturation is on both the vendor and the user side, he indicated. "Three years ago, we had challenges. It all was new," he said. "Now, we have three-plus years' experience as a team. Our people are trained."
