
Metanautix Quest follows Dremel path to virtual data marts

Metanautix Quest 2.0 is an upgraded query engine and data integration platform. Its designers helped create Google Dremel, a precursor for new big data query tools.

Big data analytics startup Metanautix Inc. has launched a 2.0 version of its Quest query engine, which enables users to build what the company describes as "software-defined data marts" that don't require data to be moved out of existing databases, data warehouses and Hadoop systems.

Quest 2.0 can be used to integrate and process data from a variety of sources, then run SQL-based queries against it and feed the results to Excel spreadsheets or visual analytics tools like Tableau. But Quest isn't a data store itself -- instead, it pulls together data from different systems virtually through a distributed join process, according to Metanautix.
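To make the idea concrete, here is a minimal sketch of a join performed at query time across two separate sources, with neither one copied into a common store. It does not show Quest's implementation or API, which the article does not describe; the table, the click log, the `clicks_by_region` function and all sample values are hypothetical, and a simple in-memory hash join stands in for a real distributed join.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a relational database holding customer records.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
warehouse.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "EMEA"), (2, "APAC"), (3, "EMEA")],
)

# Hypothetical source 2: a click log kept as delimited text, as in a file store.
CLICK_LOG = "customer_id,clicks\n1,10\n2,4\n1,7\n3,5\n"


def clicks_by_region(db, log_text):
    """Join the two sources at query time and total clicks per region."""
    # Build side: read only the join key and the needed column from the database.
    region_of = dict(db.execute("SELECT id, region FROM customers"))
    totals = {}
    # Probe side: stream the log rows and look each one up in the hash table.
    for row in csv.DictReader(io.StringIO(log_text)):
        region = region_of.get(int(row["customer_id"]))
        if region is not None:  # inner join: skip clicks with no matching customer
            totals[region] = totals.get(region, 0) + int(row["clicks"])
    return totals


result = clicks_by_region(warehouse, CLICK_LOG)
print(result)
```

Each source stays where it is, and only the joined, aggregated result is materialized. A production engine would spread the build and probe steps across many workers rather than run them in one process.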

The query engine runs on various cloud computing platforms and can also be deployed on-premises on VMware servers. The first version became generally available last September, and the company said customers have used the software for applications such as customer clustering analytics, scoring of customer profiles for planning multimedia marketing campaigns, and online ad optimization.

Primary influence: Google Dremel

As with some other recently released big data querying engines, Quest's design was influenced by Dremel, an ad hoc query tool developed by Google. Dremel was designed to run interactive queries against large data sets to complement MapReduce batch jobs; Google built it for in-house use, but the company described the tool in a widely circulated technical paper published in 2010 and also based its BigQuery cloud analytics service on Dremel. Facebook's Presto, Cloudera's Impala and Apache Drill all take some guidance from the Dremel technology as well.

But Metanautix differs from other Dremel-style tool developers in having Theo Vassilakis aboard as CEO. Vassilakis spent seven-plus years at Google, where he worked as principal engineer and engineering director of a data warehousing and analysis team, and led the Dremel development effort.

Dremel was first and foremost built for parallel processing speed, according to Vassilakis. The original motivation behind the project was to enable Google engineers running ad experiments to get click counts in seconds, even on multimillion-record logs. But by bringing SQL, the standard programming language for relational databases, more to the fore, Dremel also allowed more users to get into the querying act.

Big data power to more people

Similarly, Vassilakis and Quest's other designers are looking to open up big data analytics to larger numbers of users, with interfaces to business intelligence (BI) tools like Qlik and Tableau serving as a means to that end. Quest users can program directly in SQL if they know it, or they can use a BI tool to do drag-and-drop operations that kick off query jobs.

And with the virtual data mart capability, they can work with data from different repositories without it all having to be consolidated in a single system. Vassilakis sees big data environments involving the Hadoop ecosystem and other data management platforms having to evolve along the same lines as what he saw happen at Google. "Analysts need a data set, then one data set more, and they need it now," he said. "They can't wait for it to load into a data warehouse."

In an online synopsis of a research note on Metanautix, Tony Baer, an analyst at Ovum Ltd., wrote that the company "faces potentially strong competition" from established database and data integration vendors alike. But, he continued, it offers "the benefit of combining data integration with query and high-performance analytics in the same tool."

Jack Vaughan is SearchDataManagement's news and site editor. Follow us on Twitter: @sDataManagement.
