SQL-on-Hadoop

SQL-on-Hadoop is a class of analytical application tools that combine established SQL-style querying with newer Hadoop data framework elements.

By supporting familiar SQL queries, SQL-on-Hadoop lets a wider group of enterprise developers and business analysts work with Hadoop on commodity computing clusters. Because SQL was originally developed for relational databases, it has to be adapted either for the Hadoop 1 model, which relies on the Hadoop Distributed File System (HDFS) and MapReduce, or for the Hadoop 2 model, which can work without either HDFS or MapReduce.

The approaches to executing SQL in Hadoop environments fall into three groups: (1) connectors that translate SQL into a MapReduce format; (2) "push down" systems that forgo batch-oriented MapReduce and execute SQL within Hadoop clusters; and (3) systems that apportion SQL work between MapReduce-HDFS clusters and raw HDFS clusters, depending on the workload.
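As a rough illustration of the first group, the sketch below shows how a simple aggregate query such as SELECT region, SUM(amount) FROM sales GROUP BY region could be expressed as map, shuffle, and reduce steps. It is a toy, single-process model in plain Python, not how any particular connector is implemented. The sample rows, the column names (region, amount), and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for records stored in a Hadoop cluster.
ROWS = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
]


def map_phase(rows):
    """Emit (key, value) pairs; the GROUP BY column becomes the key."""
    for row in rows:
        yield row["region"], row["amount"]


def shuffle(pairs):
    """Collect all values that share a key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Apply the aggregate function (SUM here) to each key's values."""
    return {key: sum(values) for key, values in grouped.items()}


def run_query(rows):
    """Chain the three steps to answer the GROUP BY query."""
    return reduce_phase(shuffle(map_phase(rows)))


result = run_query(ROWS)
```

In a real cluster the map and reduce steps run in parallel across many nodes and the shuffle moves data over the network, which is part of why batch-oriented MapReduce adds latency that the "push down" systems try to avoid.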

One of the earliest efforts to combine SQL and Hadoop resulted in the Hive data warehouse, which introduced HiveQL, a SQL-like query language whose statements are translated into MapReduce jobs. Other tools that help support SQL-on-Hadoop include BigSQL, Drill, Hadapt, Hawq, H-SQL, Impala, JethroData, Polybase, Presto, Shark (Hive on Spark), Spark, Splice Machine, Stinger, and Tez (Hive on Tez).

This was first published in July 2014
