The Talking Data Podcast for this week finds reporter Jack Vaughan newly returned from the Hadoop Summit 2014 in San Jose, California, where a major theme was SQL-on-Hadoop. Vaughan tells podcast moderator Ed Burns that this is one of the more important trends in big data today, as SQL support could well decide Hadoop's ultimate role in an enterprise.
Technologies like Hadoop arise and capture technologists' imaginations with their newness and exciting characteristics. Then technologists "set out to add all the unexciting characteristics of the technology they are replacing," according to Vaughan, who listed some of the means used lately to bring interactive SQL-style queries to the Hadoop data platform, including vendor-provided HDFS/HBase tools, self-created SQL queries via Hive, and Hadoop distribution-specific SQL tools. Once posed as an alternative to SQL, Hadoop is now seen in many quarters as a supplement to it.
The move to SQL support is not surprising given the query language's ubiquity in organizations. The army of SQL programmers is vast, as is the array of business intelligence tools that allow legions of business analysts to produce reports and run what-if scenarios on corporate data. But vendors come at the SQL-Hadoop integration problem from different perspectives, making the data manager's selection task daunting.
Keynote speaker Tom Davenport also addressed the crowd at the Hadoop Summit. Davenport, who is a long-time industry observer and author of the recently released book Big Data @ Work, said the Hadoop that arose in high-profile young Web companies is now finding use in more traditional enterprises, especially ones that are what he described as "data driven."
Bank of America, WellPoint, Staples and other such players are deep in the Hadoop trenches, Davenport and others report. But, as the SQL-on-Hadoop movement suggests, they are looking to work with Hadoop using familiar SQL tools.