Hadoop is big with big data vendors -- and some users, particularly Internet companies looking to collect and store large amounts of Web data. But in a video interview recorded at the 2014 TDWI BI Executive Summit in Boston, consultant Rick Sherman said organizations should make sure they have a real business need for the open source distributed processing framework before launching Hadoop projects. "Just don't do it to do it -- do it because there's a reason," said Sherman, who is founder of Athena IT Solutions in Maynard, Mass.
Whether a Hadoop cluster deployment is warranted depends partly on the kind of data a company has to contend with, according to Sherman. He said Hadoop is best suited to a variety of unstructured data that mainstream relational databases typically aren't adept at handling -- for example, social media and sensor data. Even then, proponents of the technology should justify an investment in a Hadoop implementation up front, Sherman said: "Think about the business case, [and] get the business committed to it."
And because Hadoop is still maturing, with the Hadoop 2 release having become available only in late 2013, companies that lack a real need now could benefit from deploying it later rather than sooner. "If you don't have a compelling [business] case now, and you're going to wait for a while, that might not be a bad thing," Sherman said. "Let other people test out the wheels."
In the interview, Sherman also said he isn't seeing many Hadoop clusters being deployed to replace existing relational databases and data warehouses. More common, he added, is a commingling of unstructured Hadoop data and structured information from a data warehouse -- mixing text data showing customer sentiment with purchase-history transaction records as part of a customer analytics initiative, for example. Such capabilities are "expanding what we can do" in business intelligence and analytics, he said.
Watch the five-minute video to hear more of what Sherman had to say about the merits of Hadoop projects and issues to consider before committing to the deployment of a Hadoop platform.