More and more mainstream enterprises are starting to employ Hadoop and related technologies, but many struggle to navigate the transition to an intricate big data architecture -- and then to make it work to their advantage.
That's according to Merv Adrian, a data management analyst at Gartner. One of Adrian's research focuses is the adoption of open source big data platforms and its impact on the companies deploying them. The complexity of standing up a Hadoop environment causes problems for many of those users, he said.
In a video Q&A recorded in July 2016 at the Pacific Northwest BI Summit in Grants Pass, Ore., Adrian explained that the "artisanal nature" of Hadoop big data systems makes designing and implementing them a challenge for organizations -- particularly ones without prior experience in big data management. A big data architecture built around Hadoop must be tailored to an organization's specific needs, he said -- but doing so is a granular process that can take a lot of time, effort and skill.
"Hadoop is not a thing, it's a set of things," Adrian said. "When a company sits down to create this particular recipe, they have to decide what the ingredients are going to be, they have to select the various layers -- and even once they know that, there may be alternatives [to consider]."
One of the biggest stumbling blocks businesses run into when beginning work on a Hadoop environment is not having an end goal in mind, he added. While Hadoop clusters can open up big data analytics opportunities with the potential to pay big business dividends, Adrian warned that plunging ahead without a purposeful deployment plan is more likely to result in a convoluted process than in those dividends.
The best way to avoid that problem is to set concrete goals and work backward from them, he said. "Look at the [intended] outcome and decide, 'What are the pieces that I'm going to need to get to it?' A lot of people begin with, 'I've chosen a platform and I'll go from there,' but that's not really the answer."
Watch the video to hear more on the changing climate around Hadoop adoption, the DIY nature of building a Hadoop environment, Adrian's advice on how to manage the deployment process, and the future of Hadoop, the Spark processing engine and other big data technologies.