Business Information

Big data governance steps into IT spotlight as architectures expand

Effectively governing the data stored in different systems across big data architectures is emerging as one of the keys to a successful deployment of Hadoop and related technologies.

With big data architectures typically including a diverse mix of processing platforms and data stores, effectively managing and governing data across them all is becoming a must-do item. But big data governance processes are often still in the early stages of development, and the same goes for software tools that can help support budding governance efforts.

At the University of Texas MD Anderson Cancer Center, data governance is on the agenda for the next phase of its big data initiative, along with heightened data security measures. "Those are things we didn't focus on initially," said Bryan Lari, director of institutional analytics at MD Anderson, referring to the period when the IT team was working to deploy a Hadoop cluster that began running applications in March. But a solid governance strategy is "very important" to the ultimate success of the Houston-based healthcare organization's big data deployment, Lari added.

Big data governance is also a key element in managing a Hadoop-based architecture that 22,000 business users at General Electric Co.'s GE Power Services unit tap into via self-service business intelligence tools. "Once big data actually gets big, you've got to deal with it," said Don Perigo, chief enterprise architect at GE Power Services, which is headquartered in Baden, Switzerland.

However, the big data environment isn't as locked down as the BI and analytics system that preceded it. With the older system, "every time you wanted to do something, you had to ask the warden for permission," said Perigo, who is based in Atlanta. By comparison, the big data platform is governed more on "a Wild West model," he added. "People are free to do what they want, but there is a sheriff" keeping an eye on them. If necessary, the IT team can modify queries so they run more efficiently or shut down user accounts altogether.

The big data governance challenges only get bigger as Hadoop clusters are tied to NoSQL databases, traditional data warehouses and other data repositories. "The danger is that you end up in kind of a chaotic state, where no one has any real idea what's going on across all these data stores," said Mike Ferguson, managing director of U.K.-based consultancy Intelligent Business Strategies.

Information catalogs and metadata management tools could help control the chaos, Ferguson said at the 2016 Pacific Northwest BI Summit in Grants Pass, Ore. But existing tools aren't fully up to the task, he added. "And it's not small holes. There are big gaps."
