Big data governance steps into IT spotlight as architectures expand

Effectively governing the data stored in different systems across big data architectures is emerging as one of the keys to a successful deployment of Hadoop and related technologies.

This article can also be found in the Premium Editorial Download: Business Information: Effective data visualization crystallizes a company's crystal ball.

With big data architectures typically including a diverse mix of processing platforms and data stores, effectively managing and governing data across them all is becoming a must-do item. But big data governance processes are often still in the early stages of development, and the same goes for software tools that can help support budding governance efforts.

At the University of Texas MD Anderson Cancer Center, data governance is part of the agenda for the next phase of its big data initiative, along with heightened data security measures. "Those are things we didn't focus on initially," said Bryan Lari, director of institutional analytics at MD Anderson, referring to the period when the IT team was working to deploy a Hadoop cluster that began running applications in March. But a solid governance strategy is "very important" to the ultimate success of the Houston-based healthcare organization's big data deployment, Lari added.

Big data governance is also a key element in managing a Hadoop-based architecture that 22,000 business users at General Electric Co.'s GE Power Services unit tap into via self-service business intelligence tools. "Once big data actually gets big, you've got to deal with it," said Don Perigo, chief enterprise architect at GE Power Services, which is headquartered in Baden, Switzerland.

The big data environment isn't as locked down as an earlier BI and analytics system was, however. "Every time you wanted to do something, you had to ask the warden for permission," said Perigo, who is based in Atlanta. By comparison, the big data platform is governed more on "a Wild West model," he added. "People are free to do what they want, but there is a sheriff" keeping an eye on them. If necessary, the IT team can modify queries so they run more efficiently or shut down user accounts altogether.

The big data governance challenges only get bigger as Hadoop clusters are tied to NoSQL databases, traditional data warehouses and other data repositories. "The danger is that you end up in kind of a chaotic state, where no one has any real idea what's going on across all these data stores," said Mike Ferguson, managing director of U.K.-based consultancy Intelligent Business Strategies.

Information catalogs and metadata management tools could help control the chaos, Ferguson said at the 2016 Pacific Northwest BI Summit in Grants Pass, Ore. But existing tools aren't fully up to the task, he added. "And it's not small holes. There are big gaps."


This was last published in October 2016
