During a presentation at the 2015 BI Leadership Summit in New York this month, consultant Joe Caserta asked how many attendees had active data governance programs in their organizations. About half of the 90 or so people in the audience raised their hands. Then, he asked how many were fully governing all of their data. Not a single hand went up. "Zero percent -- I think that's statistically significant," Caserta quipped.
Effectively governing the data used in BI and analytics applications was a big discussion topic at the conference, which was co-hosted by TechTarget's SearchBusinessAnalytics website and consultancy Eckerson Group. According to Caserta and other speakers, the data governance process is being challenged by two ongoing trends: the shift toward self-service BI tools that often are deployed by individual business units, and the growing emphasis on big data analytics initiatives that tap new types of data and technologies such as Hadoop and Apache Spark.
In his opening keynote, Wayne Eckerson, principal consultant at Eckerson Group, said that power, influence and budget money are flowing away from the BI teams in many companies, with business units and departments increasingly getting autonomy on analytics purchases and investments. "BI leaders are on the wrong side of the pendulum swing now," he noted.
But Eckerson added that BI managers and developers still have a crucial role to play -- and that leading data governance efforts should be part of that role. "In most organizations I go into, [data governance] is a huge pain point," he said. "And until companies can achieve a common data vocabulary, they're not going to get very far on effective analytics."
There's likely no going back to a fully centralized BI structure in organizations that have gone the self-service route. But Eckerson said that in addition to shepherding the development of universal data definitions and usage guidelines, BI managers can work to foster more unified analytics processes by creating a data catalog to help analysts and business users find the information they need. Another to-do item that he cited: communicating a big-picture view of different technology options, and their pros and cons.
Ideally, Eckerson said, BI teams will "be somewhere in the middle" of all the separate analytics activities in organizations, "and everyone will be happy." Otherwise, he warned, the end result could be "data silos and spreadmarts, and this chaotic environment where nothing adds up."
Proper balance required on governance
Striking the right balance on a BI data governance initiative can be tricky, though. Forrester Research analyst Boris Evelson, who also spoke at the conference, recommended that the bulk of governance operations be concentrated at the data-warehousing stage to ensure that information is clean and consistent before it's made available for analysis. In BI applications, he said, governance work should primarily involve monitoring data usage and adjusting policies if needed -- not putting up any big roadblocks for end users.
"If you prevent me from using a certain report, I have so many workarounds," Evelson said, pointing in particular to Excel spreadsheets -- still a common alternative to higher-level BI tools in many organizations.
Conference attendee Oana Garcia, a vice president and head U.S. data steward at Allianz Global Investors, said the investment management company is just getting started on a data governance program designed to make the use of data for analytics purposes more consistent. "We have different teams interpreting data in different ways," she said. "We need to make the data definitions more transparent."
Garcia, who is based in the company's New York office, added that the IT team is focused on making the data governance process useful to business users at Allianz. For example, instead of just writing policies on using data, it's looking to do things such as changing the column layouts in reports based on feedback from users. "We want to stay away from the theoretical and make it practical," she said.
BI project promotes data harmony
At TrueEx Group, a New York company that operates a financial exchange for interest rate-swap transactions, deploying BI software turned out to be a big step forward on data governance. In the past, tracking the data generated by TrueEx's trading platform was "a very manual process" that led to different departments using "slightly different numbers" for reporting and analysis, said David Hayman, the company's director of operations.
The BI tools, from software vendor Sisense, have helped to standardize things, according to Hayman. "The data is there, and it's solid," he said. "It was just a question of making it visible to users in a consistent view across the entire company."
Big data, including unstructured and semi-structured forms of information, adds new data governance issues to take into account. In Hadoop data lakes and other big data environments, "the way we think about data has to change," said Caserta, who is president of consultancy Caserta Concepts.
For example, he said that to maximize the business value of big data analytics applications, data scientists and other analysts need to be allowed to explore raw data sets in an unfettered way. Also, centralized master data management -- often a key component of the data governance process -- is more difficult with data captured from social networks and other "accidental systems," Caserta said.
But Caserta still puts a traditional data warehouse with fully governed and trustworthy data at the top of the BI and analytics pyramid. Information stored in a data warehouse "has to be governed, has to be secure and has to have metadata associated with it," he said. "Then, it can go out to the mass of business users via BI tools."