How to determine which data governance tool best meets your needs
A collection of articles that takes you from defining technology needs to purchasing options
Data, whether it's viewed as a potential liability to be managed or a valuable asset to be exploited, has entered the mainstream of enterprise activity. With organizations accumulating more and more data, and using it to drive more of their internal decision making, there's an increasing need to ensure its integrity so business executives and other users can be confident that they're working with accurate, consistent and high-quality data.
This is where data governance -- and with it, data governance software -- comes in.
Data governance isn't the actual management of data, but rather a formalized approach to managing a company's data assets. A data governance program establishes the rules of engagement for corporate data and information, in areas such as business alignment, policy, direction and oversight. It generally includes a governing body or council, plus a set of policies and procedures and a plan to execute them. The policies, which ideally should be jointly established by IT and the business, specify who is accountable for various aspects of an organization's data and the procedures that define how it is to be structured, stored and used.
Many companies implement data governance frameworks to address regulatory mandates, audit issues or a lack of data quality. Also, with network intrusions and cyberattacks on the rise, companies need to adopt more rigid oversight of their data for security reasons. Once primarily the province of compliance-driven industries such as financial services and healthcare, data governance programs are now being adopted by retailers, consumer product manufacturers and other businesses looking to better manage their growing volumes of business information.
The role of data governance software
While this series will look at the technology around data governance, that's just part of an overall governance strategy. It's often said that a successful data governance initiative comprises people, processes and technology, and data governance software is used to supplement and automate the processes put in place by organizations.
There's a tendency to think that data governance tools are just repurposed data management tools. While there's often some overlap in functionality, they aren't the same thing. For example, data governance tools don't create data layouts or, in general, executables. Instead, they support the various artifacts and moving parts of a governance program. In practice, that means providing traditional data management functionality, such as glossary development, along with sophisticated administration of rules and policies.
A data governance program typically has a large number of components, and creating and managing its various mechanisms can be time-consuming. For example, just getting a governance policy set up or a key data element defined can take weeks. As part of a master data management or big data effort, you may be intent on governing 150 data elements, but you quickly discover that there are many moving parts and steps in data lifecycles -- and working through the data synonyms, semantics and politics is cumbersome.
To be fair, many of the data management tools a company already uses can assist with data governance. For example, a data dictionary tool or even data modeling software can be adapted to handle a data glossary that manages the definition of business data elements.
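To make the idea of a data glossary concrete, here is a minimal sketch in Python of what such a tool tracks for each business data element. It is an illustration only, not how any particular product works: the `GlossaryTerm` and `Glossary` names, the fields and the sample entry are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GlossaryTerm:
    """One business data element (hypothetical structure for illustration)."""
    name: str
    definition: str
    steward: str  # person or role accountable for the element
    synonyms: set[str] = field(default_factory=set)


class Glossary:
    """In-memory glossary that resolves synonyms to a single governed term."""

    def __init__(self) -> None:
        self._terms: dict[str, GlossaryTerm] = {}
        self._aliases: dict[str, str] = {}  # lowercased name or synonym -> term key

    def add(self, term: GlossaryTerm) -> None:
        key = term.name.lower()
        names = {key} | {s.lower() for s in term.synonyms}
        # Reject a name or synonym that is already claimed by another term.
        clash = names.intersection(self._aliases)
        if clash:
            raise ValueError(f"already defined: {sorted(clash)}")
        self._terms[key] = term
        for alias in names:
            self._aliases[alias] = key

    def lookup(self, name: str) -> Optional[GlossaryTerm]:
        key = self._aliases.get(name.lower())
        return self._terms[key] if key is not None else None


# Sample entry; the element, steward and synonyms are made up.
glossary = Glossary()
glossary.add(GlossaryTerm(
    name="Customer ID",
    definition="Unique identifier assigned to a customer at account creation.",
    steward="Customer data steward",
    synonyms={"cust_id", "client number"},
))
found = glossary.lookup("CUST_ID")
print(found.name if found else "not in glossary")
```

Even this small sketch has to decide what happens when two teams claim the same synonym, which is the kind of question that becomes hard to handle once the glossary outgrows a single file.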
But the benefits of tools built expressly to support data governance are twofold:
- First, it's plain and simple productivity. Most organizations start with Excel spreadsheets and SharePoint lists to hold data definitions and information on data lineage and reference data relationships. In addition, they use Word, SharePoint or even wikis for documenting and updating governance policies. Typically, though, these organizations then realize that the complexities of their data landscape will soon translate into an unwieldy set of standards, policies, procedures, definitions and workflows that can't be easily managed in spreadsheets and documents.
- Second, the tools used in governance initiatives need to support ongoing data discovery and sustainability, as it will take deliberate effort to deal with cultural issues and keep the program moving. Inevitably, the gaps between a tool that only partly handles data governance functions and one designed specifically for data governance will become noticeable.
Some data governance platforms are adapted from prior incarnations of data management tools, while others are new and tuned to data governance from the ground up. While obtaining the functionality needed to support your data governance program is more important than buying a particular type of tool, in general, you can look at data governance software across the following three categories:
- Program and policy management, which includes workflow, policy administration and issue management.
- Data quality, which includes data issue management and data quality remediation.
- Tools based on traditional data management functions, such as tracking of data lineage; data repository, semantics and glossary management; reference and master data management; and business rules governance.
Within these categories, though, you really need to consider functionality. The following list details essential aspects of data governance programs that can be automated and managed using governance tools:
- Business alignment. It's key that a data governance program be aligned with an organization's strategy, goals, objectives and plans. Software functionality that supports taxonomies or hierarchies of business plans and processes is useful in fostering proper alignment.
- Policy development. Data governance is a policy creator; therefore, facilities for driving the establishment of governance principles, policies, standards, controls, rules and regulations are valuable technology features.
- Operational support. Data governance in action is an operational business program, which makes it important for program managers to be able to document and track governance roles and responsibilities, operating rules, workflows, data stewardship guidelines and organizational change management processes.
- Management of information data artifacts. These include data elements, models and glossaries -- managing them is the core requirement for data governance software.
- Controlling other artifacts. Documents and other materials that need to be stored for ongoing use or review as part of a data governance program can also be tracked and controlled with governance tools. Examples include manuals, charters and work products related to documenting roles, policies and data quality.
- Data management elements. Since data governance is oversight of data management, you will need to track the various types of data management activities and standards. Data quality, domain and master data lifecycles, reference data, data movement, and data lineage are all processes that data governance will oversee and manage.
- Collaborative workflow. Given the many moving parts and players involved, collaboration becomes a core theme in many data governance efforts. The movement of documents and facilitation of collaboration are ideal functions to automate within a data governance environment.
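One way to put this list to work during an evaluation is a simple weighted scorecard. The Python sketch below is only an illustration: the area keys mirror the seven functions above, but the weights, the 0-to-5 rating scale and the candidate names ("Tool A", "Tool B") are assumptions, not recommendations or real product assessments.

```python
# Functional areas from the list above; the weights are illustrative assumptions.
WEIGHTS = {
    "business_alignment": 2,
    "policy_development": 3,
    "operational_support": 3,
    "data_artifacts": 5,  # described above as the core requirement
    "other_artifacts": 1,
    "data_management_elements": 3,
    "collaborative_workflow": 3,
}


def weighted_score(ratings: dict[str, int]) -> float:
    """Return a score between 0 and 1 from per-area ratings on a 0-5 scale.

    Areas missing from `ratings` count as 0 (the tool does not cover them).
    """
    for area, rating in ratings.items():
        if area not in WEIGHTS:
            raise KeyError(f"unknown functional area: {area}")
        if not 0 <= rating <= 5:
            raise ValueError(f"rating for {area} must be between 0 and 5")
    best_possible = 5 * sum(WEIGHTS.values())
    total = sum(weight * ratings.get(area, 0) for area, weight in WEIGHTS.items())
    return total / best_possible


# Hypothetical candidates with made-up ratings.
candidates = {
    "Tool A": {
        "business_alignment": 3, "policy_development": 4,
        "operational_support": 4, "data_artifacts": 5,
        "other_artifacts": 2, "data_management_elements": 3,
        "collaborative_workflow": 4,
    },
    "Tool B": {"data_artifacts": 4, "data_management_elements": 5},
}

# Print candidates from highest to lowest score.
for name, ratings in sorted(candidates.items(),
                            key=lambda item: weighted_score(item[1]),
                            reverse=True):
    print(f"{name}: {weighted_score(ratings):.2f}")
```

The weights should come from your own program's priorities. A scorecard like this mainly helps make gaps visible, such as a tool that is strong on glossaries but offers nothing for policy or workflow.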
While there are numerous tools that can do some of those things, several products address multiple functions. The leading vendors based on market share include Adaptive, Alation, Collibra, Data3Sixty, Diaku, IBM, Informatica, Information Builders, SAP, SAS and Trillium.
Vendors such as SAS and Information Builders have their roots in the business intelligence market. Other companies, including IBM, Informatica and SAP, have created data governance suites from products they either developed or acquired. Pure-play vendors like Collibra and Trillium specialize in data governance only. Most products are sold as licensed software, and many have a cloud or software as a service option.
Business and IT need to work together
The biggest issue around these tools is managing their acquisition, use and administration. Many data governance programs are sponsored and run within business operations; however, they require IT support for the governance software. Often, IT will evaluate and buy a tool, install it, then tell the data governance program managers it's ready. This isn't an ideal approach. Alternatively, a business unit may acquire a tool without an understanding of technology architecture, also creating problems.
For a data governance program to be successful, it's crucial that there be engagement between IT and the involved business areas on the deployment of data governance software. If these tools aren't used effectively, chances are the governance program itself will founder. Data governance is the oversight aspect of enterprise information management -- the key here being enterprise. There's no local data governance.