Data governance (DG) is the overall management of the availability, usability, integrity and security of data used in an enterprise. Businesses benefit from data governance because it ensures data is consistent and trustworthy. This is critical as more organizations rely on data analytics to make business decisions, optimize operations, create new products and services, and improve profitability.
A well-designed data governance program typically includes a governance office, a governing body or council, a defined set of procedures and a plan to execute those procedures. It also involves representatives from an organization's business operations, in addition to the IT and data management teams.
Why data governance matters
Without effective data governance, data inconsistencies in different systems across an organization might not get resolved. For example, customer names may be listed differently in sales, logistics and customer service systems. That could complicate data integration efforts and create data integrity issues that affect the accuracy of business intelligence, enterprise reporting and analytics applications.
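As a rough illustration of the customer-name problem described above, the following sketch compares how the same customer ID is recorded in three systems and reports the IDs whose names disagree. The system extracts, customer IDs and the `find_name_conflicts` helper are all hypothetical; a real program would read from the actual systems and would likely need fuzzier matching.

```python
from collections import defaultdict

# Hypothetical extracts: customer ID -> name as recorded in each system.
sales = {"C001": "Acme Corp.", "C002": "Globex Inc", "C003": "Initech"}
logistics = {"C001": "ACME Corporation", "C002": "Globex Inc", "C003": "Initech"}
customer_service = {"C001": "Acme Corp.", "C002": "Globex, Inc.", "C003": "Initech"}


def find_name_conflicts(systems):
    """Return {customer_id: {system: name}} for IDs whose recorded
    names are not identical in every system that holds them."""
    names_by_id = defaultdict(dict)
    for system_name, records in systems.items():
        for customer_id, name in records.items():
            names_by_id[customer_id][system_name] = name
    # Keep only the IDs that have more than one distinct spelling.
    return {
        customer_id: names
        for customer_id, names in names_by_id.items()
        if len(set(names.values())) > 1
    }


conflicts = find_name_conflicts(
    {"sales": sales, "logistics": logistics, "customer_service": customer_service}
)
for customer_id, names in sorted(conflicts.items()):
    print(customer_id, names)
```

A report like this only surfaces the disagreement; deciding which spelling becomes the standard is a governance decision, not a technical one.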
Poor data governance can also hamper regulatory compliance initiatives, which could cause problems for companies that need to comply with new data privacy and protection laws, such as the European Union's GDPR and the California Consumer Privacy Act (CCPA). An enterprise data governance program typically results in the development of common data definitions and internal data standards that are applied in all business systems, boosting data consistency for both business and compliance uses.
Data governance goals and benefits
A key goal of data governance is to break down data silos in an organization. Such silos commonly build up when individual business units deploy separate transaction processing systems without centralized coordination or an enterprise data architecture. Data governance aims to harmonize the data in those systems through a collaborative process, with stakeholders from the various business units participating.
Another data governance goal is to ensure data is used properly, both to avoid introducing data errors into systems and to block potential misuse of personal information and other sensitive data. That can be accomplished by creating standard policies and procedures on data usage, along with processes to monitor and enforce them on an ongoing basis.
Besides more accurate analytics and stronger regulatory compliance, the benefits of a successful data governance program include improved data quality, lower data management costs and increased access to critical data for data scientists, other analysts and business users. Ultimately, data governance can help drive better business decision-making, because executives have better information at their disposal.
Components of a data governance framework
A data governance framework consists of the rules, processes, organizational structures and technologies that are put into place as part of a governance program. In many cases, the organizational elements include a data governance office, with the program manager and other employees who help run the program; a data governance council made up of business executives that sets data usage rules and governance policies; and data stewards, whose responsibilities include overseeing data and ensuring the rules and policies approved by the council are implemented.
Data governance software tools that automate aspects of managing a governance program are available from various vendors. While they aren't a mandatory framework component, governance tools support collaborative workflows, policy development, data catalogs and other functions. They can also be used in conjunction with data quality, metadata management and master data management (MDM) tools.
Data governance implementation
The initial step in implementing a data governance framework involves identifying the owners or custodians of the different data assets across an enterprise and getting them or designated surrogates involved in the governance program. Also, workers with knowledge of particular data assets are appointed to handle the data stewardship role. That's a full-time job in some companies and a part-time position in others; there can also be a mix of IT and business data stewards.
Processes must then be defined to cover how the data will be stored, archived, backed up and protected from mishaps, theft or attacks. A set of standards and procedures must also be developed to define how authorized personnel may use the data. In addition, controls and audit procedures are needed to ensure ongoing compliance with internal data policies and external government regulations, and to help ensure data is used in a consistent manner across different enterprise applications.
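One way to picture such a control is as a small set of machine-checkable rules run against records on a schedule. The sketch below is only an assumption about how that might look: the `STANDARD` rules, the field names, the allowed country codes and the `audit` function are invented for illustration and do not come from any particular governance tool.

```python
import re

# Hypothetical allowed values under an internal data standard.
ALLOWED_COUNTRIES = {"US", "DE", "FR"}

# Hypothetical standard: field name -> function that returns True if valid.
STANDARD = {
    "customer_id": lambda v: isinstance(v, str)
    and re.fullmatch(r"C\d{3}", v) is not None,
    "country": lambda v: isinstance(v, str) and v in ALLOWED_COUNTRIES,
    "email": lambda v: isinstance(v, str) and "@" in v,
}


def audit(records):
    """Return a list of (record_index, field, problem) tuples."""
    violations = []
    for index, record in enumerate(records):
        for field, is_valid in STANDARD.items():
            if field not in record:
                violations.append((index, field, "missing"))
            elif not is_valid(record[field]):
                violations.append((index, field, "invalid"))
    return violations


records = [
    {"customer_id": "C001", "country": "US", "email": "ops@example.com"},
    {"customer_id": "1002", "country": "USA"},
]
violations = audit(records)
for violation in violations:
    print(violation)
```

Keeping the rules in one shared structure, rather than scattered across applications, is what lets the same standard be applied consistently wherever the data is used.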
The data governance teams formed to implement the policies and procedures for handling data can comprise business managers, data managers and IT staffers, as well as end users familiar with relevant data domains within the organization. Associations dedicated to promoting best practices in such data governance processes include the Data Governance Institute, DAMA International and the Data Governance Professionals Organization.
Data governance challenges
Often, the early steps in data governance efforts are the most difficult, because different parts of an organization typically hold diverging views of key enterprise data entities, such as customers or products. These differences must be resolved as part of the data governance process -- for example, by agreeing on common data definitions and formats.
To the extent that data governance may impose strictures on how data is handled, it can become controversial in organizations. A common concern among IT and data management teams is that they'll be seen as the "data police" by business users if they lead data governance programs. To promote user buy-in and avoid resistance to governance policies, experienced data governance managers recommend that programs be business-driven, with the governance council making the policy decisions.
Another common challenge is demonstrating the business value of a data governance program. Doing so requires the development of quantifiable metrics, particularly on data quality improvements -- for example, the number of data errors resolved and the revenue gains or cost savings that result from them. Ongoing communication with business users about the progress of a data governance program is also a must, via a combination of reports, email newsletters, workshops and other outreach methods.
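A minimal sketch of one such metric, assuming the program takes periodic snapshots of how many records were checked and how many contained errors. The `QualitySnapshot` class, the `summarize_improvement` function and the sample figures are hypothetical and are not drawn from any real program.

```python
from dataclasses import dataclass


@dataclass
class QualitySnapshot:
    """Counts from one data quality check (hypothetical structure)."""
    records_checked: int
    records_with_errors: int

    @property
    def error_rate(self) -> float:
        # Avoid dividing by zero when nothing was checked.
        if self.records_checked == 0:
            return 0.0
        return self.records_with_errors / self.records_checked


def summarize_improvement(before: QualitySnapshot, after: QualitySnapshot) -> dict:
    """Compare two snapshots; the raw difference in error counts is only
    meaningful if both snapshots cover a comparable set of records."""
    return {
        "errors_resolved": before.records_with_errors - after.records_with_errors,
        "error_rate_before": before.error_rate,
        "error_rate_after": after.error_rate,
    }


baseline = QualitySnapshot(records_checked=10_000, records_with_errors=800)
current = QualitySnapshot(records_checked=10_000, records_with_errors=200)
report = summarize_improvement(baseline, current)
print(report)
```

Translating resolved errors into revenue gains or cost savings requires business input, such as an agreed cost per bad record, which this sketch deliberately leaves out.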
Key data governance pillars
Data governance programs are underpinned by several other facets of the overall data management process. Most notably, those include the following:
- Data stewardship. An essential responsibility of the data steward is to be accountable for various portions of an organization's data. Teams of data stewards typically are formed to help guide and execute data governance implementations. These teams may include database administrators, business analysts and business personnel familiar with specific aspects of data within the organization. Data stewards work with other individuals positioned in the overall data lifecycle management process to help ensure data use conforms to a company's data governance policies.
- Data quality. Data quality improvement is the driving force behind most data governance activities. Accuracy, completeness and consistency across data sources are the crucial hallmarks of successful governance initiatives. Data scrubbing, also known as data cleansing, is a common data quality element; it identifies, correlates and removes duplicated instances of the same data points. Data scrubbing accounts for the various ways in which, for example, the same customer or product may be described. Data editors, data mining tools, data differencing utilities and data linking tools, as well as version control, workflow and project management systems, are included among software types that help organizations attain better data quality.
- Master data management. Data governance touches on nearly every aspect of data management, but one area of data management very closely associated with data governance processes is MDM. This is a discipline that establishes a master reference to ensure consistent use of data across large organizations. Metadata repositories, which hold data about data, are often used in establishing cross-group reference data in MDM programs. Product and customer data is a major emphasis of MDM systems. As with data governance generally, master data management projects can encounter controversy within organizations, as different product groups or lines of business promote different views on how to best present data.
Data governance use cases
Data governance is a particularly important component of mergers and acquisitions, business process management, legacy system modernization, financial and regulatory compliance, credit risk management, analytics, business intelligence applications, data warehouses and data lakes. As data uses expand and new technologies emerge, data governance likely will gain even wider application. For example, efforts are underway to apply data governance processes to the machine learning algorithms that organizations increasingly are using in analytics applications. Also, high-profile data breaches have made data security a more central part of data governance efforts, with calls for stronger data privacy leading to the inclusion of data protection and data privacy audits in governance programs. Compliance with privacy laws such as the GDPR and CCPA is another example of a new use case for data governance.
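The data scrubbing step described under data quality above can be sketched in a few lines. This is a deliberately simple illustration, assuming that duplicates differ only in case, punctuation and legal-form suffixes; the `SUFFIXES` list and both helper functions are hypothetical, and commercial data quality tools use far more sophisticated matching.

```python
import re
from collections import defaultdict

# Hypothetical legal-form suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc"}


def normalize(name):
    """Lowercase, replace punctuation with spaces and drop legal-form suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(word for word in words if word not in SUFFIXES)


def group_duplicates(names):
    """Return {normalized_name: [original spellings]} for names that
    appear under more than one record."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return {key: spellings for key, spellings in groups.items() if len(spellings) > 1}


records = ["Acme Corp.", "ACME Corporation", "Globex, Inc.", "Initech"]
duplicates = group_duplicates(records)
print(duplicates)
```

Grouping candidates rather than deleting them outright leaves the final merge decision to a data steward, in keeping with the stewardship role described earlier.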