The following excerpt from Enterprise Knowledge Management: The Data Quality Approach, by David Loshin, is printed with permission from Morgan Kaufmann, a division of Elsevier. Copyright 2001. It is part of Chapter 2: Data governance: Information ownership policies and roles explained.
This section explores the issues regarding the distribution of ownership across an enterprise. The question of centralization versus decentralization is orthogonal to the responsibilities of ownership, yet it is distinctly intertwined with them as well. As in our ownership model, the critical point revolves around value.
In a centralized ownership model, a single entity (person or group) is responsible for all data ownership for the entire enterprise. Centralization implies that all ownership activities are coordinated from a single point of control, along with metadata, information sourcing, and so forth. The question for centralized ownership is what value it adds and whether that value offsets the costs associated with centralization. The costs include the increased management overhead, bureaucracy, and system integration, among others. The benefits include enterprise standardization for data and systems, the ability to make use of merged data for additional knowledge discovery, and increased leverage when dealing with external data suppliers.
In a decentralized model, the ownership roles are allocated to separate areas of interest. A decision to opt for decentralization implies that the value added from centralized control is more than offset by its associated costs. On the other hand, most organizations do not explicitly opt for decentralized control; instead, organizations evolve into it. Therefore, the real question is whether migrating from a decentralized ownership model to a centralized ownership model will increase the value of the enterprise knowledge base.
Finding the answer to this question is not simple. It involves a process of identifying the interested parties associated with all data sets, determining each party's interest, identifying the different roles associated with all data sets, and assigning roles and responsibilities to the right parties. All of these activities are embodied in an organization's data ownership policy, which incorporates all governance rules regarding data ownership and usage within the enterprise.
Creating a Data Ownership Policy
A data ownership policy is a tool used by the enterprise to establish all roles and responsibilities associated with data ownership and accountability. The goal of a data ownership policy is to finesse the kinds of complications discussed in Section 2.2, as well as to hash out the strict definitions of ownership as described in Section 2.4. The data ownership policy specifically defines the positions covering the data ownership responsibilities described in Section 2.3. At a minimum, a data ownership policy should enumerate the following features.
- The senior level managers supporting the enforcement of the policies enumerated
- All data sets covered under the policy
- The ownership model (in other words, how ownership is allocated or assigned within the enterprise) for each data set
- The roles associated with data ownership (and the associated reporting structure)
- The responsibilities of each role
- Dispute resolution processes
- Signatures of the senior level managers listed in the first item
A template for describing the ownership policy for a specific data set is shown in Figure 2.1.
|Data Set Name|
|Data Set Location|
FIGURE 2.1 Template for data ownership policy
These are the steps for defining a data ownership policy.
- Identify the interested parties or stakeholders associated with the enterprise data. This includes identifying the senior level managers that will support the enforcement of the policy.
- Catalog the data sets that are covered under the policy.
- Determine the ownership models in place and whether these are to continue or will be replaced.
- Determine the roles that are and are not in place. Assign the responsibilities to each role, and assign the roles to interested parties.
- Maintain a registry that keeps track of policies, data ownership, roles, responsibilities, and other relevant information.
Identifying the Stakeholders
All stakeholders in the information factory, including all the actors delineated in Section 2.1.1, should be considered interested parties. A stakeholder is anybody who expects to derive some benefit or value from the data, whether it is through the use of the data, the sale or license of the data, or beneficially through association with the data. For example, a business customer who uses the reports gets value through the data, a party that sells or licenses the data receives monetary compensation, and staff benefit from the jobs that may depend on continued data center operations and application development.
In a small enterprise, stakeholder identification can be relatively simple, but as the enterprise grows, the process can become extremely complex due to the degrees to which information is processed and disseminated. A good heuristic is to begin from the outside of the enterprise and work in. In other words, figure out who the end users are, look at the data they are using, and follow it backward through the information chain. While some business users may be outspoken in terms of staking their claim, others may be blind to the fact that there is any organizational process that generates the paper reports that land on their desks. Also, just because people receive reports on a periodic basis does not mean they ever look at the data provided.
The process of identifying the stakeholders will likely reveal areas of conflict with respect to data ownership. This is a particularly valuable part of the process, as it provides a guide to deciding how the ownership responsibilities are assigned.
Cataloging Data Sets
Once the stakeholders have been identified, the next step is to learn what data sets should fall under the ownership policy. The stakeholders should be interviewed to register the data sets with which they are associated and the degree to which each believes he or she has a stake in the data. The goal of this step is to create a metadatabase of data sets to use in the enforcement of the data ownership policies. This catalog should contain the name of the data set, the location of the data set, and the list of stakeholders associated with the data set. Eventually, the catalog will also maintain information about data ownership and responsibilities for the data set.
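As a rough illustration, the catalog described above could be sketched as a small in-memory structure. This is only a sketch under stated assumptions: the class name `CatalogEntry`, the helper `register_stake`, the field names, and the sample data set and departments are all hypothetical and do not come from the book.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CatalogEntry:
    """One data set in the catalog (all field names are illustrative)."""
    name: str
    location: str
    # Stakeholder -> the stake that party claims, as stated in the interview.
    stakeholders: Dict[str, str] = field(default_factory=dict)
    # Filled in later, once ownership has actually been assigned.
    owner: Optional[str] = None


def register_stake(catalog: Dict[str, CatalogEntry], name: str, location: str,
                   stakeholder: str, claimed_stake: str) -> CatalogEntry:
    """Record one interview result, creating the catalog entry if needed."""
    entry = catalog.setdefault(name, CatalogEntry(name=name, location=location))
    entry.stakeholders[stakeholder] = claimed_stake
    return entry


# Hypothetical interview results for a single data set.
catalog: Dict[str, CatalogEntry] = {}
register_stake(catalog, "customer_master", "crm_db.customers",
               "Sales", "primary user of the reports")
register_stake(catalog, "customer_master", "crm_db.customers",
               "Billing", "creates and updates records")

# Data sets claimed by more than one party deserve a closer look
# when ownership responsibilities are assigned.
shared = [e.name for e in catalog.values() if len(e.stakeholders) > 1]
```

Keeping the claimed stake next to each stakeholder preserves what was said in the interviews, which may help later when overlapping claims have to be resolved.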
Identifying and Assigning Roles
The next step is to determine the roles that are associated with each set of data in the enterprise and describe the responsibilities of each role. Here are some examples, although this list is by no means meant to be exhaustive.
- Chief Information Officer: The CIO is the chief holder of accountability for enterprise information and is responsible for decisions regarding the acquisition, storage, and use of data. He or she is the ultimate arbiter with respect to dispute resolution between areas of ownership and is the ultimate manager of the definition and enforcement of policies.
- Chief Knowledge Officer: The chief knowledge officer is responsible for managing the enterprise knowledge resource, which dictates and enforces the data sharing policies, as well as overseeing the general pooling of knowledge across the organization.
- Data Trustee: The data trustee manages information resources internal to the organization and manages relationships with data consumers and data suppliers, both internal and external.
- Policy Manager: The policy manager maintains the data ownership policy and negotiates any modifications or additions to the data ownership policy.
- Data Registrar: The data registrar is responsible for cataloging the data sets covered under the policy as well as the assignment of ownership, the definition of roles, and the determination of responsibilities and assignments of each role. The data registrar also maintains the data policy and notifies the policy manager if there are any required changes to the data ownership policy.
- Data Steward: The data steward manages all aspects of a subset of data with responsibility for integrity, accuracy, and privacy.
- Data Custodian: The data custodian manages access to data in accordance with access, security, and usage policies. He or she makes sure that no data consumer makes unauthorized use of accessed data.
- Data Administrator: The data administrator manages production database systems, including both the underlying hardware and the database software. The data administrator is responsible for all aspects related to the infrastructure needed for production availability of data.
- Security Administrator: The security administrator is responsible for the creation of and the enforcement of security and authentication policies and procedures.
- Director of Information Flow: The director of information flow is responsible for the management of data interfaces between processing stages, as well as acting as an arbiter with respect to conflicts associated with data flow interfaces.
- Director of Production Processing: The director of production processing manages production processing operations, transference of data from one production source to another, scheduling of processing, and diagnosis and resolution of production runtime failures.
- Director of Application Development: The director of application development manages requirements analysis, implementation, testing, and deployment of new functionality for eventual turnover to the production facility.
- Data Consumer: A data consumer is an authorized user that has been granted access rights to some data within the enterprise.
- Data Provider: A data provider is an accepted supplier of information into the system.
These roles will then be integrated into a reporting structure where there are clear lines of responsibility corresponding to degrees of ownership. Note that some responsibilities are assigned to multiple roles, causing "role overlap," whose governance must be integrated into the reporting structure as well. At this point, the senior manager responsible for information (typically a chief information officer) will then assign ownership roles and responsibilities to the different organizational stakeholders.
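One way to surface role overlap before fixing the reporting structure is to invert a role-to-responsibility mapping and look for responsibilities held by more than one role. The sketch below assumes such a mapping exists. The function name `find_role_overlap` and the responsibility strings are hypothetical, loosely paraphrased from the role descriptions above.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_role_overlap(role_responsibilities: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return each responsibility that is held by more than one role."""
    holders = defaultdict(set)
    for role, responsibilities in role_responsibilities.items():
        for responsibility in responsibilities:
            holders[responsibility].add(role)
    # Keep only the responsibilities shared by two or more roles.
    return {resp: sorted(roles) for resp, roles in holders.items() if len(roles) > 1}


# Illustrative wording only; a real policy would define these precisely.
roles = {
    "Policy Manager": {"maintain the data ownership policy",
                       "negotiate policy changes"},
    "Data Registrar": {"catalog data sets",
                       "maintain the data ownership policy"},
    "Data Custodian": {"manage access to data"},
}

overlap = find_role_overlap(roles)
```

Each entry in `overlap` names a responsibility whose governance would need an explicit place in the reporting structure.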
The ownership registry is created from the data catalog and the assignment of roles. It is the enterprise log that can be queried to determine who has the ultimate responsibility for each data set. The ownership registry should be accessible by all interested parties, especially when new data requirements arise or there is a conflict that needs resolution.
Management of the ownership registry requires keeping a finger on the pulse of the organization, as it is not unusual for employee turnover to affect the data management structure. In addition, as new data sets are brought under the governance of the data ownership policy, the decisions regarding the new data must be added to the registry.
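A minimal sketch of such a registry follows, assuming it is keyed by data set and role and that a departing employee's assignments are handed to a named successor. The class `OwnershipRegistry`, its method names, and the sample names are hypothetical and not taken from the book.

```python
from typing import Dict, Iterable, List, Optional, Tuple


class OwnershipRegistry:
    """Maps (data set, role) to the responsible party. Illustrative only."""

    def __init__(self, cataloged_data_sets: Iterable[str]) -> None:
        self._data_sets = set(cataloged_data_sets)
        self._assignments: Dict[Tuple[str, str], str] = {}

    def add_data_set(self, data_set: str) -> None:
        """Bring a new data set under the policy."""
        self._data_sets.add(data_set)

    def assign(self, data_set: str, role: str, party: str) -> None:
        """Record who holds a role for a data set that is in the catalog."""
        if data_set not in self._data_sets:
            raise KeyError(f"{data_set!r} is not in the data catalog")
        self._assignments[(data_set, role)] = party

    def who_holds(self, data_set: str, role: str) -> Optional[str]:
        """Answer the basic registry query; None means nobody is assigned."""
        return self._assignments.get((data_set, role))

    def reassign_party(self, departing: str, successor: str) -> List[Tuple[str, str]]:
        """Hand every assignment of a departing party to a successor."""
        affected = [key for key, party in self._assignments.items()
                    if party == departing]
        for key in affected:
            self._assignments[key] = successor
        return affected


registry = OwnershipRegistry(["customer_master"])
registry.assign("customer_master", "Data Steward", "J. Rivera")
steward_before = registry.who_holds("customer_master", "Data Steward")

# Employee turnover: the steward leaves and a successor takes over.
moved = registry.reassign_party("J. Rivera", "A. Chen")
steward_after = registry.who_holds("customer_master", "Data Steward")
```

Rejecting assignments for uncataloged data sets keeps the registry consistent with the catalog it is built from. A production registry would also need history and access control, which this sketch leaves out.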