This article originally appeared on the BeyeNETWORK.
In Part 1 of this article, I looked at data integration from the perspective of the five main types of business processes that exist in organizations and reviewed the various ways the data associated with those processes can be stored and managed. In this article, I want to continue this discussion by taking a detailed look at storing and managing the data associated with master data management processes.
Master data management (MDM) is defined as, “A set of disciplines, applications, and technologies for harmonizing and managing the systems of entry and system of record for the data and metadata associated with the key business entities of an organization.”
Before proceeding, it is important to define the terms system of entry and system of record. A system of entry is an application system that is responsible for creating and maintaining the master data associated with one or more specific key business entities (customers, products, assets, etc.). The system of record is the application system that is responsible for publishing a single integrated view of enterprise-wide master data and ensuring its accuracy.
Dealing With Multiple Systems of Entry
In a fully compliant enterprise master data management environment, the systems of entry and the system of record are all the same system. This eliminates data redundancy and improves data quality and consistency. Most organizations, however, don’t have an enterprise master data management environment. Instead, master data is maintained by multiple disparate business transaction applications, and there are multiple systems of entry for any given type of master data such as customer data. To create a single integrated view of the master data, the system of record has to gather and integrate master data from each system of entry.
One of the biggest difficulties when there are multiple systems of entry is maintaining data quality and consistency. This would be easier if the systems of entry employed the same data quality services as the system of record. Again, this is not the case in most organizations because of the wide range of custom-built and packaged business transaction applications that have been deployed independently of each other.
To overcome master data quality and consistency issues, many companies have begun integrating dispersed master data into a low-latency operational data store (ODS) for business transaction (BTx) operational applications, and integrating historical detailed and summarized master data into a data warehouse for business intelligence (BI) reporting and analysis applications (see Figure 1). In some cases, the ODS becomes a data source for the data warehousing system and its associated BI applications. Creating a single view of the customer is a typical application of this type of integration.
Master data in an ODS and a data warehouse is intermixed with standard business transaction data such as account debits and credits, parts orders, and so forth.
Figure 1: Integrating Master Data for BI Processing
The master data provides the access paths to the standard business transaction data. It allows customer orders, for example, to be grouped and summarized by product number, salesperson, and region. In a star schema design, the master data is the dimensional data and the standard business transaction data is the fact data. This approach to integrating master data provides several benefits, but it doesn’t solve all of the problems companies have with master data.
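The dimension/fact relationship described above can be sketched in a few lines of Python. The tables, keys, and attribute names here are invented for illustration; in practice this grouping would be done in SQL against dimension and fact tables.

```python
# Minimal star-schema sketch: master data acts as the dimension table,
# standard business transaction data as the fact table. All values are
# hypothetical.

# Dimension: product master data, keyed by product number
products = {
    "P1": {"name": "Widget", "region": "East", "salesperson": "Ann"},
    "P2": {"name": "Gadget", "region": "West", "salesperson": "Bob"},
}

# Fact: standard business transaction data (customer orders)
orders = [
    {"product": "P1", "amount": 100},
    {"product": "P2", "amount": 250},
    {"product": "P1", "amount": 50},
]

def summarize_by(dimension_attr):
    """Group and summarize fact rows via an attribute of the dimension."""
    totals = {}
    for order in orders:
        key = products[order["product"]][dimension_attr]
        totals[key] = totals.get(key, 0) + order["amount"]
    return totals

print(summarize_by("region"))       # {'East': 150, 'West': 250}
print(summarize_by("salesperson"))  # {'Ann': 150, 'Bob': 250}
```

The point of the sketch is that the fact rows carry only a product key; every useful access path (region, salesperson) comes from the master data.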
Handling Master Data Complexity
Master data is not as volatile or voluminous as standard business transaction data. Companies don’t have hundreds of new customers or bill-of-materials changes per second. Telephone companies and large retail organizations have millions of customers, but even here the customer master data is still relatively small compared with the amount of standard business transaction data created by those customers.
The biggest problem with master data is complexity. The definition of a customer, for example, is very intricate in most organizations. Customer relationships and roles frequently vary depending on which part of the organization a customer is dealing with. These relationships and roles are constantly changing as companies merge and reorganize. Companies want to track these changes over time so that they can analyze the impact of reorganizations or product changes. They also want to understand the relationships between different types of master data, such as customer and products, for example, and how they affect the business.
Master data complexity is really a metadata issue, rather than a data problem. Master data business models are constantly being modified. Some companies have hundreds of model changes each month. They can try to track these changes using techniques such as slowly changing dimensions, but such approaches are simply band-aids that rarely allow companies to see the complete picture about any given key business entity or its relationships to other entities over time.
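The slowly-changing-dimension technique mentioned above can be sketched as a Type 2 dimension, where each change adds a new versioned row instead of overwriting the old one. The field names and dates below are illustrative, not drawn from any particular product.

```python
from datetime import date

# Type 2 slowly-changing-dimension sketch: every change to a customer's
# master data closes the current row and inserts a new versioned row,
# so history is preserved. All field names and values are hypothetical.

customer_dim = []  # versioned dimension rows

def apply_change(customer_id, attrs, effective):
    """Close the customer's current row (if any) and insert a new version."""
    for row in customer_dim:
        if row["customer_id"] == customer_id and row["end_date"] is None:
            row["end_date"] = effective
    customer_dim.append(
        {"customer_id": customer_id, **attrs,
         "start_date": effective, "end_date": None}
    )

def as_of(customer_id, when):
    """Return the version of the customer that was current on a given date."""
    for row in customer_dim:
        if (row["customer_id"] == customer_id
                and row["start_date"] <= when
                and (row["end_date"] is None or when < row["end_date"])):
            return row
    return None

apply_change("C1", {"region": "East"}, date(2006, 1, 1))
apply_change("C1", {"region": "West"}, date(2006, 6, 1))

print(as_of("C1", date(2006, 3, 1))["region"])  # East
print(as_of("C1", date(2006, 9, 1))["region"])  # West
```

Even this small sketch hints at the band-aid problem: each attribute change multiplies rows, and reconstructing how an entity and its relationships looked across many such changes quickly becomes unwieldy.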
The solution to master data complexity problems is to create a single enterprise master data management system that acts as both the system of record and system of entry for all master data. This system then supplies master data to business transaction and business intelligence applications as required.
Creating an enterprise MDM is a big undertaking for most organizations and requires several years to complete. In some cases, building a single MDM system may not even be possible because of the number of legacy systems that would have to be changed, for political reasons, or because some aspects of master data management have been outsourced. Nevertheless, all companies should have a long-term objective to build such a system and a plan to iteratively move in that direction.
To understand how to move iteratively toward an enterprise MDM system, we need to examine how master data is processed by operational business transaction applications and analyzed by business intelligence applications. We also need to consider the three time periods in the life of master data: the past, the present, and the future.
Supporting Current Operational Master Data
Let’s start with the present, or current, master data. This type of master data is mainly associated with operational processing applications that need to see a consistent and up-to-date view of the master data. There are three ways of providing such a view:
- Propagate and synchronize master data changes between systems of entry so that all systems are kept consistent.
- Consolidate the master data from multiple systems of entry into a single master data store.
- Move the systems of entry to a new enterprise MDM system.
The first approach is often performed using a master data hub that moves the data asynchronously between systems. The hub imposes a set of business rules that are designed to enforce data consistency and data quality. The issue here is coordinating the business rules between the different systems of entry and the hub. This approach works for a single business entity and a small number of systems of entry, but breaks down in larger environments because of the complexity involved.
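The hub approach can be sketched as follows. The systems of entry, the customer record, and the data-quality rules below are all hypothetical; a real hub would move the change asynchronously over messaging middleware rather than in-process as shown here.

```python
# Sketch of a master data hub propagating a customer change from one
# system of entry to the others, applying simple data-quality rules on
# the way through. System names, records, and rules are all invented.

systems_of_entry = {
    "crm":     {"C1": {"name": "ACME CORP", "country": "US"}},
    "billing": {"C1": {"name": "Acme Corp", "country": "USA"}},
}

def quality_rules(record):
    """Hub-enforced business rules: normalize name case and country codes."""
    record = dict(record)
    record["name"] = record["name"].title()
    record["country"] = {"USA": "US"}.get(record["country"], record["country"])
    return record

def propagate(source, customer_id, change):
    """Apply a change in the source system, then push the cleansed version
    to the other systems of entry (synchronously here, for brevity)."""
    systems_of_entry[source][customer_id].update(change)
    cleansed = quality_rules(systems_of_entry[source][customer_id])
    for store in systems_of_entry.values():
        store[customer_id] = dict(cleansed)

propagate("crm", "C1", {"country": "USA"})
print(systems_of_entry["billing"]["C1"])  # {'name': 'Acme Corp', 'country': 'US'}
```

The coordination problem described above shows up even in this toy: every rule in `quality_rules` must agree with whatever validation each system of entry performs locally, and the number of such agreements grows with each entity and system added.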
The second approach is an evolution of the ODS concept where the ODS is split into two components. One component contains standard business transaction data, and the other component contains master data. The master data component becomes a master data store (MDS) and the system of record for the master data. The MDS can be used to feed downstream systems, to support new operational applications, and to serve as a data source for business intelligence processing.
Techniques for building an ODS can similarly be employed to create an MDS. I have seen several companies split their existing ODS implementations into these two components. Some master data hub products also provide the option to create an MDS. The issue here, of course, is the latency of the data in the MDS compared with that in the systems of entry.
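Consolidating master data from multiple systems of entry into an MDS requires a survivorship rule for conflicting values. The sketch below uses one common choice, "freshest non-empty value wins per field"; the source systems, timestamps, and records are invented for illustration.

```python
from datetime import date

# Sketch of consolidating customer master data from multiple systems of
# entry into a single master data store (MDS) record. The survivorship
# rule here -- most recently updated source wins, field by field -- is
# one common option, not a prescription. All data is hypothetical.

sources = {
    "orders":  {"updated": date(2006, 5, 1),
                "record": {"name": "Acme Corp", "phone": None}},
    "support": {"updated": date(2006, 8, 1),
                "record": {"name": "ACME", "phone": "555-0100"}},
}

def consolidate(sources):
    """Build the MDS record field by field, preferring the freshest
    non-empty value from across the systems of entry."""
    ranked = sorted(sources.values(), key=lambda s: s["updated"], reverse=True)
    merged = {}
    for src in ranked:
        for field, value in src["record"].items():
            if field not in merged and value is not None:
                merged[field] = value
    return merged

print(consolidate(sources))  # {'name': 'ACME', 'phone': '555-0100'}
```

Note how the rule fills the missing phone number from the older source while taking the name from the newer one; disagreements like "Acme Corp" versus "ACME" are exactly where consolidation logic gets contentious in practice.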
The third option is to move the systems of entry to a new enterprise MDM system. When a system of entry cannot be moved to the MDM system, then one of the two other approaches can be used to synchronize the master data in systems of entry with the enterprise MDM system (see Figure 2).
Figure 2: Enterprise MDM Operational Processing
An evolutionary approach to MDM for operational processing is to start with approach 1 or 2, and then gradually evolve to approach 3.
Supporting Past and Future Master Data for Business Intelligence
In addition to maintaining the consistency of master data for operational processing, companies also want to use past, or historical, master data in business intelligence processing. Keeping a historical record of master data changes may be required, for example, for legislative reasons. This is especially true for financial master data. Historical master data can also be used to evaluate and analyze how organizational changes, product mixes, mergers, and so forth affect business operations and results.
The concept of future master data really applies to metadata and the master data business model. Companies may want to evaluate and forecast, for example, how business model changes could affect future business results. Reorganizing sales regions would be an example here. Forecasting works by applying different master data metamodels to existing master data to determine which model works best.
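The sales-region example above can be sketched by re-aggregating the same transaction data under alternative region models. Both models and all of the figures below are invented for illustration.

```python
# Sketch of "future" master data analysis: summarize the same existing
# fact data under the current and a proposed (hypothetical) sales-region
# model to see how a reorganization would redistribute results.

sales = [("Ann", 100), ("Bob", 250), ("Carol", 75)]  # salesperson, amount

current_model  = {"Ann": "East",  "Bob": "West",  "Carol": "East"}
proposed_model = {"Ann": "North", "Bob": "North", "Carol": "South"}

def totals_under(model):
    """Summarize the fact data under a given salesperson-to-region model."""
    out = {}
    for person, amount in sales:
        region = model[person]
        out[region] = out.get(region, 0) + amount
    return out

print(totals_under(current_model))   # {'East': 175, 'West': 250}
print(totals_under(proposed_model))  # {'North': 350, 'South': 75}
```

Only the master data model changes between the two runs; the transaction data is untouched, which is what makes this kind of what-if analysis a metadata exercise.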
The big question with handling past and future master data is whether the existing data warehousing environment can be used to handle this kind of master data processing and analysis. An alternative approach is to keep the historical master data in the MDM system instead. This is a controversial and complex discussion, and it is important that an evolutionary rather than revolutionary approach be used. We will examine this issue in Part 3 of this article.