This article originally appeared on the BeyeNETWORK.
In Part 1 of this series of articles, I discussed various approaches to storing and managing the data associated with the five main types of business processes that exist in organizations. In Part 2, I continued this discussion by taking a detailed looked at handling master data, focusing specifically on the relationship between master data and operational business transaction processes. In this third, and final, article, I want to discuss the relationship between master data and business intelligence (BI) analytical processing.
Master data in an operational business transaction environment documents current information about an organization’s key business entities such as customers, locations, and products. In a BI environment, the master data documents how this business information has changed over time. In the case of customer master data, the BI system may record the various addresses customers have had, how their credit rating has changed over time, and so on. This historical master data can be combined with other historical business transaction data to produce analytical reports. Information about customers and their purchases could be used, for example, to produce reports identifying key customers, or for illustrating how customer buying patterns vary over time based on their credit rating.
Master data can also be used in a BI environment for forecasting. If a company is proposing to reorganize its sales regions, then it could create a new set of master data to reflect the new sales organization and combine this new master data with historical sales data to predict the effect of the changes on sales.
Managing Operational Master Data: A Reprise
In a traditional operational business transaction processing environment, master data is usually dispersed across multiple operational systems and intermingled with other types of business transaction data. As I discussed in my last article, there are significant benefits to maintaining a consistent and up-to-date view of operational master data across business transaction systems. There are three approaches to doing this:
- Propagate and synchronize master data changes between operational systems of entry so that all business transaction systems are kept consistent.
- Consolidate the master data from multiple operational systems of entry into a single master data store, which then acts as the system of record.
- Move the operational master data systems of entry to a new enterprise master data management (MDM) system.
As I mentioned in Part 2 of this series, it will take time for organizations to move to a full enterprise MDM system. Companies may start with Approach 1, and then gradually implement Approaches 2 and 3. In some cases, a fully compliant enterprise MDM environment may not be possible for political or technology reasons, and compromises may have to allow exceptions to occur. The long-term objective, however, is to evolve to an enterprise MDM environment.
Managing Master Data for BI Analytical Processing
How historical master data is managed in a BI system for analytical purposes depends on how that master data is managed in operational business transaction systems. Let’s examine each of the three approaches for operational master data management, and then look at how each approach affects the way master data is handled in a BI system.
Approach 1: With Approach 1, the operational master data is kept consistent, but it is still dispersed across multiple operational systems and intermingled with other types of business transaction data. This intermingled master and business transaction data can be extracted and integrated into a data warehouse in the same way that any type of operational data is brought into the BI environment. During the data integration process, however, data reconciliation should be simpler because the master data sources are kept consistent in the operational environment.
Another option with Approach 1 is to separate the master data from other business transaction data as it flows into the BI environment, and consolidate it in a historical master data store (MDS). Many companies start their MDM initiatives by building a historical MDS. Sometimes with these initiatives, no attempt is made to implement Approach 1 in the operational environment, and the master data is instead reconciled during the building of the historical MDS.
Approach 2: With Approach 2, operational master data is consolidated into a single operational MDS. This approach can be used in conjunction with Approach 1 or independently of it. If Approach 1 is not used, then master data reconciliation happens during the building of the operational MDS.
An operational MDS contains current master data that has a zero or low latency compared with the data in the operational systems of entry from which it came. The operational MDS becomes the system of record for its associated master data. In some projects, the operational MDS is used to correct errors in source systems if Approach 1 is not being used to reconcile operational master data.
For BI processing, there are two options for handling the master data in the operational MDS. The first option is to use the operational MDS as a data source for the data warehouse. The second option is to use a single integrated data store for both operational and historical master data – this can be thought of as a hybrid MDS.
Approach 3: With Approach 3, the enterprise MDM system acts as both the system of entry and the system of record for the master data. The master data store in this environment can be an operational MDS that becomes a data source for the data warehouse, or it can be a hybrid operational and historical MDS.
Where Should Historical Master Data be Managed?
As we can see, historical master data can be managed in a historical MDS, in a hybrid historical and operational MDS, or in a data warehouse. Which of these options is best is a controversial and sometimes hotly debated topic. This controversy is made more confusing by the participants often failing to distinguish between logical and physical concepts. Also, the fact that many companies have used their data warehousing systems to provide a single view of a business entity, such as customers, leads to a position where the data warehouse is often seen as a good place to start a master data project. In the longer term, this position may prove to be a bad one.
Managing master data is a logical data problem, rather than a physical one. As I have already mentioned in Part 2 of this article, master metadata, rather than master data, is the real issue here. The models and definitions of most business entities are very complex and are constantly changing as businesses evolve. Take a look at a data model of a customer entity, and you will see what I mean. Organizations need to track not only current master metadata and data, but also how this metadata and data change over time. This is not only for analysis purposes, but also frequently for legal reasons as well. A facility to track and record relationships between different business entities is also a requirement for many organizations. The ability to relate customers to the products they buy is a good example here.
Managing complex master data hierarchies and relationships is best done outside of the data warehousing environment. Current data warehouse design techniques (such as slowly changing dimensions, for example) may be able to handle subsets of master data and master data relationships, but they are totally inadequate for supporting a complete picture of an organization’s master data entities and relationships, and how they change over time.
Maintaining master data outside of the data warehouse environment also makes it easier for organizations to evolve to a full enterprise MDM environment with its own master data store (see Figure 1). This store can be used for managing both current and historical master data. In this environment, BI applications access master data from the hybrid MDS and historical business transaction data from a data warehouse. For ease of access and performance, a subset of the MDS data may be copied at regular intervals into the data warehouse. In a multidimensional data warehouse, this master data subset would be used to populate the dimension tables of the warehouse.
Figure 1: An Enterprise MDM Environment
In the future, some companies may evolve to having a hybrid MDS, an operational data store (ODS) environment for integrating current business transaction data, and a set of data marts for storing summarized and historical business analytics. In such an environment, an enterprise data warehouse may not be required because outside of the master data environment, most business transaction data is created, rather than updated. ATM and POS business interactions, for example, create transaction data, but this data is rarely updated. If it is updated, the changes may not be required for analysis purposes. This topic, however, is the subject of another article. I also make this comment somewhat tongue in cheek because it is, of course, controversial. The important thing here is to think outside of the box. The current status quo for business intelligence processing may not necessarily always be the best approach in the future.