This article originally appeared on the BeyeNETWORK.
In my recent article, I discussed how master and reference data are business problems first and foremost: simply, the business, not IT, is responsible for knowing, managing and keeping essential elements of the business (master data for customers, products, vendors, contracts, and reference data on organization and classification structures) correct and consistent. Now I’ll discuss how IT contributes to problems with master data.
Let me be blunt: decades of business and IT focus on applications and functionality with little or no regard to managing data separately from application systems have created an unmanaged data environment. In fact, it cannot even be called a data environment – data in nearly every business is simply an extension of its application systems. Addressing master data requires IT to separate data from the applications that use it.
This is not easy. Applications and their data structures are tightly coupled and lack the flexibility needed to separate them easily. This problem is aggravated by purchased software, where software vendors strive to protect their intellectual property. Anyone who has acquired a major back-office or front-office application knows that it is not easy to access a software system’s data stores to add, update, or delete data or alter physical data structures without affecting the integrity of the application. While a service-oriented architecture (SOA) is meant to decouple software units and data, it is clear that this is not happening fast enough to solve the master data problem, especially with purchased and proprietary software.
There are several reasons why tying a master data solution to a business application or platform is a risk:
Master data is used in many applications (according to a survey by TowerGroup, companies maintain master data separately in 11 or more source systems), so maintaining master data in one application system to serve all applications that use it will likely overload the throughput capability of the chosen application;
Different applications use different subsets of master data, so the chosen application will need to be modified to store all master data attributes needed by all the applications in the business;
Modifying an application may affect the ability to keep the application current with application upgrades and enhancements, especially if the application is purchased (I have on more than one occasion worked with a client in the unfortunate circumstance of being unable to upgrade to a new software release because of the time and cost associated with reapplying custom “enhancements” already implemented and used in the business – a situation that festers until the application itself must be, at great cost, replaced or upgraded with significant rework);
Tying master data to an application complicates the future of your master data, especially if the vendor goes out of business; if the application is superseded, sold, or simply dropped; or if the business unit that uses the application wants to replace it for business reasons.
It isn’t as if the tactic of making an application “strategic” has worked before. Many companies have had initiatives to develop a customer, product, or other “master file,” many more than once; but because data has not been separated from applications, the problem reappeared overnight.
IT contributes to problems with master data because IT treats a master data solution as another application. When master data is tackled, it is typically as a customer data integration (CDI) or product information management (PIM) application. This adds another application process, albeit one that can resolve some issues with subsets of master data, into the application infrastructure of IT.
The overall impact of this approach is problematic because implementing the solution as a new application continues the practice of correcting data after it has been entered into a transaction processing system. “Master data” constructed in this way is not always the correct, accurate, complete and, for all purposes, official information because it is updated only after one or another transaction processing system has already captured the change. A true master data solution ensures that the master data repository is the only place where master data is added, changed or deleted.
This aspect fundamentally changes the interaction between master data and applications that use it. Applications will not add, update or delete master data records. Instead, they will need to use an API to invoke a data delivery service for master data to perform add, update or delete functions in the master data repository. If master data is physically required to reside in an application system, it is copied directly from the master data repository, never added independently to the application’s data store.
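To make this interaction concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a reference design: the class names `MasterDataRepository` and `Application`, the method names, and the customer record are all hypothetical, and an in-memory dictionary stands in for a real repository and data delivery service.

```python
from copy import deepcopy


class MasterDataRepository:
    """Hypothetical repository: the only place master records are changed."""

    def __init__(self):
        self._records = {}

    def add(self, key, attributes):
        if key in self._records:
            raise KeyError(f"master record {key!r} already exists")
        self._records[key] = dict(attributes)

    def update(self, key, changes):
        self._records[key].update(changes)  # KeyError if the record is unknown

    def delete(self, key):
        del self._records[key]

    def snapshot(self):
        # Hand out copies so callers cannot modify the repository in place.
        return deepcopy(self._records)


class Application:
    """Hypothetical application holding a local copy of master data."""

    def __init__(self, name, repository):
        self.name = name
        self._repository = repository
        self.local_copy = {}

    def change_customer(self, key, changes):
        # The write goes to the repository, never to the local store.
        self._repository.update(key, changes)
        self.refresh()

    def refresh(self):
        # The local store is replaced wholesale by a copy of the repository.
        self.local_copy = self._repository.snapshot()


repository = MasterDataRepository()
repository.add("C-100", {"name": "Acme Ltd", "city": "Leeds"})

billing = Application("billing", repository)
shipping = Application("shipping", repository)
billing.refresh()
shipping.refresh()

billing.change_customer("C-100", {"city": "York"})
shipping.refresh()
```

In this sketch the billing application never writes to its own store. It asks the repository to make the change and then copies the result, so the shipping application sees the same customer record after its next refresh.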
While this will create application problems, it is necessary for success because master data is an enterprise issue and must be handled as such. When decisions are made at the division or department level, the master data problem cannot be solved by either the business or IT. Fundamentally, the process for making these decisions, called governance, needs to be addressed in order to deal with master data effectively.
However, master data for now is stuck inside a wide array of application systems where it is often inconsistent, incorrect and unreliable. Yes, distributed or federated master data is a data problem, and developing a comprehensive master data solution must tackle these data problems head on. This requires dealing with the quality and integration of data in application systems, a process which is well defined and supported by products, sources of business data external to the company, and other resources to supplement and improve a business’s in-house data.
These processes and technologies for data quality and integration have been promoted by vendors as the solution to master data management, but this is not the case. The full extent of the data problem for master data goes beyond data quality and integration – it begins with information architecture. Consider the following issues and questions:
Platform and Logistic Issues
Where will the master data repository reside? How will master data content be delivered to the applications, data warehouses and marts, and business users who need it? Will all technical platforms used for these be supported; and if not, how will master data inconsistencies be avoided?
Data and Content Issues
Which data attributes will be included in master data and which will be left in application systems? Why are the selected attributes considered master data (centralizing the data is not a valid reason)? Is unstructured data required?
Master Data and Application Integration Issues
When an application needs master data to be in its data store, how will that data be kept consistent with the master data repository? When applications “share” or transmit master data to one another, how will they be kept in sync across the business? When these applications have identifiers and keys different from each other and/or from the master data repository, how will the data be kept correct even when different keys are being used?
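One common way to approach the last question is a key cross-reference, sometimes called a crosswalk, that maps each application's own identifier to a single master key. The sketch below is only an illustration of that idea: the application names, identifiers, and the `resolve` and `local_keys_for` functions are hypothetical, and a real solution would also have to govern who may create or change a mapping.

```python
# Hypothetical crosswalk: (application name, local key) -> master key.
crosswalk = {
    ("billing", "CUST-0042"): "M-1",
    ("crm", "8831"): "M-1",
    ("billing", "CUST-0077"): "M-2",
}

# Hypothetical master records, keyed by master key.
master_records = {
    "M-1": {"name": "Acme Ltd"},
    "M-2": {"name": "Globex"},
}


def resolve(application, local_key):
    """Return the master key and record behind an application's identifier."""
    master_key = crosswalk.get((application, local_key))
    if master_key is None:
        raise LookupError(f"no master key mapped for {application}/{local_key}")
    return master_key, master_records[master_key]


def local_keys_for(master_key):
    """List every application identifier that points at one master record."""
    return sorted(pair for pair, mapped in crosswalk.items() if mapped == master_key)
```

Here `resolve("billing", "CUST-0042")` and `resolve("crm", "8831")` lead to the same master record, so the two applications can keep their own keys while still referring to one customer. An unmapped identifier raises an error rather than silently creating a new record.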
Data Delivery and Security Issues
Master data will be some of the most sensitive data in the business, so how will access to it be controlled? Will master data be encrypted? Will its transmission to an application be encrypted? What protocols will be used for master data delivery services and authorizations?
Data Audit, Control and Compliance Issues
What audit processes will be put in place to ensure that the master data repository is the official, correct version of master data and that no exceptions exist in application data stores? What processes and controls are in place to ensure that master data is only accessed by authorized people and applications? How is regulatory compliance, such as for financial or patient data, ensured for appropriate master data?
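As one illustration of what an audit for exceptions might involve, the sketch below compares an application's copy of master data with the repository and reports every difference. It is a simplified example under assumptions of my own: the function name `find_exceptions` and the dictionary layout are hypothetical, and only attributes held in the master record are checked, since applications may keep additional attributes that are not master data.

```python
def find_exceptions(master, application_store):
    """Compare an application's copy with the master repository.

    Returns a list of (key, attribute, master_value, application_value)
    tuples; None stands for a missing record or attribute.
    """
    exceptions = []
    for key in sorted(set(master) | set(application_store)):
        master_record = master.get(key)
        local_record = application_store.get(key)
        if master_record is None or local_record is None:
            # The record exists on one side only.
            exceptions.append((key, None, master_record, local_record))
            continue
        # Only master attributes are audited; extra local attributes are ignored.
        for attribute in sorted(master_record):
            if local_record.get(attribute) != master_record[attribute]:
                exceptions.append(
                    (key, attribute, master_record[attribute], local_record.get(attribute))
                )
    return exceptions
```

An empty result means the application store agrees with the repository on every master attribute. Anything else is an exception to investigate, whether it is a changed value, a record the application added on its own, or a master record the application never received.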
Master Data Management Issues
Who is accountable for managing master data? How is master data managed? How is the usage of master data by businesspeople, applications and IT controlled? How is master data kept current and correct? How is master data reconciled with corresponding data values in application data stores?
Only by developing answers, processes and structures for these issues and questions will a comprehensive master data solution be possible. Once master data has been architected into the information framework of the business, a successful master data solution can begin to be developed.
In my next article, I’ll conclude this series by discussing how to be successful with master data.
Until then, let me know what you’re thinking. You can e-mail me at firstname.lastname@example.org.