This article originally appeared on the
Anyone currently using SAP R/3 is already familiar with the term master data. Many are wondering if SAP’s latest offering of Master Data Management (a.k.a. MDM) is yet another clever marketing ploy – a repackaging of an existing SAP capability, with a “new and improved” sticker pasted on the front.
Master data management is not a new term or concept – however, SAP’s approach is somewhat novel and is increasingly drawing the attention of business executives at SAP customers. Their IT staff is mentioning it, they are receiving e-mails linked to articles about it and they’re seeing advertisements for conferences about it.
As with most things SAP, confusion abounds and "experts" are falling out of the trees, each trying to position themselves for the coming "MDM boom."
In this article, we will examine the technologies leading to MDM (including SAP’s). Knowledge learned in this article can be used by:
- Decision makers wanting to know more about SAP’s MDM solution
- Anyone desiring an understanding of how data information technologies are related
- Anyone desiring an understanding of how "unmanaged" data impacts their organization
MDM, CDI, EIM, RDM, SRM, PIM, CRM – What the DF is Going On?
Does the sight of these three-letter acronyms make your eyes glaze over?
For most executives, this is the case. When they ask their IT department for explanations, many of their own people fall back on what they’ve read in brochures or seen in a vendor’s Web seminar.
Decision makers want straight answers to simple questions – without the marketing spin or deliberate obfuscation. They want to know not only what something is, but how it relates to their current challenges – in other words: Should I invest my time and resources investigating further toward a potential buy / no-buy decision?
In this article, we’re covering SAP’s Master Data Management (MDM) – but to do so properly, we must also include some of those other acronyms – since MDM (in general) is really an integral piece of the others.
Who Let that Elephant into the Room?
What drives data innovation within the commercial and not-for-profit sectors? The servicing of their end users, be they internal decision makers or customers. To do so accurately and competitively, you need to maintain and manage all associated data.
From the advent of databases to verticalized data marts, a recurring problem has been seen, but largely ignored – how do you get a sales mart to speak with a finance mart that needs to speak with a best-of-breed HR database? Getting them to physically speak (on the networking side) hasn’t been the overriding issue – it’s how to ensure the data in these separate data silos isn’t being duplicated (or duplicated unnoticed because of a misspelling), and how to enforce a common meaning for shared data fields. In other words, does a dollar = a dollar throughout the enterprise?
For example, the finance mart may use the term Customer – this may mean only external sales customers to the finance department. But the sales department may list in their sales mart multiple types of customers: internal customers (i.e., one division selling to another), external customers, sold-to customers, ship-to customers, etc. What was done to overcome this inherent drawback? Think of a data warehouse.
One of the first workable answers for large organizations (ones having multiple and siloed data marts) was to bring those marts together under one roof.
Exit the data mart – enter the data warehouse.
There are various ways of viewing what and how data warehouses came into being; but for the purposes of this discussion, let’s use the following: a data warehouse is essentially multiple data marts – all separate pieces making up the whole. Within the warehouse, you have your finance, sales, HR and other marts – but they are connected by a common data (table) structure and share common terms. Getting the data into the warehouse is accomplished via an ETL tool. An ETL application (extract, transform and load) takes data from a data source (the extract process), eliminates duplicate records and forces common field terms and data lengths, etc. (a portion of the transform process), and finally loads the data into the appropriate portion of the warehouse (sales info is still somewhat segregated from other data). However, you may now analyze the data, looking for trends, etc.
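The extract, transform and load steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source rows, the field names and the in-memory "warehouse" dictionary are hypothetical, and it does not represent the ETL tooling built into any SAP or vendor product.

```python
# Hypothetical rows from a sales mart; the field names are assumptions.
SALES_MART_ROWS = [
    {"cust_name": "Acme Corp ", "amount": "100.50"},
    {"cust_name": "acme corp", "amount": "100.50"},  # same sale, different spelling
    {"cust_name": "Globex", "amount": "75"},
]


def extract(rows):
    """Extract: read raw rows from the source (here, an in-memory list)."""
    return list(rows)


def transform(rows):
    """Transform: force common field names and formats, then drop duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        record = {
            "customer": row["cust_name"].strip().upper(),  # common field term
            "amount": round(float(row["amount"]), 2),      # common numeric format
        }
        key = (record["customer"], record["amount"])
        if key not in seen:
            seen.add(key)
            cleaned.append(record)
    return cleaned


def load(warehouse, area, records):
    """Load: append the cleaned records to the matching area of the warehouse."""
    warehouse.setdefault(area, []).extend(records)
    return warehouse


warehouse = load({}, "sales", transform(extract(SALES_MART_ROWS)))
```

After the run, the "sales" area holds two records rather than three, because the two spellings of the same customer collapse into one once the fields share a common format.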
Those having SAP R/3 already share in these benefits; the sales module (SD) speaks to the finance and controlling modules (FI/CO) and so on. They also share common names, customer master data and so on. For many, R/3 is all you need to compete.
Going one step further is SAP’s Business Warehouse (BW). You don’t need SAP R/3 to use SAP BW, but its ETL application (which is built into BW) was designed to specifically extract, transform and load R/3 data into the Business Warehouse. Extensive analytical reporting is commonplace with BW data – and many clients ETL transactional data into their BW systems (sometimes hourly) for standard reporting – something R/3 was designed to do.
Customer relationship management (CRM) is an outgrowth of today’s data warehouses. It brings all the relevant customer data into a single focus point. From here, users of CRM can spot trends as they happen (and preferably before they happen) and follow preset metrics for target goals and time-to-respond campaigns, etc. Extensive analytic reporting is vital to CRM proficiency.
And how did CRM come about? Well, CRM is an outgrowth of the next topic.
Enter CDI, DF, PIM, RDM and Finally MDM
CRM was not the first attempt to bring customer data into a coherent form; there was, and still is, customer data integration (CDI). CRM owes its existence to CDI.
The concept behind CDI was to bring a "single view" of a user’s customer base – in other words, integrate or consolidate customer data into a concise and reachable vantage point. This includes creating and maintaining an accurate, up-to-date and synchronized view of a customer across multiple business channels, business lines and enterprises. It strives to bring together data from multiple sources in various applications and databases.
The definition of CDI looks suspiciously like CRM. And, it also comes closer to MDM when you consider both CDI and CRM covet "master customer records."
The concept of CDI was the natural outgrowth of such organizations as marketing service firms. These firms employ massive databases and sell their data to various concerns, such as credit card companies. One of the largest and an early innovator in CDI was the business information company Experian – whose North American subsidiary is also a credit reporting agency. Early on, Gartner coined the term CDI and the market analysts who made use of this information codified it.
Companies such as Experian used CDI to leverage their processes to sort, aggregate and clean their vast databases of consumer and business information – turning the terabytes of data into usable information for direct sales, credit card use, etc. These newly aligned processes allowed their customers to quickly and more accurately recognize and identify potential and recurring customer tendencies, cyclical trends and buyer behavior. Now their clients not only knew who you were and where you lived, but also what you purchased, where it was purchased, how often, who within your family did the purchasing, how much you typically spent and, don’t forget, how you managed to pay your bills. Risk assessment has become a large part of CDI – therefore, you can also thank CDI for helping to bring you your credit score.
Source: CDI Institute
In late 2002 and early 2003, CRM came into existence – it was CDI refined – a 4th generation CDI (see Figure 1). CRM was, however, much more than marketing spin – it was a direct attempt to bring the philosophy of CDI to enterprise clients – in other words, to everyday businesses (not just those directly selling to the public). Now any organization, including governments, could put the benefits of CRM to use.
SAP was one of the first to introduce CRM into their SAP R/3 and BW systems. As stated, one tremendous advantage R/3 has held over its competition (even today) is the R/3 architecture – CRM requires a sophisticated customer master database, something SAP R/3 perfected in the early ’90s and with which SAP R/3 users are very familiar.
CDI is still used extensively throughout business – but how it is used (in contrast to MDM) will be examined shortly.
What About the Others?
Before we launch into MDM, let’s quickly review the other acronyms: SRM, DF, EIM, PIM and RDM.
What is SRM?
Supplier relationship management (SRM) is the flip side to CRM – dealing with suppliers versus customers. Within SAP, there was once a separate application called EBP (Enterprise Buyer Professional), but EBP now follows the branding focus of CRM, and hence the renaming to SRM EBP. To confuse things further, MySAP SRM also exists. This is the overarching SRM application – using SRM EBP for key processes feeding into MySAP SRM. Finally, SRM EBP is bundled with MySAP.com – whereas MySAP SRM requires additional licensing – but you should have already configured SRM EBP prior to advancing to MySAP SRM.
SRM, like CRM, gives users a 360º view, but of their suppliers (from a procurement source and service provider point of view). For example, MySAP SRM offers the ability to have "live" (albeit virtual) auctions – where the highest bidder can be found for a specific RFP or RFQ. You can also manage your reorder scheduling, leverage internal supplier assessments/reviews and search product catalogues (via the Open Catalogue Interface – OCI).
What is Data Federation (DF)?
Data federation may be considered a client-manufactured über view of your business data – rather than an actual data source. Its purpose is combining disparate data into a single, focused vantage point, allowing users to use data regardless of where the data originated. However, unlike a data warehouse, DF requires outside applications – and with some vendor solutions, dedicated hardware as well.
Data federation, in concept, is to take various data sources and data types and bring them into one place. Like a view, users don’t know where the data resides, nor do they need to know. Using their business intelligence reporting tools, they can query the data as if it were a single data source. Data federation uses aliases (or nicknames) within the metalayer – giving meaningful and consistent names to common fields (such as customer). It also refines the data into a common format. Data harmonization is the term used.
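To make the alias idea concrete, here is a minimal sketch of a metalayer that gives two sources a shared vocabulary and retrieves their rows only when a query is run. The source functions, field names and the `METALAYER` structure are all assumptions made for illustration; no particular federation product works exactly this way.

```python
# Two hypothetical sources that name the same concepts differently.
def fetch_finance():
    return [{"CUST_NM": "Acme", "BAL": 120.0}]


def fetch_sales():
    return [{"customer_name": "Globex", "open_balance": 80.0}]


# Metalayer: maps each source's field names to shared aliases (nicknames).
METALAYER = {
    "finance": {
        "fetch": fetch_finance,
        "aliases": {"CUST_NM": "customer", "BAL": "balance"},
    },
    "sales": {
        "fetch": fetch_sales,
        "aliases": {"customer_name": "customer", "open_balance": "balance"},
    },
}


def federated_query(fields):
    """Pull rows from every source at query time and expose them under the aliases."""
    results = []
    for source in METALAYER.values():
        aliases = source["aliases"]
        for row in source["fetch"]():
            renamed = {aliases[k]: v for k, v in row.items() if k in aliases}
            results.append({field: renamed.get(field) for field in fields})
    return results


rows = federated_query(["customer", "balance"])
```

The caller asks only for "customer" and "balance" and never sees which source a row came from, which is the point of the view-like behavior described above.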
Figure 2: Screenshot of an IBM WebSphere data federation modeling tool for developing a virtual schema
Also, data federation "retrieves" the data from the data source(s) on an as-needed basis. Due to this "real time" access of the data, users experience limited exposure to old data (the time between updates is called the latency period) – furthermore, all necessary data transformations are accomplished on the fly as well. This requires a sophisticated messaging and transformation system – the data federation engine. Some data federation applications or in-house projects use off-the-shelf components, typically applications they are already familiar with such as using Informatica as the ETL component (to retrieve and transform the data) and a standard BI application (such as Cognos, SAS or BusinessObjects) for generating the initial query and providing the formatted report.
Initially, data federation was looked upon and sometimes positioned as a possible replacement to the data warehouse. Today’s common use of data federation is to complement existing data warehouses – leaving the warehouse for specific aggregation and analytic functions, while using the data federator as the data access umbrella over the enterprise. Using today’s inherent relational database abilities to behave as a type of middleware, the use of wrappers is becoming prevalent in data federation. Wrappers permit access to specific types of data sources and respect data access protocols. Furthermore, they may contain information regarding the characteristics of the accessed data. Wrappers can be developed and provided by the data federation application firm, by third-party companies or developed in house.
What is EIM?
Enterprise information management (EIM) may also be known to you as enterprise content management (ECM).
Like CDI, EIM brings disparate data sources together, permitting users to have a single view of the data. However, where EIM diverges from CDI is its focus on data quality. Strict compliance to metadata policies is a prerequisite for EIM to be successfully employed.
Depending on the vendor or in-house approach, an EIM initiative may include the following: data quality assurance (i.e., cleaning the data), a metalayer (common names and formats), an ETL application (to retrieve and transform the data) and a BI application (this may include report distribution, report viewing and a security layer as well as a scheduling component).
Each of these components may already exist within an organization, but have yet to be integrated into an EIM solution. Those organizations wishing to do so will find themselves having to implement data governance – this means establishing procedures and policies to ensure all aspects of the organization understand and adhere to codified data entry and storage standards. Organizations not employing data governance will never achieve ROI on their EIM initiative and will find themselves with an excessive total cost of ownership (TCO). For EIM to succeed, the data must be trusted, up to date and accessible.
What is PIM?
Product information management (PIM) is analogous to other like applications/solutions: product resource management (PRM), product data management (PDM) and one well known to SAP customers, product life cycle management (MySAP PLM).
Coming to the marketplace in late 2004, PIM finds itself being used in the consumer goods and retail industries. It essentially brings together various product management tasks, coupled with media collaterals (such as brochures, flyers, online stores, etc). Like CRM and SRM, PIM brings relevant data sources together – regardless of where they may reside – for a single view of the information. This means the sales, engineering and marketing departments have access to the same product information, eliminating silos of functional data. This data consolidation helps firms be more responsive and provide up-to-date and accurate information to their customers.
This is especially true when time is critical for the success of a new product rollout – and go-to-market campaigns need tight, internal coordination and cooperation between such diverse areas as R&D, engineering, marketing, production and sales. Proper leveraging of PIM assists in helping everyone sing from the same page.
Where PIM is strongest is where the organization is spread over different time zones and across international borders. For example, an online computer integrator and direct seller could find employing a PIM solution advantageous. The same hardware could be marketed and offered within different sales areas (or channels); and each channel would receive the most up-to-date and accurate specifications, prices and marketing strategies.
Therefore, when a specific sales campaign begins, it could be rolled out via all sales channels; and the firm would know all targeted market segments and areas are being equally and accurately represented – so potential customers in Finland would be assured of receiving their online or hard copies of brochures in the proper language and with the proper sales slant. Secondly, the brochures would reflect the most current and accurate pricing. Product cataloging is another benefit PIM vendors tend to highlight – meaning sales/marketing documents and brochures are kept up to date, are relevant to current inventories and may be pre-positioned for projected new product introductions. For seasonal campaigns, such as "back to school," PIM can save uncounted hours of frustration and anxiety by leveraging its coordination capabilities. Once again, the buzzword “synchronization” comes into play.
Where MySAP PLM diverges from traditional PIM is its use of advanced analytical capabilities via trend capturing, portfolio management and its integration with other SAP applications such as SRM and CRM. PLM also enables data sharing with vendors and other supplier networks. Finally, MySAP PLM assists the MM/PP side of the SAP enterprise via its seamless integration within demand planning, purchasing, MRP, etc.
What is RDM?
Reference data management (RDM) is actually a type of master data management. It just refers to a very specific type of data – reference data.
So, what is reference data? One view of reference data includes lookup tables, validation tables and code tables (such as status flag settings and status codes). These kinds of reference data are used to validate and categorize corresponding data within other tables.
There are also different types of reference data – classification types, status types and association types. Classification types might include industry SIC code (for industry type). Status type might set a flag or be a specific code if a customer’s credit privileges are blocked, etc. Association data types are associated with other data (typically customer master data) and may include such data as state and county tax rates.
Reference data is used extensively within an organization’s business rules – especially to validate, select (through its categorization use) and to apply specific arguments (e.g., if a customer’s credit flag or code is set to "blocked," the system will disallow further credit advances). Reference data is also typically leveraged to ensure the data within a field is accurate (meets strict field criteria) and is not duplicate information.
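The credit-block rule mentioned above can be sketched as a lookup against a small status-code table. The codes, their descriptions and the function names are hypothetical, chosen only to show how a reference table both validates a field and drives a business rule.

```python
# Hypothetical reference (lookup) table of customer credit status codes.
CREDIT_STATUS_CODES = {
    "OK": "Credit allowed",
    "BLOCKED": "Credit privileges blocked",
}


def validate_status(code):
    """Validation use: reject any value not present in the reference table."""
    if code not in CREDIT_STATUS_CODES:
        raise ValueError(f"Unknown credit status code: {code!r}")
    return code


def may_extend_credit(customer):
    """Business rule: disallow further credit when the status is 'BLOCKED'."""
    return validate_status(customer["credit_status"]) != "BLOCKED"
```

A customer record carrying a code that is not in the table fails validation before the rule is ever evaluated, which is how reference data keeps a field's contents within its allowed values.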
Reference data is also commonly used not only throughout an organization, but outside it as well. External financial statements (such as legal reporting documents) and tax statements are commonly shared. The reference data enables one database to recognize and accept the financial data (provided it was previously configured to do so). This is where the use of reference data definitions comes into play. Definitions permit others outside your organization to understand the relationship of the reference data to the data within a particular field. This is also helpful when your in-house programmers update or add new reference data – they require a point of reference before they can modify or create new data and associated definitions.
Finally, certain types of reference data (and sometimes the actual data content) are commonly derived from external sources. For example, SIC codes (industry codes) and credit scores (such as D&B kickbacks) are developed entirely outside the enterprise. This information, sometimes via a subscribed service, is brought into the reference tables for use. One such subscription example is used for real-time currency exchange rates. Another may be the sales and use tax rates set within the association reference types – perhaps provided by a service such as VerTex.
What is MDM?
As with CDI, SRM, CRM and RDM, master data management brings various disparate data sources into one consistent viewpoint.
There are various types of master data: vendor, customer, material, HR, etc. One key characteristic of any type of master data is persistence. Master data is the type of data that, once you enter it (such as a customer’s name, address, etc.), typically doesn’t change – it is persistent data. The aforementioned reference data relies on this data’s persistency; otherwise, business rules and the like would have little meaning.
Therefore, master data is the central information source for a company. By integrating master data within a central database, organizations can avoid data redundancy. Of course, this means you can’t have master data sitting on a salesperson’s laptop or in the finance department’s data mart – all segregated from the enterprise system (that’s one noticeable advantage a true ERP system such as SAP will hold over functionally vertical best-of-breed systems).
Ventana Research reported in late 2006, “…21 percent of companies regard customer data as their top priority master data to manage.” Yet “…only 14 percent completely trust the quality of their key master data.” When it comes to SAP customers, the issue becomes even more clouded. “…around 80 percent of companies have both enterprise resource planning (ERP) and CRM systems that contain core customer data. More than 40 percent of companies have three or more instances of their ERP system (6 percent have more than 100 instances). Despite the fact that most companies see customer data as the most important type of data, this dispersion makes it unlikely that they have consistent customer data across all systems.”
Let’s take a quick look at SAP master data. Internally, SAP recognizes three types of data:
- Control data
- Transactional data
- Master data
As previously mentioned, master data is persistent data (essentially does not change), whereas the transactional data is volatile and is being constantly updated, deleted, etc.
SAP makes use of customer, vendor, material, HR and other types of master data. All SAP components access this data – and use "lookup" tables for validation and data efficacy – the reference data.
Here are some examples of SAP master data – in this case, Vendor Master. The following is used during configuration:
Now, we reach SAP’s MDM. But let’s look at what SAP’s MDM is versus CDI and most other data integration tools.
First of all, SAP’s MDM (like most other MDM offerings) is not an integration application – integration being essentially a one-time process. SAP’s MDM consists of several discrete applications, designed to work as a unified suite of applications and technologies. It’s a hybrid approach.
A key principle to remember is that MDM should be viewed and approached as the philosophy and practice of management. Management, by its very nature, is an ongoing process – one requiring constant husbanding. This means organizations undertaking an MDM initiative should put in place such entities as data governance boards or data quality committees. Structures such as these are required in order to support long-term and essentially non-ending tasks such as data quality assurance. Why institute such long-term requirements? Internal reorganizations and M&A activity are two reasons.
Figure 3: Master Data Operating Model
One key point to recognize is that master data management assumes you already have your master data consolidated – as opposed to master data integration, meaning your data is still scattered and not yet in a consolidated form. Therefore, unless you are using an integrated suite of MDM applications, such as SAP’s MDM offering, pre-implementation considerations must be addressed.
The bottom line regarding MDM in general is that master data management is a discipline, relying on techniques and discrete applications to ensure the data is properly aggregated and in a usable form (metadata included).
Initially, SAP’s MDM offering was seen as merely an extension to Business Warehouse (BW). However, early adopters of SAP’s MDM soon discovered it offered much more. But this point of view is understandable – SAP’s MDM is very transactional-centric in nature (since that is SAP’s inherent strength). The latest version is also bringing BI into the forefront. This is SAP’s second foray into MDM – the latest version (5.5 SP3) is the outgrowth of SAP’s July 2004 acquisition of A2i, a firm specializing in PIM (or content management) that offered the application xCat. Initially, SAP positioned the A2i addition to its MDM as MDME (for Extension) – releasing it while MDM 3.x was still in release. However, MDM 5.5 is actually a blending of the two versions.
The A2i acquisition (A2i is now part of SAP Labs) provided SAP with robust data aggregation and synchronization. It has also improved SAP’s Web-based electronic catalogs for both customer and procurement deployments.
Furthermore, the current SAP MDM provides a framework to define its reference master data. A particular item could be identified by a manufacturer, a supplier and a customer in different ways.
For example, the manufacturer may use the description “Item 21-A-ED,” while a supplier’s system has the value of “Center Bolt – Chrome25” for the same item. SAP is currently the only vendor offering this kind of data integration with a common user interface.
The following illustration shows the many-to-one key mapping capabilities of MDM 5.5.
Figure 4: MDM 5.5’s multiple key mapping – post object consolidation during
Different client systems, same master data object.
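The many-to-one key mapping idea can be sketched as a table that ties each client system's local key to a single consolidated master record. The two local descriptions are the ones used in the example above; the master ID "MDM-0001", the system labels and the function names are hypothetical, and this is not how MDM 5.5 stores its mappings internally.

```python
# Hypothetical mapping of (client system, local key) pairs to one master record ID.
KEY_MAPPING = {
    ("manufacturer", "Item 21-A-ED"): "MDM-0001",
    ("supplier", "Center Bolt – Chrome25"): "MDM-0001",
}


def resolve(system, local_key):
    """Return the consolidated master ID for a client system's local key, if mapped."""
    return KEY_MAPPING.get((system, local_key))


def local_keys_for(master_id):
    """List every (system, local key) pair that points at the given master record."""
    return sorted(pair for pair, mapped in KEY_MAPPING.items() if mapped == master_id)
```

Both client systems keep their own identifiers, yet a lookup from either side lands on the same master data object.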
Finally, the latest version offers common, role-based user interface access via its NetWeaver Enterprise Portal.
SAP’s latest version (5.5 SP3) consists of three distinct implementation phases:
- Loading master data
- Consolidating master data
- Distributing/harmonizing master data
Loading Master Data
The following graphic illustrates the various pieces involved in the loading process.
The client system master data is matched, normalized, cleansed and stored. It includes ID mapping for enterprise-wide analytics and reporting.
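A rough sketch of the match, normalize, cleanse and store sequence follows. It assumes that two records match when their normalized names are identical, which is a deliberate simplification (real matching is usually fuzzier), and the record layout and function names are hypothetical rather than taken from SAP's tooling.

```python
import re


def normalize(name):
    """Lowercase, drop punctuation and collapse whitespace so variants compare equal."""
    stripped = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(stripped.split())


def load_master_data(records):
    """Match records on their normalized name and store one entry per match group.

    Each stored entry keeps an ID mapping from client system to local ID,
    so reports can still be traced back to the original systems.
    """
    store = {}
    for system, local_id, name in records:
        key = normalize(name)
        entry = store.setdefault(key, {"name": name.strip(), "id_mapping": {}})
        entry["id_mapping"][system] = local_id
    return store


# Hypothetical client records: (system, local ID, customer name).
CLIENT_RECORDS = [
    ("CRM", "C-17", "Acme Corp."),
    ("ERP", "100045", "ACME  corp"),
    ("ERP", "100046", "Globex Inc"),
]
store = load_master_data(CLIENT_RECORDS)
```

The two spellings of the first customer end up as one stored record whose ID mapping lists both the CRM and the ERP identifier.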
MDM BI Master Data Consolidation
Master Data Harmonization
Data harmonization ensures a higher quality of master data within the connected business systems by distributing the consolidated data. It also permits locally relevant data to be accessed as well.
Figure 5: The Distribution of the Master Data
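Harmonization as described above, distributing consolidated values while leaving locally relevant data in place, might be sketched like this. Which fields count as globally managed, and all record contents and function names, are assumptions for illustration only.

```python
def harmonize(consolidated, local_record, global_fields):
    """Overwrite the globally managed fields; leave locally relevant fields untouched."""
    updated = dict(local_record)  # copy, so the caller's record is not modified
    for field in global_fields:
        if field in consolidated:
            updated[field] = consolidated[field]
    return updated


def distribute(consolidated, systems, global_fields=("name", "country")):
    """Send the consolidated master record to every connected system."""
    return {
        system: harmonize(consolidated, record, global_fields)
        for system, record in systems.items()
    }


# Hypothetical consolidated record and two connected client systems.
consolidated = {"name": "Acme Corp.", "country": "US"}
systems = {
    "CRM": {"name": "ACME corp", "country": "US", "sales_rep": "J. Doe"},
    "ERP": {"name": "Acme", "country": "USA", "payment_terms": "NET30"},
}
result = distribute(consolidated, systems)
```

Each system receives the same name and country, while fields such as the sales representative or payment terms stay as they were locally.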
Finally, your organization must manage the master data. As with any management project, proper planning will mitigate errors and increase your chances for success. Here are a few items any organization must contemplate prior to implementing an MDM strategy:
- Reach agreement on the long-term IT landscape and architecture.
- Build your internal data consolidation skill sets – reducing TCO where you can.
- Recognize the value of data consolidation – initiate a change management plan to ensure all users and stakeholders share this vision.
- Identify where real or potential master data resides – whether within applications or existing
- Slowly begin building a “single version of truth” regarding key entities such as customers, products, suppliers, employees, etc.
- Institute data governance and data quality assurance programs.
- Reach out to your external data source partners, vendors, suppliers and customers. Bring them into your vision of data management and let them assist you in composing the innovative process to differentiate your firm from others.
Via a central master data management system, organizations will realize the following:
- The benefits of data consolidation – your data is in one location.
- Central data ownership – this equates to quality assurance.
- Internal development of best practices and standards.
Summing It Up
Information management has come a long way in the last 20 years. The need for effective data management will not decrease, only grow. Managing enterprise and extra-enterprise data and transforming it into usable information are the keys to effective information management.
Organizations that regard obtaining a single "golden" view of their master data as a critical business initiative should follow these best practices. I strongly urge any organization to begin by building the necessary support models, both financial and market-driven. Key stakeholders and sponsors must have a clear understanding of where cost savings can be realized, where revenues may be enhanced and how they intend to manage their TCO.
Organizations that truly desire to consolidate and manage data – not just within their enterprise, but throughout their business ecosystem – should employ a holistic approach. This means seeking out your suppliers, vendors and customers and incorporating them into your long-range information technology and data governance plans.