This article originally appeared on the BeyeNETWORK
One of the salient characteristics of the DW2.0 environment is that it supports both local metadata and enterprise (or global) metadata.
Local metadata is the metadata found “on the ground.” Local metadata is created, accessed and managed in the confines of the technology in which the actual data resides. Examples of local metadata include:
- BusinessObjects universe
- Informatica SuperGlue
- IBM DB2 DBMS directory
- Oracle DBMS directory
- Column headings that are defined in a spreadsheet or a report
Local metadata is necessary, and it is a standard part of the information processing environment. However, there is another level of metadata that is needed, and it is metadata at the enterprise, or global, level. Enterprise metadata resides in what can be termed an enterprise metadata repository (EMR).
The EMR offers a completely different perspective on metadata. It is at the enterprise level that naming conventions, common definitions, overlap, redundancy and missing metadata can be examined. Enterprise metadata gives the analyst the opportunity to look across all of the metadata of the enterprise. This high-level view allows the analyst to see things that would otherwise be very difficult to see.
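As a rough illustration of what looking across all of the metadata can reveal, the sketch below checks several local stores for names that occur in more than one place. The store names, column names and the `find_overlap` helper are all hypothetical, and exact case-insensitive matching is only an assumption; a real EMR would likely apply richer matching rules.

```python
from collections import defaultdict

# Hypothetical local metadata: store name -> names defined in that store.
local_stores = {
    "oracle_sales": ["cust_id", "order_dt", "amount"],
    "db2_billing": ["cust_id", "invoice_dt", "amount"],
    "report_headings": ["Customer", "Amount"],
}


def find_overlap(stores):
    """Return each name (compared case-insensitively) found in more than one store."""
    seen = defaultdict(set)
    for store, names in stores.items():
        for name in names:
            seen[name.lower()].add(store)
    # Keep only the names that at least two stores share.
    return {name: sorted(found) for name, found in seen.items() if len(found) > 1}


overlap = find_overlap(local_stores)
for name, stores in overlap.items():
    print(name, "appears in", ", ".join(stores))
```

No single local store could report this overlap, because each one sees only its own metadata.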
When it comes time for a new project or analysis, enterprise metadata can be invaluable, because it is the enterprise-wide view that allows a new analysis or search to be planned.
How does local metadata become enterprise metadata? Is local metadata merely loaded directly into an EMR? In truth, there are several activities that typically take place when local metadata is loaded into the enterprise level. Some of the activities that need to occur during this transformation process are:
- Extraneous local metadata is stripped off. In almost every local metadata store, there will be metadata that is of use and interest only to the local metadata technology. This extraneous local metadata does nothing to enhance the description or meaning of the local metadata; therefore, it is removed before the metadata is sent to the enterprise level.
- Synonyms are recognized. In many cases at the enterprise level, there will be one name for something that exists locally under different names. The synonyms can be recognized and reconciled. There are two basic modes for the reconciliation of synonyms: replacement and concatenation. In replacement mode, a recognized synonym is replaced with the enterprise name. In concatenation mode, the enterprise name is concatenated into the text where the synonym resides, so the local name is retained. Both techniques have their place as local metadata is sent to the enterprise level.
- Text is deleted or added. It may be useful to add or delete metadata text as local metadata is elevated to enterprise metadata.
- Homographs are resolved. On occasion, two (or more) local metadata stores will use the same term to mean different things. The distinction between the meanings needs to be made before the local metadata is passed to the enterprise level.
- Definitions can be added. A local metadata reference can be made into a global definition.
- Relationships can be defined. For example, a source and target relationship can be developed.
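Several of the activities listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LocalEntry` type, the rule tables (`EXTRANEOUS_KEYS`, `SYNONYMS`, `HOMOGRAPHS`) and the `to_enterprise` function are invented here, and concatenation is interpreted as keeping the local name with the enterprise name appended in parentheses.

```python
from dataclasses import dataclass, field


@dataclass
class LocalEntry:
    store: str                      # which local metadata store the entry came from
    name: str                       # the local name
    attributes: dict = field(default_factory=dict)


# Hypothetical rule tables; in practice these would be maintained by metadata analysts.
EXTRANEOUS_KEYS = {"tablespace", "internal_id"}   # of interest only to the local technology
SYNONYMS = {"cust_no": "customer_id", "client_id": "customer_id"}
HOMOGRAPHS = {                                    # same local term, different meanings
    ("db2_billing", "account"): "billing_account",
    ("oracle_sales", "account"): "sales_account",
}


def to_enterprise(entry, mode="replace"):
    """Prepare one local entry for the enterprise level."""
    # 1. Strip extraneous local metadata.
    attrs = {k: v for k, v in entry.attributes.items() if k not in EXTRANEOUS_KEYS}

    # 2. Resolve homographs by qualifying the term according to its store.
    name = HOMOGRAPHS.get((entry.store, entry.name), entry.name)

    # 3. Reconcile synonyms by replacement or by concatenation.
    if name in SYNONYMS:
        enterprise_name = SYNONYMS[name]
        if mode == "replace":
            name = enterprise_name
        elif mode == "concatenate":
            name = f"{name} ({enterprise_name})"
        else:
            raise ValueError(f"unknown mode: {mode}")

    return {"name": name, "attributes": attrs}


entry = LocalEntry("oracle_sales", "cust_no", {"tablespace": "TS1", "type": "INTEGER"})
print(to_enterprise(entry))
print(to_enterprise(entry, mode="concatenate"))
```

Replacement yields a uniform enterprise vocabulary, while concatenation preserves the trail back to the local name; which one fits depends on how the enterprise metadata will be used.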
In addition to these alterations of local metadata, there are some other things that need to be passed to the enterprise level. These include:
- The original source of the local metadata must be identified.
- The date the local metadata was passed to the enterprise level needs to be identified.
- The destination within the enterprise level must be identified. The EMR is divided into sections; for example, there may be one EMR section of metadata for integrated sector data and another EMR section for archival data. The local metadata must be aimed at the correct place in the EMR.
- The technology source of the local metadata must be identified – DB2, BusinessObjects, Informatica, etc.
These are a few of the modifications that need to be made as local metadata is passed to enterprise metadata.
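The identifying information that accompanies local metadata might be carried in a record along the following lines. This is a minimal sketch: the `EnterpriseRecord` type, the `elevate` function, the field names and the two section names are assumptions made for illustration, not a prescribed EMR layout.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical EMR sections, based on the two examples mentioned in the text.
VALID_SECTIONS = {"integrated", "archival"}


@dataclass(frozen=True)
class EnterpriseRecord:
    name: str          # the metadata item itself
    source: str        # original source of the local metadata
    technology: str    # e.g. "DB2", "BusinessObjects", "Informatica"
    emr_section: str   # destination section within the EMR
    loaded_on: date    # date the metadata was passed to the enterprise level


def elevate(name, source, technology, emr_section, loaded_on=None):
    """Attach provenance to a metadata item, rejecting unknown EMR sections."""
    if emr_section not in VALID_SECTIONS:
        raise ValueError(f"unknown EMR section: {emr_section}")
    return EnterpriseRecord(name, source, technology, emr_section,
                            loaded_on or date.today())


record = elevate("customer_id", "billing_db.customer", "DB2", "integrated")
print(record)
```

Validating the destination at load time keeps metadata from landing in the wrong section of the EMR.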
In 2006, I defined the next generation of data warehousing. The complete definition can be found on http://www.inmoncif.com/ under the section on DW2.0.