This article originally appeared on the BeyeNETWORK.
Metadata is an essential part of any environment, especially a data warehouse environment dedicated to decision-support processing. However, in most places metadata was never an integral part of first-generation data warehousing. In some cases, it was added to the data warehouse environment as a “bolt-on” attachment.
Now we have DW2.0, the architecture for the next generation of data warehousing; and in DW2.0, metadata is an essential and integral part of the environment. With a solid definition of metadata for the DW2.0 environment, there are new possibilities for processing and information analysis.
There are three major components of the metadata infrastructure in DW2.0:
- The actual data that metadata is built upon
- Local metadata
- The enterprise metadata repository (EMR)
The actual data that metadata is built upon refers to the sectors of data – the interactive sector, the integrated sector, the near-line sector and the archival sector – and the data found in those sectors. The data in those sectors is typically found in tables, databases, reports, spreadsheets, etc. A table found in DW2.0 is no different from one found in a typical first-generation data warehouse.
The second component of metadata found in DW2.0 is local metadata. Local metadata is metadata that resides with the actual data that it describes. There are many forms of local metadata: for example, a DBMS directory, a BusinessObjects universe, the column headings on a spreadsheet or the title, author and date of a document.
There are two salient characteristics of local metadata:
- The residency of the local metadata at the level or location of the actual data
- The ability to control or manage the local metadata locally
This means that the contents of the metadata are controlled locally. For example, BusinessObjects universes are controlled by the business analyst using BusinessObjects, the spreadsheet analyst controls the column headings of the spreadsheet on the desktop, and the database designer creates and updates database designs inside the DBMS directory. Nothing could be more natural or easier than local control of local metadata. It is the way that metadata is created and controlled today.
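To make the idea of local metadata concrete, here is a minimal Python sketch of the spreadsheet case. It assumes the spreadsheet has been exported as CSV text; the function name `column_headings` and the sample data are hypothetical and chosen only for illustration. The point is that the headings live in the same file as the data they describe.

```python
import csv
import io


def column_headings(csv_text):
    """Return the header row of a CSV export.

    The header row is the spreadsheet's local metadata: it is stored
    with the data and is edited by whoever owns the spreadsheet.
    """
    reader = csv.reader(io.StringIO(csv_text))
    # An empty export has no header row, so fall back to an empty list.
    return next(reader, [])


# Hypothetical spreadsheet export.
sheet = "region,quarter,revenue\nEMEA,Q1,1200\n"
print(column_headings(sheet))  # prints ['region', 'quarter', 'revenue']
```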
The third component of the DW2.0 metadata environment is the enterprise metadata repository (EMR). All metadata from throughout the enterprise is gathered in the EMR. Several activities occur as the metadata is sent from the local level to the enterprise level. These include:
- Metadata is gathered and edited. Unnecessary descriptive data is removed, and descriptive data is added where needed.
- The date of the metadata gathering is noted.
- The local source of the metadata is noted.
- The synonyms and homographs are resolved, either through replacement or through concatenation.
- The local metadata can be split so that some metadata goes to one destination and other metadata goes to another destination.
- The destination metadata can be organized as the analyst sees fit.
- The enterprise metadata can be accessed dynamically, as part of the analytical process.
- Relationships internal to the forms of metadata can be established, such as source and target relationships between metadata entities.
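Some of the activities listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any DW2.0 product: the record types `LocalMetadata` and `EnterpriseEntry`, the function `gather`, and all sample names are hypothetical. The sketch stamps each entry with its gathering date and local source, resolves synonyms through replacement, and resolves homographs through concatenation of the source name onto the term. It assumes the analyst supplies the synonym map and the set of homographs.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LocalMetadata:
    source: str       # hypothetical local source, e.g. "sales_dbms"
    term: str         # the name as it is used locally
    description: str


@dataclass(frozen=True)
class EnterpriseEntry:
    term: str          # enterprise-level name after resolution
    description: str
    local_source: str  # where the metadata came from
    gathered_on: date  # when it was gathered


def gather(local_items, gathered_on, synonyms=None, homographs=frozenset()):
    """Copy local metadata to the enterprise level.

    synonyms:   maps a local term to the preferred enterprise term
                (resolution through replacement).
    homographs: terms that mean different things in different sources;
                each is qualified with its source name
                (resolution through concatenation).
    """
    synonyms = synonyms or {}
    entries = []
    for item in local_items:
        if item.term in homographs:
            term = f"{item.source}.{item.term}"
        else:
            term = synonyms.get(item.term, item.term)
        entries.append(
            EnterpriseEntry(term, item.description.strip(), item.source, gathered_on)
        )
    return entries


# Hypothetical local metadata from three sources.
items = [
    LocalMetadata("sales_dbms", "cust_id", "Customer identifier"),
    LocalMetadata("crm_universe", "customer_number", "Customer identifier"),
    LocalMetadata("sales_dbms", "volume", "Units sold"),
    LocalMetadata("audio_sheet", "volume", "Loudness in dB"),
]
entries = gather(
    items,
    gathered_on=date(2024, 1, 15),
    synonyms={"cust_id": "customer_number"},
    homographs={"volume"},
)
for entry in entries:
    print(entry.term, "<-", entry.local_source)
```

After the call, both customer terms share the name `customer_number`, while the two meanings of `volume` are kept apart as `sales_dbms.volume` and `audio_sheet.volume`.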
It is noteworthy that metadata at the enterprise level is in a “load only” mode. If an analyst finds something incorrect in the metadata at the enterprise level, the correction must be made at the local level. Once the correction has been made at the local level, the local metadata is sent to the enterprise level again.
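The “load only” rule can be illustrated with a small Python sketch. The class `EnterpriseMetadataRepository` and its methods are hypothetical names invented for this example. The class deliberately offers no way to edit an entry in place: the only way to change enterprise metadata is to correct the local copy and load it again.

```python
class EnterpriseMetadataRepository:
    """Hypothetical load-only store for enterprise metadata."""

    def __init__(self):
        # Maps (source, term) to a description.
        self._entries = {}

    def load(self, source, local_metadata):
        """Replace everything previously loaded from `source`."""
        # Drop the old entries for this source, then copy in the new ones.
        self._entries = {
            key: value for key, value in self._entries.items() if key[0] != source
        }
        for term, description in local_metadata.items():
            self._entries[(source, term)] = description

    def lookup(self, source, term):
        """Read-only access for analysts."""
        return self._entries[(source, term)]


# Hypothetical local metadata containing a typo.
local = {"cust_id": "Custmer identifier"}
emr = EnterpriseMetadataRepository()
emr.load("sales_dbms", local)

# The correction is made locally, then the local metadata is sent again.
local["cust_id"] = "Customer identifier"
emr.load("sales_dbms", local)
print(emr.lookup("sales_dbms", "cust_id"))  # prints Customer identifier
```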
For a more detailed description of DW2.0 and its components, refer to http://www.inmoncif.com/.