This excerpt from The Data Warehouse ETL Toolkit is from Chapter 1, Requirements, realities and architecture. Download this entire chapter for FREE.
What the Data Warehouse Is Not
What constitutes a data warehouse is often misunderstood. To this day, you can ask 10 experts to define a data warehouse, and you are likely to get 10 different responses. The biggest disparity usually falls in describing exactly what components are considered to be part of the data warehouse project. To clear up any misconceptions, anyone who is going to be part of a data warehouse team, especially on the ETL team, must know his or her boundaries.
The environment of a data warehouse includes several components, each with its own suite of designs, techniques, tools, and products. The most important thing to remember is that none of these things alone constitutes a data warehouse. The ETL system is a major component of the data warehouse, but many other components are required for a complete implementation. Throughout our experiences of implementing data warehouses, we've seen team members struggling with the same misconceptions
Requires Free Membership to View
When you register, you'll begin receiving targeted emails from my team of award-winning writers. Our goal is to keep you informed on the hottest data and information management trends today.
Hannah Smalltree, Editorial Director- A product. Contrary to many vendor claims, you cannot buy a data warehouse. A data
warehouse includes system analysis, data manipulation and cleansing, data movement, and finally
dimensional modeling and data access. No single product can achieve all of the tasks involved in
building a data warehouse.
- A language. One cannot learn to code a data warehouse in the way you learn to implement
XML, SQL, VB, or any other programming language. The data warehouse is composed of several
components, each likely to require one or more programming or data-specification languages.
- A project. A properly deployed data warehouse consists of many projects (and phases of
projects). Any attempt to deploy a data warehouse as a single project will almost certainly fail.
Successful data warehouses plan at the enterprise level yet deploy manageable dimensional data
marts. Each data mart is typically considered a separate project with its own timeline and budget.
A crucial factor is that each data mart contains conformed dimensions and standardized facts so
that each integrates into a single cohesive unit—the enterprise data warehouse. The enterprise data
warehouse evolves and grows as each data mart project is completed. A better way to think of a data
warehouse is as a process, not as a project.
- A data model. A data model alone does not make a data warehouse. Recall that the data
warehouse is a comprehensive process that, by definition, must include the ETL process. After all,
without data, even the best-designed data model is useless.
- A copy of your transaction system. A common mistake is to believe copying your operational system into a separate reporting system creates a data warehouse. Just as the data model alone does not create a data warehouse, neither does executing the data movement process without restructuring the data store.
Download
this entire chapter for FREE. (No registration required.)
Read other excerpts from data
management books in the Chapter Download Library.
This was first published in October 2005
Data Management Strategies for the CIO