The data warehouse ETL toolkit

26 Oct 2005 | Written by Ralph Kimball and Joe Caserta; Reprinted with permission from John Wiley & Sons

This excerpt is from Chapter 1, Requirements, realities and architecture, of The Data Warehouse ETL Toolkit.

What the Data Warehouse Is Not

What constitutes a data warehouse is often misunderstood. To this day, you can ask 10 experts to define a data warehouse, and you are likely to get 10 different responses. The biggest disparity usually falls in describing exactly what components are considered to be part of the data warehouse project. To clear up any misconceptions, anyone who is going to be part of a data warehouse team, especially on the ETL team, must know his or her boundaries.

More information on this book
Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data
By Ralph Kimball, Joe Caserta
Published by John Wiley & Sons Incorporated
ISBN: 0764567578
Published: October 2004; 525 pp

The environment of a data warehouse includes several components, each with its own suite of designs, techniques, tools, and products. The most important thing to remember is that none of these things alone constitutes a data warehouse. The ETL system is a major component of the data warehouse, but many other components are required for a complete implementation. In our experience implementing data warehouses, we've seen team members struggle with the same misconceptions over and over again. The top five things the data warehouse is mistaken to be are as follows:

  1. A product. Contrary to many vendor claims, you cannot buy a data warehouse. A data warehouse includes system analysis, data manipulation and cleansing, data movement, and finally dimensional modeling and data access. No single product can achieve all of the tasks involved in building a data warehouse.

  2. A language. You cannot learn to code a data warehouse the way you learn XML, SQL, VB, or any other programming or data-specification language. The data warehouse is composed of several components, each likely to require one or more programming or data-specification languages.

  3. A project. A properly deployed data warehouse consists of many projects (and phases of projects). Any attempt to deploy a data warehouse as a single project will almost certainly fail. Successful data warehouses plan at the enterprise level yet deploy manageable dimensional data marts. Each data mart is typically considered a separate project with its own timeline and budget. A crucial factor is that each data mart contains conformed dimensions and standardized facts so that each integrates into a single cohesive unit—the enterprise data warehouse. The enterprise data warehouse evolves and grows as each data mart project is completed. A better way to think of a data warehouse is as a process, not as a project.

  4. A data model. A data model alone does not make a data warehouse. Recall that the data warehouse is a comprehensive process that, by definition, must include the ETL process. After all, without data, even the best-designed data model is useless.

  5. A copy of your transaction system. A common mistake is to believe copying your operational system into a separate reporting system creates a data warehouse. Just as the data model alone does not create a data warehouse, neither does executing the data movement process without restructuring the data store.
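
As a rough illustration of the conformed dimensions mentioned in item 3, the sketch below shows two separately delivered data marts that can be combined because both refer to the same product dimension. It is not taken from the book: the table contents, the names product_dim, sales_facts and inventory_facts, and the drill_across function are all hypothetical and chosen only to make the idea concrete.

```python
from collections import defaultdict

# Hypothetical conformed dimension: one product table shared by every data mart.
product_dim = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Books"},
}

# Hypothetical fact rows from two data marts built as separate projects.
# Both refer to products only through the shared product_key.
sales_facts = [
    {"product_key": 1, "units_sold": 10},
    {"product_key": 2, "units_sold": 5},
    {"product_key": 3, "units_sold": 7},
]
inventory_facts = [
    {"product_key": 1, "units_on_hand": 40},
    {"product_key": 2, "units_on_hand": 20},
    {"product_key": 3, "units_on_hand": 15},
]


def total_by_category(facts, measure):
    """Sum one measure per product category using the shared dimension."""
    totals = defaultdict(int)
    for row in facts:
        category = product_dim[row["product_key"]]["category"]
        totals[category] += row[measure]
    return dict(totals)


def drill_across():
    """Line up both marts on the shared category attribute."""
    sold = total_by_category(sales_facts, "units_sold")
    on_hand = total_by_category(inventory_facts, "units_on_hand")
    return {
        category: {
            "units_sold": sold.get(category, 0),
            "units_on_hand": on_hand.get(category, 0),
        }
        for category in sorted(set(sold) | set(on_hand))
    }


if __name__ == "__main__":
    for category, measures in drill_across().items():
        print(category, measures)
```

Because both fact tables use the same keys and the same category definitions, the results from each mart can be placed side by side without any reconciliation step. If each mart kept its own private product table, the categories would have no guarantee of matching and the combined report would not be trustworthy.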
