Quality goals for data warehousing

Carried to an extreme, quality can be a real enemy to progress.

This article originally appeared on the BeyeNETWORK.

“A good plan today is better than a perfect plan tomorrow.”
                                                         – General George S. Patton

Everyone agrees that quality of work – especially the end product – is a very worthwhile goal. It is simply folly to argue otherwise. Who wants sloppy or error-filled products, such as systems, database designs and so forth? No one does. Therefore, we should do everything we can to produce quality work of the highest caliber.

While these words are true and inarguable, they can produce perverse results when carried to an extreme. Taken far enough, the one sure way to never make a mistake is to never do anything.

And that is exactly what some organizations – especially bureaucratic organizations – do. They try to do work by committee. The committee – in a desire to not make a mistake – does nothing. There is a room full of people – all of whom can say no – and no one who can say yes. But at least no one makes a mistake.

It is apparent that there is such a thing as carrying the goals of quality too far. Certainly, quality is a noble and worthwhile objective, but carried to an extreme, quality can be a real enemy to progress.

Nowhere is this dilemma more apparent than in the design and building of a data warehouse. The truth is that no data warehouse is built perfectly. The first iteration of a data warehouse design is guaranteed to be faulty. In the case of a data warehouse, it is better to err on the side of building it quickly than to hold out for building it perfectly.

Just why are data warehouses prone to being incomplete or less than perfect? There are several reasons that a data warehouse design falls short of perfection. These include:

  • Data warehouses – when built properly – are built iteratively. This means that a data warehouse is necessarily incomplete when built. It is not until the fourth or fifth iteration that the data warehouse will start to resemble anything remotely close to being complete.

  • The requirements for a data warehouse are built on sand, not on a foundation of bedrock. The requirements are not known in their entirety until the data warehouse is built. End users operate in a mode of discovery. They say, “Give me what I say I want, then I can tell you what I really want.” They don’t know what they want until they can see the possibilities. For this reason alone, a data warehouse will be built imperfectly the first time it is built.

There are probably a host of other reasons why a data warehouse is built with less than perfection. This reality is directly at odds with the quality zealot, who insists on having everything perfect before any real work can be done.

Please don’t take this discussion as proof that Bill Inmon doesn’t like quality. This article merely serves to point out the conflict that exists between the goal of quality/perfectionism and the need for progress in the arena of data warehouse development.


Bill Inmon

Bill is universally recognized as the father of the data warehouse. He has more than 36 years of database technology management experience and data warehouse design expertise. He has published more than 40 books and 1,000 articles on data warehousing and data management, and his books have been translated into nine languages. He is known globally for his data warehouse development seminars and has been a keynote speaker for many major computing associations. Bill can be reached at 303-681-6772.
