
Data Warehousing: Our Great Debate Wraps Up

This last article summarizes our key debate points. Reader feedback and answers to the questions we have received will be published in the blog.

This article originally appeared on the BeyeNETWORK.

In summary for the Inmon side of the debate: The Inmon data warehouse design model relies on a relational or 3NF design foundation with data stored at the atomic level in the data warehouse, which is then aggregated and made accessible across the enterprise by exploration warehouses, data mining warehouses and OLAP databases. Ideally, it is built using the iterative or spiral development approach.

The CIF architecture embraces the star schema design for the data marts only, NOT for the design of the data warehouse. Inmon describes the Kimball approach as “brittle” because, in his opinion, the star schema is closely aligned to end-user requirements. Therefore, it does not produce a reusable form of data for the enterprise. In the Inmon approach, star schemas are only used for dependent data marts.
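The flow described above, with atomic data held in normalized tables and then aggregated into a dependent data mart, can be sketched roughly as follows. This is only an illustration: the table names, columns and rows are hypothetical and are not taken from Inmon's material.

```python
from collections import defaultdict

# Normalized (3NF-style) atomic tables. All names and rows are hypothetical.
segments = {100: "Retail", 200: "Wholesale"}
customers = {
    1: {"name": "Acme", "segment_id": 100},
    2: {"name": "Zenith", "segment_id": 200},
}
orders = [
    {"order_id": 1, "customer_id": 1, "amount": 40.0},
    {"order_id": 2, "customer_id": 1, "amount": 60.0},
    {"order_id": 3, "customer_id": 2, "amount": 25.0},
]


def build_segment_mart(orders, customers, segments):
    """Aggregate atomic orders into a small dependent mart keyed by segment."""
    totals = defaultdict(float)
    for order in orders:
        # Follow the foreign keys: order -> customer -> segment.
        segment_id = customers[order["customer_id"]]["segment_id"]
        totals[segments[segment_id]] += order["amount"]
    return dict(totals)


segment_mart = build_segment_mart(orders, customers, segments)
print(segment_mart)
```

The atomic tables stay untouched, so a different mart (or a mining extract) could be derived from the same rows later.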

With regard to Mr. Kimball’s BUS architecture, Inmon’s comment is: “As changes [by Kimball] are made, Ralph’s architecture becomes inexorably closer to the CIF, which has been in the public domain for decades.”

The author’s summary for the Kimball side of the debate is distilled from “Differences of Opinion” (previously cited). The multi-dimensional data warehouse is not a “bottom up” design. Kimball’s approach does not require a normalized data structure prior to dimensional presentation. Regarding what happens before the dimensional tables are loaded, the Kimball approach states that “the data structures required prior to dimensional presentation depend on the source data realities, target data model and anticipated transformation.”

While the Kimball approach doesn’t absolutely shun the normalization requirement of the Inmon approach for the atomic data, it does analyze and challenge whether the enterprise has the need for “both the redundant ETL development and data storage and a clear understanding of the implicit Inmon two-step throughput.” The presentation in the Kimball approach is only through data marts. The physical data warehouse that the Inmon approach requires is not needed.

The Kimball approach is presented as being faster because it argues that the data doesn’t need to go through ETL multiple times before being accessed by the business. The positions regarding the atomic level of data have been explored in the previous article in this series. However, Kimball’s point is: “If you make atomic data available in dimensional structures, you can always summarize the data ‘any which way.’”

Storing the atomic data in dimensional structures provides business users with the ability to get answers to immediate, and sometimes unpredictable, problems.
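As a rough illustration of the “any which way” point, the sketch below keeps fact rows at the atomic grain and rolls them up along whichever dimension attribute a question calls for. The dimension tables, keys and amounts are hypothetical and are not drawn from Kimball's writing.

```python
from collections import defaultdict

# Hypothetical dimension tables keyed by surrogate key.
product_dim = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Books"},
}
store_dim = {
    10: {"city": "Denver", "region": "West"},
    11: {"city": "Boston", "region": "East"},
}

# Atomic fact rows: one row per sale line, the lowest grain available.
sales_fact = [
    {"product_key": 1, "store_key": 10, "amount": 100.0},
    {"product_key": 2, "store_key": 10, "amount": 50.0},
    {"product_key": 3, "store_key": 11, "amount": 20.0},
    {"product_key": 1, "store_key": 11, "amount": 30.0},
]


def summarize(facts, dim, key_column, attribute):
    """Sum the amount measure grouped by one attribute of one dimension."""
    totals = defaultdict(float)
    for row in facts:
        totals[dim[row[key_column]][attribute]] += row["amount"]
    return dict(totals)


# The same atomic rows answer two different questions.
by_category = summarize(sales_fact, product_dim, "product_key", "category")
by_region = summarize(sales_fact, store_dim, "store_key", "region")
print(by_category)
print(by_region)
```

Because nothing was pre-aggregated, a new grouping only needs a different dimension attribute, not a reload of the data.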

According to the Kimball approach, this puts usable data in the hands of the business user making the query without requiring a data warehouse expert to drill into the different normalized structures for the data.

Mr. Kimball also points out that his approach uses the enterprise data warehouse BUS architecture “with common, conformed dimensions for integration and drill-across support. Conformed dimensions are the backbone of any enterprise approach …”
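A minimal sketch of what drill-across over a conformed dimension can look like: two separate fact tables share one product dimension, each is summarized by the same attribute, and the results are lined up side by side. All names and figures are assumptions made for illustration only.

```python
from collections import defaultdict

# One conformed product dimension shared by both marts (hypothetical data).
product_dim = {1: "Hardware", 2: "Hardware", 3: "Books"}

# Two fact tables from different business processes, same product keys.
sales_fact = [(1, 100.0), (2, 50.0), (3, 20.0)]  # (product_key, sales amount)
inventory_fact = [(1, 7), (2, 3), (3, 12)]       # (product_key, units on hand)


def total_by_category(facts):
    """Roll one fact table up to the conformed category attribute."""
    totals = defaultdict(float)
    for product_key, value in facts:
        totals[product_dim[product_key]] += value
    return totals


def drill_across(sales, inventory):
    """Summarize each fact table separately, then merge on the shared category."""
    sales_totals = total_by_category(sales)
    stock_totals = total_by_category(inventory)
    categories = sorted(set(sales_totals) | set(stock_totals))
    return {
        category: {
            "sales": sales_totals.get(category, 0.0),
            "on_hand": stock_totals.get(category, 0.0),
        }
        for category in categories
    }


report = drill_across(sales_fact, inventory_fact)
print(report)
```

The merge works only because both fact tables describe products with the same keys and the same category labels; without that shared dimension the two summaries could not be reliably aligned.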

In rebuttal, Mr. Inmon’s approach would accept Kimball’s premise stated above, “If you make the atomic data available in dimensional structures, you can always summarize the data ‘any which way,’” BUT only to the extent that the atomic data in the dimensional structures is being analyzed using ONLY multi-dimensional methods. The Inmon approach would contend that statistical, mining and even exploratory methods cannot be used if the atomic data is made available only in dimensional states.

In the Inmon approach, the main reason for storing the data in a 3NF fashion in the data warehouse is not to predispose the data to favor any particular analytical method.

As you can see, there are many more similarities than differences between the architectures once you get past the semantics.

Which is Better?

The answer is, of course, it depends: on how you cleanse your data; the level of granularity at which you choose to access it; the variety of analytical techniques you use to analyze the data; the time and resources you have to build it; and your prevailing corporate culture.

Whether you decide to “Punch In” to the Inmon Corporate Information Factory or “get on” the Kimball BUS, we hope you have enjoyed this series and we look forward to your comments. 

This concludes the “Great Debate.”

The author wishes to thank Bill Inmon for his source material and also to acknowledge Claudia Imhoff, Intelligent Solutions; Dan Meers, Knightsbridge Consulting; Joyce Montanari, Independent Consultant; Genia Neuschloss, Gavrosche; Derek Strauss, Gavrosche; and Bob Terdeman, Independent Consultant, for their assistance and insights on the Inmon approach.

While the author relied heavily on Mr. Kimball’s website, articles in Intelligent Enterprise and on the Design Tips from Kimball University to attempt to present his approach for this series, she does not expect him to reply.

The reply received by Business Intelligence Network from Mr. Kimball’s office appears below.

“As for the article series, we are constantly developing new content for our existing writing commitments,” was the Kimball office response. “We can’t review and edit/correct everything that’s written about dimensional modeling and the Kimball Methods to ensure accuracy. We’ve decided to pass on a review of your series rather than establishing any precedent for other similar requests.”

The author does welcome any rebuttal or clarification to the Kimball approach she has presented from readers who have adopted or employed that approach.

Thank you for your interest in “The Great Debate.”
