
Research systems as a source

Should you populate your data warehouse with data from your research data warehouse?

This article originally appeared on the BeyeNETWORK.

Most data warehouses are of the garden flower variety. A petunia here, a rose bush there. Most data warehouses are created for the purpose of supporting a rather standard and predictable set of queries that might come from accounting, finance, marketing, and so forth.

But every now and then, someone creates what can be termed a “research data warehouse.” The research data warehouse is created almost on a whim, usually to analyze one or two burning issues in the organization. Usually (but not always), its purpose is statistical analysis.

The analysts who use the research data warehouse are almost always handpicked. Sometimes they are actuaries. On other occasions, they are simply business users who are savvy in the ways of the business.

The research data warehouse is almost always project-based. This means that it has a defined beginning and a defined end of life, unlike a standard data warehouse, which is used perpetually.

An interesting question arises: once a research data warehouse has been created, can its data be placed in a standard data warehouse? In other words, can you create a research data warehouse, do some interesting analysis, and then load the analytical results into the standard data warehouse, thereby using the research data warehouse as a source of data for the standard data warehouse?

From a technological standpoint, of course you can do that. There is no technological reason why such a transfer of data cannot be made. However, there are some good reasons why – architecturally – such a practice is questionable.

Some of the reasons why you may not want to update a standard data warehouse from a research data warehouse include:

The finite life of the research data warehouse. If it is OK with you that at some point no more data will be coming from the research data warehouse, then go ahead and load its data into the standard data warehouse.

No discipline or rigor. There usually is no discipline or rigor applied to the data in a research data warehouse. The essence of a research data warehouse is that data is free to be calculated and manipulated in any manner possible. This may or may not make a difference to the standard data warehouse.

No documentation and no audit trail. For a research data warehouse, generally speaking, there is no audit trail or documentation showing how data got into it. If this is not a problem, then feel free to load its data into the standard data warehouse.

Subsets of data. Research data warehouses are notorious for operating only on subsets of data. If incomplete data from a research data warehouse is an issue, then you probably should not be updating the standard data warehouse from it.

There probably are many more considerations to take into account when deciding whether to update a standard data warehouse from a research data warehouse. The ones listed here merely scratch the surface.
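The four considerations listed above can be written down as a simple pre-load checklist. What follows is only a minimal sketch in Python, not a prescribed method: the `ResearchFeed` record, its field names, the `load_concerns` function, and the `min_coverage` threshold are all hypothetical names invented for illustration, and a real review would weigh many more factors.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ResearchFeed:
    """Hypothetical description of a data set offered by a research data warehouse."""
    name: str
    project_end: Optional[date]   # None means the research project has no planned end
    follows_standard_rules: bool  # were the standard warehouse's calculation rules applied?
    has_audit_trail: bool         # is it documented how the data got into the research warehouse?
    rows_covered: int             # rows the research analysis actually operated on
    rows_in_population: int       # rows in the full population the subset was drawn from


def load_concerns(feed: ResearchFeed, min_coverage: float = 1.0) -> List[str]:
    """Return the reasons, if any, to question loading this feed into the standard warehouse."""
    concerns = []
    if feed.project_end is not None:
        concerns.append(f"finite life: the feed stops after {feed.project_end.isoformat()}")
    if not feed.follows_standard_rules:
        concerns.append("no discipline or rigor: data was calculated freely")
    if not feed.has_audit_trail:
        concerns.append("no documentation or audit trail")
    # Guard against an empty population so the ratio is always defined.
    if feed.rows_in_population > 0:
        coverage = feed.rows_covered / feed.rows_in_population
    else:
        coverage = 0.0
    if coverage < min_coverage:
        concerns.append(f"subset of data: only {coverage:.0%} of the population is covered")
    return concerns


# Example with made-up values: a project-based feed built on a 20% sample.
feed = ResearchFeed(
    name="churn_scores",
    project_end=date(2025, 6, 30),
    follows_standard_rules=False,
    has_audit_trail=False,
    rows_covered=20_000,
    rows_in_population=100_000,
)
for concern in load_concerns(feed):
    print(concern)
```

An empty list does not mean the load is a good idea. It means only that none of these four objections applies, and whether any single concern is disqualifying remains a judgment for the owners of the standard data warehouse.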

So if you are dealing with a research data warehouse, you need to carefully consider whether you really want data from the research environment finding its way into your standard data warehouse. 

Bill Inmon

Bill is universally recognized as the father of the data warehouse. He has more than 36 years of database technology management experience and data warehouse design expertise. He has published more than 40 books and 1,000 articles on data warehousing and data management, and his books have been translated into nine languages. He is known globally for his data warehouse development seminars and has been a keynote speaker for many major computing associations. Bill can be reached at 303-681-6772.
