This article originally appeared on the BeyeNETWORK.
Before the introduction of data warehousing, we relied on online databases for their online transaction processing capabilities, and it seemed as if everything was covered. Soon afterward, however, data warehouses were developed. With both online databases and data warehouses, we believed we had everything we needed.
But we still do not have everything.
People are now realizing there is another powerful technological wave behind data warehousing. This wave of processing is called the “near-line” or “archival” wave.
How are people discovering yet another major wave for information processing? There are several signs. One such sign is the accumulation of huge amounts of data. With the data warehouse, people began to collect detailed, historical data from a wide variety of sources. Historically, data warehouses were designed so that data could be added but not deleted.
Over time, those data warehouses collected an enormous amount of data. They continue to collect data today.
So what will we do with all of this data? Clearly, data is expensive to keep. Beyond storage itself, there are other expenses, such as acquisition and ongoing operational costs.
People began asking what happened to all of the accumulated data. Eventually, they discovered that only a fraction of the data was used.
Although these organizations incur increased expenses to capture and store data, the majority of this data is not used. Each year, these organizations will be adding even more unused data.
From both a business and a technological standpoint, this does not make sense. Besides the high costs involved, capturing and collecting massive amounts of unused data is wasteful.
Once organizations realize this, they must make important decisions about their rarely used data. They must ask themselves several questions. Should we continue capturing the data? Should we throw away the data? Or should we store the data somewhere less expensive, where it can be accessed if necessary? Because of compliance requirements, including but not limited to Sarbanes-Oxley and the Bank Secrecy Act, organizations must balance their online, near-line and archival storage requirements.
One option is to continue storing the unused data. However, companies that do so encounter numerous problems. The amount of money thrown away will increase every year. Storing unused data is certainly not an attractive proposition.
Another option is throwing away the unused data. Since it took so much time and money to capture the data, throwing it away is difficult to justify. And what if this data is needed in the future? Once the data is thrown away, it might not be possible to recreate it.
In comparison to these two options, the third choice appears the most promising. Safely and economically storing the unused data where it can later be retrieved is the most attractive choice. This is the benefit of near-line or archival storage.
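As a rough illustration of the third choice, the sketch below assigns data to online, near-line or archival storage according to how long it has gone unused. It is only a minimal sketch: the function name `choose_tier` and both thresholds are hypothetical, chosen for illustration, and a real policy would also have to reflect the compliance requirements mentioned above.

```python
from datetime import date, timedelta

# Hypothetical thresholds, assumed for illustration only.
NEAR_LINE_AFTER = timedelta(days=365)      # unused for a year or more
ARCHIVE_AFTER = timedelta(days=3 * 365)    # unused for roughly three years or more


def choose_tier(last_accessed: date, today: date) -> str:
    """Return 'online', 'near-line' or 'archival' based on how long the data has been idle."""
    idle = today - last_accessed
    if idle >= ARCHIVE_AFTER:
        return "archival"
    if idle >= NEAR_LINE_AFTER:
        return "near-line"
    return "online"


if __name__ == "__main__":
    today = date(2024, 6, 1)
    for last_used in (date(2024, 1, 1), date(2022, 1, 1), date(2020, 1, 1)):
        print(last_used, "->", choose_tier(last_used, today))
```

Whichever tier the data lands in, it is kept rather than thrown away, so it can still be retrieved if it is needed later.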
Bill Inmon is universally recognized as the father of the data warehouse. He has more than 36 years of database technology management experience and data warehouse design expertise. He has published more than 40 books and 1,000 articles on data warehousing and data management, and his books have been translated into nine languages. He is known globally for his data warehouse development seminars and has been a keynote speaker for many major computing associations. Bill can be reached at 303-681-6772.