This article originally appeared on the BeyeNETWORK.
One premise of modern systems architecture is recognition of the life cycle of data. Data enters the corporation, is integrated, ages and then falls into disuse. As the probability of access falls, the data is moved to slower media.
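The placement rule above can be sketched in a few lines of Python. This is only an illustration: the tier names and the probability thresholds are hypothetical, chosen to make the idea concrete, and a real policy would set them from measured access patterns and storage costs.

```python
# Hypothetical storage tiers, ordered from fastest to slowest media.
# Each entry is (minimum access probability, tier name); the thresholds
# are illustrative assumptions, not recommended values.
TIERS = [
    (0.50, "high-performance disk"),
    (0.05, "near-line storage"),
    (0.00, "archive"),
]


def choose_tier(access_probability: float) -> str:
    """Return the fastest tier whose threshold the probability meets."""
    if not 0.0 <= access_probability <= 1.0:
        raise ValueError("access_probability must be between 0 and 1")
    for threshold, tier in TIERS:
        if access_probability >= threshold:
            return tier
    return TIERS[-1][1]  # not reached while the last threshold is 0.0
```

As data ages and its estimated probability of access drops, repeated calls to `choose_tier` would move it down the list toward slower media.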
This brings us to a discussion of archival data. In years past, archival data was a graveyard. I recall my first experience with archival data. I was working at a gas pipeline company in El Paso, Texas. I was a young system developer, just learning the ropes. I was always a curious sort of person, and one day I noticed that there was a door leading to a large room. I asked one of the people working in the area what was going on in the room. The person was kind enough to open the door and show me a rather large room – full of racks of magnetic tape.
This was the tape archive room. I asked what the tapes were for, and I was told that the room was where the company stored its archival information. I then said – “Wow. You mean I could go through here and process all sorts of data from the past?” No, they told me, I really couldn’t do that. I was told that when a magnetic tape entered the archive, it really couldn’t be used again. Indeed, we opened one of the canisters holding an old tape and dust fell out. Actually, it wasn’t dust. It was oxide falling off the tape, which, of course, rendered the tape unusable. Once the oxide falls off of a tape, no amount of reconstruction can make it readable once again.
So I asked, “If the tapes in archives can’t be used, what good are they?” I was told that every year, the company had to pass an audit. Every year, the auditor asked if the company kept archives, and every year they went to this room and showed the auditor the tapes. Then, the company got a check mark for having an archive facility.
Today, the whole notion of archival data has changed. If you can’t access archival data, why have it in the first place?
With that in mind, corporations hold data whose future probability of access is deemed to be zero, or very close to it. When looking at such data, the questions asked are: Should I archive data that will never be looked at again? Or, should I just throw it away? In most cases, the answer is, surprisingly, that the data should be archived even though its future probability of access is near zero. There actually are several good cases for archiving data that will, in all likelihood, never be used again.
What are the business reasons for archiving data that has a zero probability (or close to zero probability) of ever being accessed again?
One reason is litigation. Some data needs to be stored away for a rainy day on the off chance that the data may be useful in a lawsuit. Who knows what the future holds in terms of lawsuits? It is probably legitimate to say that there is no way of predicting future lawsuits. Thus, because the data might have some unknown future usage in helping the company defend itself, storing the data makes sense.
Another reason is legislative mandate. There are many rules and laws that mandate the storage and future accessibility of data. It may not make sense to us to store such data, but at one time it made sense to some legislator in some Senate building. Therefore, legislative mandate is another reason for storing data that has a very low probability of access.
There is another commonsense reason for storing data that has zero probability of access, and that reason is the cost of data reconstruction. If, by some chance, a record is needed in the future and the data has been stored, then it can be retrieved. However, if the record must be manually reconstructed, the cost of that reconstruction can be exorbitantly high. Or, in the worst case, the data is lost altogether and cannot be reconstructed even manually.
These, then, are the reasons for archiving data that appears to have no further use to the corporation: litigation, legislative mandate and the cost of reconstruction.