This article originally appeared on the BeyeNETWORK.
A big consideration in archival processing is keeping up with software and hardware releases.
Suppose an organization looks upon archiving as merely the process of taking data and storing it elsewhere. To illustrate the point, suppose that today the organization archives two or three tables in Oracle8i. Occasionally, the organization goes back into the archive environment and brings some data out.
Everything is fine until the day in 2008 when the organization upgrades to the latest release of Oracle12k. Then, when they go to get data out of archival storage, they face a dilemma. The data has been stored in Oracle8i but needs to be read as Oracle12k. Oracle12k may or may not support Oracle8i. And as time passes, the chances of the latest version of Oracle supporting the “old” data in Oracle8i grow slimmer. Perhaps next year’s version of Oracle will be able to look backward a few releases. However, 20 or 30 years from now, the odds are excellent that Oracle8i and its formats and conventions will be long forgotten.
Now the organization faces a real dilemma. They have archival data stored in incompatible technology.
Just in case you are interested, it is not just Oracle that faces this problem of release compatibility over time. Indeed, all software vendors face this problem.
The first problem in trying to make archival data useful is tracing the formats and conventions of releases backward over time. That in itself is a daunting task. Certainly the consumer should not be saddled with this task; but even the software vendor faces it with trepidation.
And even if there is a path from release to release over time, following and implementing that path is, at the very least, very complex. At worst, it is so overwhelmingly complex that it just can’t be done.
For these reasons, it simply does not make sense to take archival data and leave it in the format and conventions of the technology in which it was originally stored. In every case, if data is to be archived, it needs to be freed from the technology in which it is stored on a day-to-day basis and placed in a simple, universal technology that will not cause anyone great problems in accessing the archival data at a later point in time.
As discussed in Part 1 of this series, the proper form for archival data is a time vault. A time vault does, in fact, contain archival data. However, the time vault is formatted and structured in the simplest manner possible. Simple flat files based on simple ASCII or EBCDIC data are the preferred design for the archival data. However it is done, the time vault should not refer to or depend on DBMS technology. There should be no connection to the vendor that originally housed the data that has been archived.
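To make the idea concrete, here is a minimal sketch in Python of moving one table out of a DBMS and into a flat ASCII file. It is an illustration under stated assumptions, not a prescribed time vault design: SQLite stands in for the source DBMS, and the function names (`export_table`, `read_archive`), the pipe delimiter, the `.dat` and `.layout.txt` file names, and the sample `orders` table are all hypothetical choices made for this example.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_table(conn, table, out_dir):
    """Write one table to a delimited ASCII file plus a plain-text layout file.

    Hypothetical helper: the names and file conventions are assumptions.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # The table name is supplied by trusted code here, not by user input.
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]

    data_path = out_dir / f"{table}.dat"
    # encoding="ascii" makes the export fail loudly on any non-ASCII character
    # instead of silently writing bytes a future reader may not understand.
    with open(data_path, "w", encoding="ascii", newline="") as f:
        writer = csv.writer(f, delimiter="|", lineterminator="\n")
        writer.writerows(cursor)  # NULL values are written as empty fields

    # The layout file describes the data in plain text, so the archive
    # explains itself without any reference to the original DBMS catalog.
    layout_path = out_dir / f"{table}.layout.txt"
    with open(layout_path, "w", encoding="ascii") as f:
        f.write(f"file: {data_path.name}\n")
        f.write("delimiter: |\n")
        for position, name in enumerate(columns, start=1):
            f.write(f"column {position}: {name}\n")
    return data_path, layout_path


def read_archive(data_path):
    """Read the archive back using only a text parser, with no DBMS involved."""
    with open(data_path, encoding="ascii", newline="") as f:
        return list(csv.reader(f, delimiter="|"))


if __name__ == "__main__":
    # SQLite is only a stand-in for whatever DBMS holds the operational data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "ACME", "19.95"), (2, "Globex", "250.00")],
    )
    with tempfile.TemporaryDirectory() as tmp:
        data_path, layout_path = export_table(conn, "orders", tmp)
        print(layout_path.read_text(encoding="ascii"))
        print(read_archive(data_path))
```

Note that everything comes back from the archive as text. Data types, keys, and indexes are not preserved by the file itself, which is why a sketch like this records at least the column names in a separate plain-text description.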
The result is a form of archival data that is a little less convenient to read and manipulate; however, the archival data is set to stand the test of time – and standing the test of time is the number one requirement for archival data.