VSAM to DB2: Before you convert

There are reasons to convert from VSAM to DB2 -- and there are reasons not to. Find out what Craig Mullins has to say about a VSAM to DB2 migration project.

How do I convert from VSAM to DB2? Are there any free tools available or can we do this by doing some minor programming?

First of all, let's quickly discuss the difference between VSAM and DB2. VSAM is a file access method; the acronym stands for Virtual Storage Access Method. Keyed VSAM files offer faster access to individual records than flat files because they are indexed, and the index is organized as a B+tree. The index lets a program go straight to a record instead of reading every record until the right one is found. DB2, by contrast, is a relational database management system that is accessed through SQL. For more details comparing DB2 to VSAM, consult an article I wrote on the subject.

Now why would one need to convert from VSAM to DB2? If your application is running fine today and you do not have any additional data access or manipulation requirements for the VSAM data, then there is no need to convert the VSAM data to DB2. The predominant reason that people convert to DB2 is to ease data access.

It is much easier to write ad hoc SQL against DB2 tables than it is to code programs that access VSAM files. The time saved by not having to code the VSAM programs can cost-justify a conversion of the data to DB2 if your reporting or manipulation needs are varied and frequent. But you should not do a "simple" conversion. By this I mean that the data needs to be analyzed in detail and a logical data model needs to be created. VSAM files quite often are not normalized (a single record layout may hold repeating groups or several record types) and do not lend themselves to a "quick and dirty" conversion to DB2 (or any other relational DBMS). If you simply convert a VSAM file (records) to a DB2 table (rows), performance will surely suffer because DB2 is not meant to be used the same way as VSAM.
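To make the normalization point concrete, here is a minimal sketch in Python. The record layout (a customer key, a name, and three repeating phone slots) is hypothetical and stands in for whatever your copybook defines. Real VSAM data would usually be EBCDIC with packed decimal fields, which this sketch ignores.

```python
from typing import NamedTuple

# Hypothetical fixed-width layout, standing in for a COBOL copybook:
#   cust_id   columns 0-5
#   name      columns 6-25
#   phone(3)  columns 26-61, three 12-character slots (a repeating group)

class Customer(NamedTuple):
    cust_id: str
    name: str

class Phone(NamedTuple):
    cust_id: str
    seq: int
    number: str

def normalize(record: str) -> tuple[Customer, list[Phone]]:
    """Split one denormalized record into a parent row and its child rows."""
    cust_id = record[0:6]
    name = record[6:26].rstrip()
    phones = []
    for i in range(3):
        start = 26 + 12 * i
        slot = record[start:start + 12].strip()
        if slot:  # a blank slot produces no child row at all
            phones.append(Phone(cust_id, i + 1, slot))
    return Customer(cust_id, name), phones

# One sample record with the second phone slot left blank.
record = ("000042" + "Ada Lovelace".ljust(20)
          + "555-010-0001" + " " * 12 + "555-010-0003")
customer, phones = normalize(record)
```

A "simple" conversion would carry all three phone slots into one wide row. The sketch instead yields one customer row plus one phone row per populated slot, which is the shape a logical data model would normally call for.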

In other words, if conversion from VSAM to DB2 is pursued without knowledge of the application, the application will not be properly served. Why? Because DB2 uses SQL, it accesses data a set at a time. VSAM, on the other hand, accesses data a record at a time. So there is an impedance mismatch between the application that is already written and the new data storage mechanism – DB2. If you simply convert VSAM calls to DB2 SQL statements, then you most likely will not take advantage of the power of SQL. You will not be joining data. You will not be formulating predicates properly, because VSAM programs retrieve records by key rather than by predicate. You may be reading data that your programs do not require. This will result in diminished application performance – and nobody wants that, do they?
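The contrast between the two styles can be sketched in a few lines of Python. SQLite (via the standard sqlite3 module) stands in for DB2 here purely so the example runs anywhere, and the customer and orders tables are hypothetical. The first function mimics a one-for-one conversion of VSAM calls. The second hands the join and the predicates to the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is a stand-in for DB2 in this sketch
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Acme', 'EAST'), (2, 'Bolt', 'WEST'), (3, 'Core', 'EAST');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 2, 75.0), (12, 3, 500.0), (13, 1, 40.0);
""")

def record_at_a_time(conn):
    """VSAM style: read every order, do a keyed read per order, filter in the program."""
    result = []
    all_orders = conn.execute(
        "SELECT order_id, cust_id, amount FROM orders ORDER BY order_id"
    ).fetchall()
    for order_id, cust_id, amount in all_orders:
        name, region = conn.execute(
            "SELECT name, region FROM customer WHERE cust_id = ?", (cust_id,)
        ).fetchone()
        if region == "EAST" and amount >= 100:
            result.append((order_id, name, amount))
    return result

def set_at_a_time(conn):
    """SQL style: one statement; the join and the predicates run inside the DBMS."""
    return conn.execute("""
        SELECT o.order_id, c.name, o.amount
        FROM orders o
        JOIN customer c ON c.cust_id = o.cust_id
        WHERE c.region = 'EAST' AND o.amount >= 100
        ORDER BY o.order_id
    """).fetchall()

slow = record_at_a_time(conn)
fast = set_at_a_time(conn)
```

Both functions return the same rows. The first issues one statement per order and pulls every order into the program. The second issues a single statement and returns only the qualifying rows.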

Indeed, the biggest problem that VSAM professionals encounter when moving to DB2 is treating the DB2 data as if it were in a flat file. A mentality shift is required: think in sets instead of files and rows instead of records, and put as much work as possible into the SQL predicates so that DB2 can work as efficiently as possible.

Conversely, sometimes the VSAM proponents denigrate DB2 by calling it a pig. Yes, there is additional overhead when using DB2 instead of VSAM. DB2 does more than VSAM, so the overhead is warranted. Does that mean that VSAM outperforms DB2? Absolutely not!

If you understand DB2 and use it appropriately, its performance will be excellent. If you use DB2 like VSAM, its performance will stink. Think about it this way: compare a DB2 SELECT of four columns in a clustering index against the application code needed to access the same data by reading the entire VSAM file. In such a scenario, properly coded DB2 will undoubtedly outperform properly coded VSAM requests.

Flexibility is another important concern. DB2 is flexible and VSAM is not. If you do not believe that, then think about what it would take to add an index to existing data. With DB2, you create the index, rebind the program, and DB2 will take advantage of it without your having to change any application code. With VSAM, you would have to explicitly code requests to use the new index – not very flexible, is it?
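Here is a small sketch of that idea in Python, again with SQLite standing in for DB2. The account table and the index name are hypothetical. SQLite chooses its access path each time a statement is prepared, so there is no rebind step here; with static SQL in DB2, the rebind is what lets the optimizer pick up the new index. The point to notice is that the query text never changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is a stand-in for DB2 in this sketch
conn.execute(
    "CREATE TABLE account (acct_id INTEGER PRIMARY KEY, branch TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [(i, f"B{i % 50:02d}", i * 1.5) for i in range(1000)],
)

# The application's query. It is identical before and after the index exists.
QUERY = "SELECT acct_id, balance FROM account WHERE branch = ?"

def access_path(conn):
    """Return the optimizer's chosen plan as text (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("B07",)).fetchall()
    return " ".join(row[3] for row in rows)

plan_before = access_path(conn)
rows_before = conn.execute(QUERY, ("B07",)).fetchall()

# The only change: a new index. No application code is touched.
conn.execute("CREATE INDEX ix_account_branch ON account (branch)")

plan_after = access_path(conn)
rows_after = conn.execute(QUERY, ("B07",)).fetchall()
```

Before the index exists the plan is a scan of the whole table. Afterward the plan names ix_account_branch, and the query returns the same rows as before.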

And the robustness of the environment is another consideration. Running concurrent updates against the same VSAM file in batch and online is not nearly as efficient as doing the same with DB2. DBMS locking and ACID properties make such situations a clear-cut advantage for DB2.
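One piece of this, atomicity (the "A" in ACID), is easy to sketch in Python. As before, SQLite stands in for DB2 and the account table is hypothetical. The sketch does not try to show locking between batch and online work. It only shows that when the second of two updates fails, the DBMS undoes the first one for you, which is work a program updating a plain file would otherwise have to handle itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is a stand-in for DB2 in this sketch
conn.execute(
    "CREATE TABLE account ("
    " acct_id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move funds as one unit of work: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acct_id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acct_id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too.
        return False

def balances(conn):
    rows = conn.execute("SELECT acct_id, balance FROM account ORDER BY acct_id")
    return dict(rows.fetchall())

ok = transfer(conn, 1, 2, 30)        # succeeds
after_ok = balances(conn)
failed = transfer(conn, 1, 2, 500)   # would overdraw account 1
after_failed = balances(conn)
```

The credit to the destination account is deliberately applied first, so the failed transfer shows a rollback of work that had already been done.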

Hopefully this summary of issues has helped you in your decision process. If you are looking for additional information on VSAM, consider reading the IBM Redbook titled VSAM Demystified (SG24-6105). You might also want to look into IBM's VSAM Transparency product if you are tasked with converting VSAM applications to DB2. And good luck with your VSAM to DB2 conversions…

