IBM fellow on DB2 V8 for zSeries

DB2 Version 8 for the zSeries has been generally available for five months. You haven't upgraded yet? You are not alone. Though the newest version of IBM's venerable database is full of goodies, upgrading is not a trivial task; it requires much planning and thought. Curt Cotner, chief architect of DB2 for z/OS, has worked on the development of DB2 for 16 years. In this interview, Cotner, who was recently named an IBM Fellow, the company's highest technical position, discusses the many new features in DB2 V8, including improved SQL commands and better use of Unicode, and why you might want to accelerate your upgrade plans.

How has the uptake been for DB2 V8 for the zSeries? Are people upgrading at a good clip?
Well, it's about typical for what we see with new releases. For the first three to five months after a new release is announced and generally available, the uptake is somewhat slow. Then it starts to pick up momentum at the five- or six-month point and keeps going from there. This release, as you know, is the biggest version of DB2 we have ever shipped, even bigger than version 1, release 1.

DB2 V8 features schema evolution. What does this allow users to do?
Clearly one of the big selling points is the schema evolution support we have put in. The big appeal there obviously is that schema evolution allows you to make changes to the database layout without an outage, and without taking all your data, unloading it into files and then reloading it into the database. For a lot of customers, that makes the difference between being able to make these kinds of changes and having to do something unnatural, like creating a new column that has the attributes you want and telling users to ignore the old column because it's no longer current.

With schema evolution, they can just issue a simple ALTER statement that says to change the employee ID column from 10 characters to 20 characters. The system is smart enough to recognize that we have both 10-character versions and 20-character versions, and it knows which rows have which type.
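
The idea of keeping both the 10-character and 20-character formats side by side can be sketched as a toy model. This is only an illustration of the concept Cotner describes, not DB2's internal design: the names `FORMATS`, `StoredRow`, `write_row` and `read_row` are hypothetical, and the sketch assumes each stored row simply carries the version number of the format it was written under.

```python
from dataclasses import dataclass

# Hypothetical format catalog: version number -> width of the employee ID column.
FORMATS = {0: 10, 1: 20}
CURRENT_VERSION = 1  # the version in effect after the ALTER


@dataclass
class StoredRow:
    version: int      # format the row was written under
    employee_id: str  # value padded to that format's width


def write_row(value: str, version: int = CURRENT_VERSION) -> StoredRow:
    """Store a value in the given format, padding it to the column width."""
    width = FORMATS[version]
    if len(value) > width:
        raise ValueError(f"value longer than {width} characters")
    return StoredRow(version, value.ljust(width))


def read_row(row: StoredRow) -> str:
    """Present any row in the current format; old rows are not rewritten."""
    return row.employee_id.rstrip().ljust(FORMATS[CURRENT_VERSION])


table = [
    write_row("E000000001", version=0),      # written before the ALTER
    write_row("E" + "0" * 18 + "2"),         # written after the ALTER
]
widths = [len(read_row(r)) for r in table]
```

In this sketch the old row stays stored at 10 characters and is widened only when it is read, which is why no unload and reload step is needed.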

DB2 V8 is all over Unicode. Why?
I think customers will be interested in all the work we have done for Unicode. There are a lot of packaged applications now that are really starting to emphasize Unicode, and this gives us full capability to support Unicode whether you are using Siebel, SAP or PeopleSoft. We've got the features those large vendors need to handle Unicode data.

I also heard DB2 V8 has more support for 64-bit addressing. How will it help users? Will 64 bits of address space last a while?
It lets you make much, much bigger buffer pools. It gives some relief from the virtual storage constraints to customers who were starting to run short in their database address space. The buffer pool space we have [with 64-bit addressing] is literally going to last us for 20 years without any problem.

Can you explain the significance of data-partitioned secondary indexes?
There are really two issues. The first issue is customers have for years used partitioning to handle very large tables. So if you had many, many terabytes of data, the reason you wanted partitioning was to help you with failure isolation such as a disk failure. The problem we had with secondary indexes was that there was one secondary index for all the partitions. So if you lost the secondary index, you lost access to all the data through that particular index.

So one of the things data-partitioned secondary indexes do is allow you to break up your index into small pieces, just like you broke up your data into small pieces. And now if you have a disk failure on some part of the index, you limit the damage to a small subset of the rows. The rest of the rows are fully functional.

The other big advantage to data-partitioned secondary indexes is that a lot of customers use this partitioning scheme to store data by time range. Now, with data-partitioned secondary indexes, we have an extremely efficient way to take the oldest partition, throw out the data and index entries from just that one partition, make it the new partition and reload it with the new results.
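
The rotation described above can be illustrated with a small toy model. It is a sketch under stated assumptions, not DB2's implementation: the class `PartitionedTable`, its methods and the month labels are all hypothetical, and each partition is modeled as a list of rows paired with its own private index piece.

```python
from collections import deque


class PartitionedTable:
    """Toy model: every data partition carries its own index partition."""

    def __init__(self, periods):
        # Each element is (period label, rows, index); index maps key -> row positions.
        self.partitions = deque((p, [], {}) for p in periods)

    def insert(self, period, key, payload):
        for label, rows, index in self.partitions:
            if label == period:
                index.setdefault(key, []).append(len(rows))
                rows.append((key, payload))
                return
        raise KeyError(period)

    def rotate(self, new_period):
        # Discard the oldest partition together with its index piece.
        # No other partition or index piece is touched.
        self.partitions.popleft()
        self.partitions.append((new_period, [], {}))

    def lookup(self, key):
        # A lookup by secondary key probes each index partition in turn.
        return [rows[i][1]
                for _, rows, index in self.partitions
                for i in index.get(key, [])]


t = PartitionedTable(["2004-01", "2004-02", "2004-03"])
t.insert("2004-01", "cust7", "old order")
t.insert("2004-03", "cust7", "recent order")
t.rotate("2004-04")
t.insert("2004-04", "cust7", "new order")
result = t.lookup("cust7")
```

The same structure also shows the failure-isolation point: losing one index piece affects only the rows of its own partition. The cost visible in `lookup` is that a search on the secondary key may have to visit every index piece.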

I understand the new version allows users to do non-padded indexes. What's the difference between padded and non-padded indexing?
By default in the past, if you had variable-length keys, we always padded them out to the maximum length, so basically all the index entries were exactly the same length and each of the columns in the index was fixed length. It just makes scanning the indexes more efficient.

Now, in V8, you have the option to encode the indexes so they really are variable length and are not padded. It allows for shorter indexes, so you can get more index entries on each page. But it also allows us now to do index-only access. If you store the data as not padded, we know exactly how long each of the variable-length keys is, and we can actually do field-level retrieval directly from the index without having to look up the data in the data page to figure out how long things are.

What are the disadvantages to having non-padded indexes?
There is a little bit more CPU processing that takes place to examine each of the entries in the index, but a lot of times that can be offset by getting more keys on a single page. So when you read a page off the disk into the buffer pool, we have more index entries on one page. So it's a tradeoff. It depends on how much space you save by going to variable length.

Let's say you have a key that is 28 bytes but nine times out of 10, you use 26 out of the 28 bytes. In that case, it's probably cheaper to keep it padded. But on the other hand, if you have a bunch of them that are only four or five bytes long instead of the full length of 28, then going to a non-padded solution would be much better.
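
The tradeoff in this example can be worked through with some rough arithmetic. The figures below are assumptions made purely for illustration, not DB2's actual page layout: lengths are counted in bytes, a page holds 4,096 usable bytes, every entry carries 5 bytes of overhead, and a non-padded key needs 2 extra bytes to record its length.

```python
import math

PAGE_BYTES = 4096    # assumed usable bytes per index page
ENTRY_OVERHEAD = 5   # assumed per-entry overhead, padded or not
LENGTH_PREFIX = 2    # assumed extra bytes a non-padded key needs for its length
MAX_KEY = 28         # maximum key length from the example


def entries_per_page(avg_key_len: float, padded: bool) -> int:
    """Estimate how many index entries fit on one page."""
    key_bytes = MAX_KEY if padded else avg_key_len + LENGTH_PREFIX
    return math.floor(PAGE_BYTES / (key_bytes + ENTRY_OVERHEAD))


# Nine keys in ten use 26 of the 28 bytes, the rest use all 28.
mostly_full_avg = 0.9 * 26 + 0.1 * 28

padded = entries_per_page(MAX_KEY, padded=True)
mostly_full = entries_per_page(mostly_full_avg, padded=False)
short_keys = entries_per_page(5, padded=False)
```

Under these assumptions the padded index fits 124 entries per page, the non-padded index with mostly full keys fits 123, and the non-padded index with 5-byte keys fits 341. The sketch leaves out the extra CPU cost of examining variable-length entries, which pushes the nearly full case further toward padding.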

You've made a number of SQL improvements in V8. Can you explain?
SQL is probably where we spent at least half our resources for the release. So even though it was a massive release, half of it went to improving SQL. After that we tried to pick a set of SQL improvements that would really resonate with customers who are reengineering their applications to be Web-based. So, for example, the multi-row INSERT. Most people who do things with Web-based applications are loading the data from Unix or Windows clients most of the time. Instead of using a load utility, they are using INSERTs to load the data into the database. This new multi-row INSERT syntax that we provided in V8 is particularly well-suited to network transmission of large amounts of data.
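
The general idea of sending many rows in one statement can be sketched with Python's built-in `sqlite3` module. SQLite is only a stand-in here: the DB2 V8 syntax is not shown in this interview and is not reproduced below, and the `orders` table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

rows = [(1, "widget"), (2, "gadget"), (3, "gizmo")]

# Build one INSERT statement that carries every row,
# instead of issuing one statement per row.
placeholders = ", ".join(["(?, ?)"] * len(rows))
params = [value for row in rows for value in row]
conn.execute(f"INSERT INTO orders (id, item) VALUES {placeholders}", params)

count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
items = [r[0] for r in conn.execute("SELECT item FROM orders ORDER BY id")]
```

With an in-memory database the saving is negligible. Against a remote server, each statement typically costs a network round trip, so batching rows into one statement is where the benefit Cotner describes would come from.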

We also tried to enhance the SQL so that you can be more flexible with how you write your applications. For example, the new SELECT from INSERT syntax. That one is particularly useful because in Web-based [applications] a lot of people are using things like triggers to modify the data rows before they get written to the disk.

Are there any plans to retrofit any of these new features to older versions of DB2? In other words, is upgrading the only way to get such functionality?
In general, the answer is always no. We don't want to spend our time upgrading previous versions of DB2 when we have V8 generally available and rock solid. There is really nothing to prevent customers from moving to V8. We have on occasion, if there was a critical business need for our customers, examined retrofitting some small portion of V8 because it was not possible for a customer to upgrade to V8 in the time frame it needed.
