The following is an excerpt from Understanding DB2: Learning visually with examples, 2nd edition, by Raul Chong, Xiaomei Wang, Michael Dang and Dwaine Snow. It is reprinted here with permission from International Business Machines Corporation; Copyright 2008. Read the book excerpt below or download a free .pdf of the chapter: "Understanding IBM DB2: Product history and strategy."
Database 2 (DB2) for Linux, UNIX, and Windows is a data server developed by IBM. Version 9.5, available since October 2007, is the most current version of the product, and the one on which we focus in this book.
In this chapter you will learn about the following:
- The history of DB2
- The information management portfolio of products
- How DB2 is developed
- DB2 server editions and clients
- How DB2 is packaged for developers
- Syntax diagram conventions
1.1 Brief history of DB2
Since the 1970s, when IBM Research invented the Relational Model and the Structured Query Language (SQL), IBM has developed a complete family of data servers. Development started on mainframe platforms such as Virtual Machine (VM), Virtual Storage Extended (VSE), and Multiple Virtual Storage (MVS). In 1983, DB2 for MVS Version 1 was born. "DB2" was used to indicate a shift from hierarchical databases, such as the Information Management System (IMS) popular at the time, to the new relational databases. DB2 development continued on mainframe platforms as well as on distributed platforms. Figure 1.1 shows some of the highlights of DB2 history.
Figure 1.1 DB2 timeline
In 1996, IBM announced DB2 Universal Database (UDB) Version 5 for distributed platforms. With this version, DB2 was able to store all kinds of electronic data, including traditional relational data, as well as audio, video, and text documents. It was the first version optimized for the Web, and it supported a range of distributed platforms—for example, OS/2, Windows, AIX, HP-UX, and Solaris—from multiple vendors. Moreover, this universal database was able to run on a variety of hardware, from uniprocessor systems and symmetric multiprocessor (SMP) systems to massively parallel processing (MPP) systems and clusters of SMP systems.
Even though the relational model is the most prevalent way to store data in the industry today, the hierarchical model never lost its importance. In the past few years, due to the popularity of eXtensible Markup Language (XML), a resurgence in the use of the hierarchical model has taken place. XML, a flexible, self-describing language, relies on the hierarchical model to store data. With the emergence of new Web technologies, the need to store unstructured types of data, and the need to share and exchange information between businesses, XML has proved to be the best language to meet these needs. Today we see exponential growth in the use of XML documents.
IBM recognized early on the importance of XML, and large investments were made to deliver pureXML technology, which provides better support for storing XML documents in DB2. After five years of development, the effort of 750 developers, architects, and engineers paid off with the release of the first hybrid data server in the market: DB2 9. DB2 9, available since July 2006, is a hybrid (also known as multi-structured) data server because it allows for storing relational data, as well as hierarchical data, natively. While other data servers in the market, and previous versions of DB2, could store XML documents, the storage method used was not ideal for performance and flexibility. With DB2 9's pureXML technology, XML documents are stored internally in a parsed hierarchical manner, as a tree; therefore, working with XML documents is greatly enhanced. In 2007, IBM went even further in its support for pureXML with the release of DB2 9.5. DB2 9.5, the latest version of DB2, not only enhances pureXML and introduces new pureXML features, but also brings improvements in installation, manageability, administration, scalability and performance, workload management and monitoring, regulatory compliance, problem determination, support for application development, and support for business partner applications.
DB2 is available for many platforms including System z (DB2 for z/OS) and System i (DB2 for i5/OS). Unless otherwise noted, when we use the term DB2, we are referring to DB2 version 9.5 running on Linux, UNIX, or Windows.
DB2 is part of the IBM information management (IM) portfolio. Table 1.1 shows the different IM products available.
Table 1.1 Information Management Products
|Information management products||Description||Product offerings|
|Data servers||Provide software services for the secure and efficient management of data and enable the sharing of information across multiple platforms.||IBM DB2|
|Data warehousing and business intelligence||Help customers collect, prepare, manage, analyze, and extract valuable information from all data types to help them make faster, more insightful business decisions.||DB2 Alphablox; DB2 Cube Views; DB2 Warehouse Edition; DB2 Query Management Facility|
|Enterprise content management & discovery||Manage content, process, and connectivity. The content includes both structured and unstructured data, such as e-mails, electronic forms, images, digital media, word processing documents, and Web content. Perform enterprise search and discovery of information.||DB2 Content Manager; DB2 Common Store; DB2 CM OnDemand; DB2 Records Manager; FileNet P8 and its add-on suites|
|Information integration||Bring together distributed information from heterogeneous environments. Companies view their information as if it were all residing in one place.||IBM Information Server integration software platform, consisting of: WebSphere Federation Server; WebSphere Replication Server; WebSphere DataStage; WebSphere ProfileStage; WebSphere QualityStage; WebSphere Information; WebSphere Metadata Server; WebSphere Business Glossary; WebSphere Data Event Publisher|
1.2 The role of DB2 in the information on demand world
IBM's direction or strategy is based on some key concepts and technologies:
- Information On Demand (IOD)
- Service-Oriented Architecture (SOA)
In this section we describe each of these concepts, and we explain where DB2 fits in the strategy.
1.2.1 On-Demand business
We live in a complex world with complex computer systems where change is a constant. At the same time, customers are becoming more demanding and less tolerant of mistakes. In a challenging environment like this, businesses need to react quickly to market changes; otherwise, they will be left behind by competitors. In order to react quickly, a business needs to be integrated and flexible. In other words, a business today needs to be an on-demand business.
An on-demand business, as defined by IBM, is "an enterprise whose business processes -- integrated end to end across the company and with key partners, suppliers and customers -- can respond with speed to any customer demand, market opportunity, or external threat."
IBM's on-demand business model is based on this definition. To support the on-demand model, IBM uses the e-business framework shown in Figure 1.2.
Figure 1.2 The IBM e-business framework
In Figure 1.2 the dotted line separates the logical concepts at the top from the physical implementation at the bottom. Conceptually, the IBM e-business framework is based on the on-demand business model operating environment, which has four essential characteristics: It is integrated, open, virtualized, and autonomic. These characteristics are explained later in this section. The area below the dotted line illustrates how this environment is implemented by the suite of IBM software products.
- Rational is the "build" software portfolio; it is used to develop software.
- Information Management (where DB2 belongs) and WebSphere are the "run" software portfolios; they store and manipulate your data and manage your applications.
- Tivoli is the "manage" software portfolio; it integrates, provides security, and manages your overall systems.
- Lotus is the "collaborate" software portfolio used for integration, messaging, and collaboration across all the other software portfolios.
The IBM DB2 software plays a critical role in the on-demand operating environment. All elements of the Information Management portfolio, including DB2, are developed with the four essential characteristics of the on-demand business model in mind.
- Integrated: DB2 software has built-in support for both Microsoft and Java development environments. It is also integrated into WebSphere, Tivoli, Lotus, and Rational products. In addition, the DB2 family has cross-platform capabilities and can be integrated natively with Web services and message-queuing technologies. It also provides support for heterogeneous data sources for both structured and unstructured information, including pureXML support.
- Open: DB2 software allows for different technologies to connect and integrate by following standards. Thus, it provides strong support for the Linux operating system and for Java, XML, Web services, grid computing, and other major industry applications.
- Virtualized: Grid computing technology, a type of distributed computing, collects and shares resources in a large network to simulate one large, virtual computer. DB2 software products support grid computing technology through federation and integration technologies. Both of these are discussed in more detail later in this chapter.
- Autonomic: An autonomic computing system manages, repairs, and protects itself. As systems become more complex, autonomic computing systems will become essential. DB2 provides self-tuning capabilities, dynamic adjustment and tuning, simple and silent installation processes, and integration with Tivoli for system security and management.
The bottom of Figure 1.2 shows the operating systems in which the IBM software suite can operate: Linux, UNIX, Windows, i5/OS, and z/OS. Below that are the servers, storage, and network.
An on-demand business depends on having information available on demand, whenever it is needed, by people, tools, or applications. Information On Demand is discussed in the next section.
1.2.2 Information On Demand
Information On Demand, as its name implies, is making information available whenever people, tools, or applications demand or request it. This can be made possible by providing information as a service. IBM commonly uses the illustration in Figure 1.3 to explain what "information as a service" means. Let's use the following example to explain this concept in a more interesting way. Assume you are the general manager of a supermarket, and your main goal is to make this business profitable. To accomplish this, you must make good decisions, such as how to display items on shelves so that they sell more. In order to make good decisions, you need to have up-to-date, reliable information.
Figure 1.3 Information as a service
As depicted at the bottom of Figure 1.3, many businesses today have a large number of heterogeneous sources of information. For this particular example let's assume your suppliers use SAP and DB2, your sales department uses an internally developed application, your smaller supermarket clients use PeopleSoft and Oracle, and so on. Thus, you see several heterogeneous applications with semi-raw data, which will only be valuable to you if you can integrate them all. In order to integrate the data, it needs to be provided as a service, and this is possible through the use of standards such as JDBC and ODBC, and by wrapping each of these applications as a Web service. Once the data are integrated, you may come up with decisions that might not have seemed logical otherwise, such as putting beer and diapers in the same aisle in order to sell more of both products.
With the data integrated you can further massage it to perform some additional analysis and get insightful relationships. This further massaging of the data can be performed by other software, such as entity analytics, master data, and so on as shown on the right side of the figure. Finally, this integrated data can be passed to other processes, tools and applications, and people for further analysis.
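The idea of exposing heterogeneous sources through a uniform interface and then combining them can be sketched in Python. This is a minimal illustration, not how any IBM product works: the record layouts, the sample numbers, and the function names `from_supplier`, `from_sales`, and `integrate` are all assumptions made up for this example.

```python
# Hypothetical raw records from two heterogeneous sources.
# Supplier system: tuples of (sku, product name, units in stock).
supplier_rows = [("BEER-01", "Beer", 120), ("DIAP-07", "Diapers", 80)]
# In-house sales application: dictionaries with its own field names.
sales_rows = [{"product": "Beer", "sold": 95}, {"product": "Diapers", "sold": 70}]


def from_supplier(rows):
    """Expose the supplier tuples as uniform dictionaries."""
    return [{"product": name, "in_stock": units} for _sku, name, units in rows]


def from_sales(rows):
    """Expose the sales records as uniform dictionaries."""
    return [{"product": row["product"], "sold": row["sold"]} for row in rows]


def integrate(*services):
    """Merge the records from every service into one view keyed by product."""
    view = {}
    for records in services:
        for record in records:
            view.setdefault(record["product"], {}).update(record)
    return view


integrated = integrate(from_supplier(supplier_rows), from_sales(sales_rows))
for product, facts in integrated.items():
    print(product, facts)
```

Each source keeps its own internal format; only the small adapter functions know about it. The integrated view is what later analysis steps would work from.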
1.2.3 Service-Oriented Architecture
Service-Oriented Architecture (SOA), as its name implies, is an architecture based on services -- mainly Web services. SOA is not a product, but a methodology, a way to design systems that allow for integration, flexibility, loosely coupled components, and greater code reuse. With this architecture, business activities are treated as services that can be accessed on demand through the network.
Figure 1.4, which is also used in many IBM presentations, depicts the SOA lifecycle. It consists of four iterative steps or stages—Model, Assemble, Deploy, Manage—and a fifth step that provides guidance throughout the cycle: Governance & Processes.
Figure 1.4 The SOA Lifecycle
A more detailed explanation of each stage in the SOA lifecycle is provided in Table 1.2.
|SOA Stage||Description||IBM Tools That Can Be Used|
|Assemble||This stage is about building new services and/or reusing existing ones, and assembling them to form composite applications.||WebSphere Integration Developer; Rational Application Developer|
|Deploy||In this stage your services and applications are deployed into a secure environment that integrates people, processes, and information within your business.||WebSphere Process Server; WebSphere Message Broker; WebSphere Partner Gateway; WebSphere Everyplace Deployment; Workplace Collaboration Services; WebSphere Information Integrator; WebSphere Application Server|
|Manage||In this stage, you need to manage and monitor your system, find and correct inefficiencies and problems, deal with security, quality of service, and general system administration.||DB2 Content Manager; WebSphere Business Monitor; Tivoli Composite Application Manager for SOA; Tivoli Identity Manager|
|Governance||Governance underpins all the lifecycle stages. It ensures that all the services from inside and outside the organization are controlled so the system does not spin out of control. Governance provides both direction and control.|
1.2.4 Web Services
A Web service, as its name implies, is a service made available through the Web. A more formal, but still simple, definition states that a Web service is a way for an application to call a function over the network; however, the caller does not need to know:
- The location where this function will be executed
- The platform on which the function will run (for example Linux, UNIX, Windows, the mainframe, Mac OS X, etc.)
- The programming language in which the function was created (for example Java, COBOL, C, etc.)
Web services are powerful because they allow businesses to exchange information with minimal or no human intervention. Let's go back to the supermarket example to see the power of Web services in a more realistic scenario:
Let's say you order 100,000 cookies from a supplier, expecting all of them to be sold in one month. After the month passes only 60,000 are sold, so you are left with 40,000. Because these are cookies of a special kind, they will spoil in two weeks. You need to act fast and sell them to other smaller supermarkets or Internet companies such as Amazon.com or eBay. You can grab the phone and spend an entire morning calling each of the smaller supermarket clients, offering them as many cookies as they would want to buy from you; or you could take a more "technical" approach and develop a simple application that would do this for you automatically. Assuming each of these smaller supermarket clients provides Web services, you could develop an application (in any programming language) that uses an SQL INSERT statement to add overstocked items, such as the 40,000 cookies, to a DB2 database table named overstock. You could then define a trigger on this table that invokes a DB2 stored procedure (more about triggers and stored procedures in Chapter 7, Working with Database Objects), which in turn could consume the Web services provided by the Internet companies or the smaller supermarket clients. This scenario is depicted in Figure 1.5.
Figure 1.5 Using a Web service
As you can see from Figure 1.5, the simple act of inserting 40,000 cookies through your application into the table overstock in the DB2 server allows the systems of many smaller supermarkets and Internet companies, through the use of their Web services, to make the cookies available on their systems quickly, opening new sales channels. In Figure 1.5, DB2 is behaving as a Web service consumer, because it is using or "consuming" the Web services, while the smaller supermarket clients and Internet companies are behaving as the Web service providers, because they are making these Web services available for others to use. For simplicity, Figure 1.5 omits the call to the stored procedure. This scenario shows the power of Web services: business-to-business exchange of information using applications. There is no need for human intervention. DB2 and Web services will be discussed in more detail in Chapter 10, Mastering the DB2 pureXML Support.
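The flow of the overstock scenario (an insert fires a trigger, and the trigger calls a routine that contacts the partners) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite, through Python's standard `sqlite3` module, stands in for DB2; an application-defined function named `offer_to_partners` stands in for the stored procedure; the partner host names are made up; and no real Web service is called.

```python
import sqlite3

# Hypothetical partner endpoints standing in for the Web service providers.
PARTNERS = [
    "small-market-a.example",
    "small-market-b.example",
    "online-retailer.example",
]

sent_offers = []  # records every simulated Web service call


def offer_to_partners(item, quantity):
    """Stand-in for the stored procedure: 'consume' each partner's service.

    No network traffic happens here; the call is only recorded.
    """
    for partner in PARTNERS:
        sent_offers.append((partner, item, quantity))
    return len(PARTNERS)


conn = sqlite3.connect(":memory:")
# Make the Python function callable from SQL (assumption: SQLite, not DB2).
conn.create_function("offer_to_partners", 2, offer_to_partners)

conn.execute("CREATE TABLE overstock (item TEXT, quantity INTEGER)")
conn.execute(
    """
    CREATE TRIGGER overstock_inserted AFTER INSERT ON overstock
    BEGIN
        SELECT offer_to_partners(NEW.item, NEW.quantity);
    END
    """
)

# The application's only job: insert the overstocked item.
conn.execute("INSERT INTO overstock VALUES (?, ?)", ("cookies", 40000))
conn.commit()

for offer in sent_offers:
    print(offer)
```

The application itself knows nothing about the partners. It only inserts a row, and the database-side trigger takes care of notifying every provider.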
1.2.5 XML
XML stands for eXtensible Markup Language. XML's popularity and use have grown exponentially in the past few years, as it is a core component of many new technologies. The easiest way to understand how XML works is by comparing it to HTML, given that many people today are familiar with HTML. Let's take a look at the following line in an HTML document:
<b>Raul</b>
In the above line, the tag <b> indicates the way you would like to display the text, in this case, Raul in bold. Now let's take a look at the following line in an XML document:
<name>Raul</name>
In the above line, the tag <name> describes the text Raul. The tag is saying that Raul is in fact a name. See the difference? In HTML, tags are used to indicate how you would like to display the data; in XML, tags are used to actually describe the data. Table 1.3 describes the characteristics of XML.
Table 1.3 Characteristics of XML
|Flexible||XML is a flexible language because it is easy to modify or adapt. XML is based on a hierarchical model, which is most appropriate to store unstructured types of information such as financial information, life sciences information (for example Genome, DNA), and so on.|
|Easy to extend||XML is easy to extend; that is, you can create your own tags. For example, in addition to the
|Describes itself||XML can describe itself; another document called an XML Schema (which itself is an XML document) is used to provide rules and descriptions as to what each of the tags in a document mean and restrict the type of data the tags can contain. An older method, but still widely used today, is to use DTD documents. In the above example, an XML Schema or DTD document can indicate that the tag
|Can be transformed to other formats||XML can be transformed to other formats like HTML, using Extensible Stylesheet Language Transformations (XSLT), a language used for the transformation of XML documents.|
|Independent of the platform or vendor||XML is independent of the platform or vendor; after all, XML documents can be stored in text files containing tags. Text documents are supported everywhere.|
|Easy to share||XML is easy to share with other applications, businesses, and processes given that it can be stored as a text document. Because it is easy to share, it's appropriate as the core of Web services.|
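To make the "tags describe the data" idea concrete, the following Python sketch parses a small XML document with the standard `xml.etree.ElementTree` module and reads the data back by tag name. The `<customer>` and `<email>` tags and the sample values are invented for this illustration; they simply show that you can define your own tags and that the document forms a small tree.

```python
import xml.etree.ElementTree as ET

# A made-up document: <customer> is the root, with two child elements.
document = """
<customer>
    <name>Raul</name>
    <email>raul@example.com</email>
</customer>
"""

root = ET.fromstring(document)

# Each child element carries both a description (its tag) and a value (its text).
fields = {child.tag: child.text for child in root}

print(root.tag)
print(fields)
```

Because the tags name the data, a program can ask for the `name` or the `email` directly instead of guessing from position or formatting.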
XML is also at the core of Web 2.0 development technologies. Web 2.0, as defined in Wikipedia.org, "refers to a perceived second generation of web-based communities and hosted services -- such as social-networking sites, wikis, and folksonomies -- which facilitate collaboration and sharing between users". Wikis, blogs, mash-ups, RSS or Atom feeds, and so on, which are part of Web 2.0 development technologies, are all based on or related to XML. This makes DB2 9.5 the ideal data server platform for Web 2.0 development. Table 1.4 describes the different technologies that are part of Web 2.0. (To see Table 1.4 and to read more, download the free .pdf of this chapter.)
XML is discussed in more detail in Chapter 10, Mastering the DB2 pureXML Support.