The following is a book excerpt from DB2 9 for Linux, UNIX, and Windows: DBA Guide, Reference, and Exam Prep, 6th Edition, by George Baklarz and Paul Zikopoulos. It is reprinted here with permission from International Business Machines Corporation; Copyright 2008. Read the chapter excerpt below to learn about the basics of DB2 tools and products or download a free .pdf of this chapter: "DB2 tools and products for Linux, UNIX and Windows: The basics."
In this chapter you will be introduced to the DB2 family of products that run on the Linux, UNIX, and Windows operating systems. This version of DB2 is often referred to as the distributed version to differentiate it from the DB2 for z/OS® version that runs on an IBM mainframe.
DB2 has the ability to store all kinds of electronic information. This includes traditional relational data, data encoded within XML, as well as structured and unstructured binary information, documents and text in many languages, graphics, images, multimedia (audio and video), information specific to operations like engineering drawings, maps, insurance claims forms, numerical control streams, or any type of electronic information. This chapter illustrates some of the ways to access data in a DB2 database using some of the interfaces provided within the DB2 family. A description of each of the DB2 products is provided to illustrate some of DB2's features and functions.
Information as a Service
The DB2 Data Server is an important part of IBM's Information as a Service software portfolio that serves as the atomic level for the broader IBM On Demand architecture.
Figure 1-1 IBM Service Framework for an On Demand business
In Figure 1-1 you can see that the IBM software portfolio has evolved into a collection of high-value services provided by various IBM software offerings. The backbone fabric of this IBM reference architecture is the Enterprise Service Bus (ESB), which is used to facilitate communications across this rich set of services.
IT Service Management is mostly provided by various Tivoli® products. The Tivoli portfolio is built around four key disciplines or pillars:
- Security Management
- Storage Management
- Performance and Availability
- Configuration and Operations
Services from these pillars can be used to collectively manage your entire IT framework. For example, Tivoli Storage Resource Manager services can be used enterprise-wide to monitor and report on heterogeneous storage resources to increase storage utilization, identify and resolve potential problems, and ensure application availability through policy-based automation.
Development Services are the culmination of various Rational-based products that are built on the open source Eclipse platform. For example, Rational® ClearCase® provides source control services, and Rational Application Developer empowers application developers with a rich set of services that can be used to develop applications, Web pages, and extended custom services for implementation in a Service-Oriented Architecture (SOA) or loosely coupled application framework.
Services that enable interaction are typically part of the Lotus® suite of products that enhance collaboration and idea sharing across the enterprise and beyond. Products like Lotus Sametime® Connect can be used for messaging and more.
A number of services in the framework illustrated in Figure 1-1 are provided by the WebSphere® portfolio. For example, a product like WebSphere Integration Developer helps you define business process flows in the standard Business Process Execution Language (BPEL), which are used to implement process services that in turn help you define, orchestrate, and automate business policies. The Enterprise Service Bus (ESB) is provided by the WebSphere ESB product that provides your enterprise services with transformation, transport switching, and routing remediation among other services. Perhaps the most famous product of the WebSphere brand is the WebSphere Application Server that provides a runtime framework for J2EE®-based operations that are part of the Infrastructure Services component.
Finally, there are the Information Services, which represent the superset of the capabilities you'll learn about in this book. The specific set of services typically found in this part of the IBM reference architecture is shown in Figure 1-2.
Figure 1-2 IBM Information Services defined
The services shown to the right in Figure 1-2 are hierarchical in nature. In other words, as you work from bottom to top, the services provided become richer and more business oriented.
For example, Master Data Management services are provided by the WebSphere Product Center and WebSphere Customer Center products. Master data are facts that describe your core business entities: customers, suppliers, partners, products, bills of materials, employees, and so on. The discipline of Master Data Management seeks to decouple master information from individual applications spread across the enterprise and create a central, application-independent resource. The end result is a simplification of ongoing integration tasks and new application development. This discipline addresses key issues such as data quality and consistency proactively rather than "after the fact"; for example, in a data warehouse (a lower service in this taxonomy). There is also a set of entity resolution services that fit within the Master Data Management service tier.
Business Intelligence services are provided by the DB2 Data Warehouse editions that you'll learn more about later in this chapter. Content Manager services are provided by the set of Content Management products and are used for document management, archiving, and regulatory retention. They are also a basis by which unstructured information (such as faxes, video, voicemail, and so on) can be searched and folded into the information asset.
Information Integration services seek to provide enterprises with ways to share, place, publish, cleanse, and enrich data found in the lower-level data management services. WebSphere Federation Server and its parent WebSphere Information Server are two such products that help implement these services.
Finally, the Data Management services tier is the foundation upon which the other services are built. IBM has a number of data servers that fit into this tier, including DB2, Informix®, IBM Cloudscape™, U2, and IMS™.
This book is about DB2 in this service tier. Specifically, you'll learn how DB2 can provide any number of the high-value data-centric services shown in Figure 1-3.
Figure 1-3 The data services provided by DB2, the focus of this book
For more information on the entire IBM software portfolio and how its offerings map to the services shown in Figures 1-1 and 1-2, refer to the IBM Web site at www.ibm.com.
The DB2 family of data servers executes on Windows, Linux (which can be run on the entire spectrum of IBM's hardware: System i™, System z™, System x™, and System p™), Solaris™ (both SPARC®-based and Intel®/AMD™-based installations), HP-UX™ (both PA-RISC™-based and Itanium-based installations), i5/OS®, VSE/VM, z/OS, and on pervasive platforms (like Windows Mobile Edition, BlueCat® Linux, Symbian®, Palm OS®, J2ME® platforms like the RIM® BlackBerry®, and more).
The DB2 code base is optimized for each platform to ensure maximum performance and integration. DB2 for Linux, UNIX, and Windows shares about a 98 percent common code base with platform-specific optimizations at the operating system interface (OSI) level (Figure 1-4).
This means that once you've learned how to administer a DB2 for AIX system, for the most part you'll know how to manage DB2 for Linux or DB2 for Windows; this is the reason why there is a single DB2 certification for all the distributed platforms.
Figure 1-4 The DB2 code for Linux, UNIX, and Windows is virtually the same
DB2 for i5/OS and DB2 for z/OS are optimized for their respective environments. For example, DB2 for z/OS is based on a shared-everything disk architecture where the hardware-assisted Coupling Facility is used to serialize access to the shared disk. No such hardware exists for Linux, UNIX, and Windows, and therefore DB2 on these platforms uses a shared-nothing architecture. For this reason, administration tends to vary between these platforms (though many concepts and features are similar). However, the SQL API is 98% common to all the platforms where DB2 runs, allowing applications written on one platform to be easily ported to another. This means that you can build an application on DB2 for Windows and port it effortlessly to DB2 for z/OS. If you build your application according to the SQL Reference for Cross-Platform Development handbook, your application will be 100% portable across the DB2 family.
There are other synergies among DB2 running on Linux, UNIX, and Windows, as well as the other DB2 family members. For example, the JDBC driver used for DB2 for z/OS is exactly the same code as is used for DB2 for Linux, UNIX, and Windows. So while there may be variations in specific data definition language (DDL)-based tasks, the data manipulation language (DML) and client APIs are similar.
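As a rough illustration of what this portability means for application code, the sketch below keeps all SQL inside functions that accept a Python DB-API 2.0 connection and use only basic DDL and DML. The `orders` table, the function names, and the use of Python's built-in `sqlite3` module as a stand-in data server are assumptions made for this example; none of them come from the book, and SQLite is not a member of the DB2 family. Under those assumptions, only `open_connection` would need to change when the application moves to a different data server.

```python
import sqlite3


def open_connection():
    # Stand-in only: an in-memory SQLite database plays the role of the
    # data server. With a DB2 driver, this is the one function that would
    # be replaced; the functions below would ideally stay the same.
    return sqlite3.connect(":memory:")


def create_schema(conn):
    """Create a small table using only widely supported SQL types."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE orders ("
        " id INTEGER NOT NULL PRIMARY KEY,"
        " customer VARCHAR(40) NOT NULL,"
        " amount DECIMAL(9,2) NOT NULL)"
    )
    conn.commit()


def add_order(conn, order_id, customer, amount):
    """Insert one row. The '?' markers are the DB-API 'qmark' style,
    which sqlite3 uses; other drivers may use a different style."""
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO orders (id, customer, amount) VALUES (?, ?, ?)",
        (order_id, customer, amount),
    )
    conn.commit()


def total_for(conn, customer):
    """Return the sum of order amounts for a customer, or 0 if none."""
    cur = conn.cursor()
    cur.execute("SELECT SUM(amount) FROM orders WHERE customer = ?", (customer,))
    total = cur.fetchone()[0]
    return total if total is not None else 0


if __name__ == "__main__":
    conn = open_connection()
    create_schema(conn)
    add_order(conn, 1, "ACME", 120.50)
    add_order(conn, 2, "ACME", 79.50)
    print(total_for(conn, "ACME"))
    conn.close()
```

Keeping the connection logic in one place is a design choice for this sketch, not something the DB2 family requires. It simply makes the platform-specific part of the program easy to find.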
The DB2 for Linux, UNIX, and Windows Data Server
In the distributed environment, DB2 is available in a number of different packaging options, called editions. Furthermore, DB2 is also available as part of other packages that contain additional features and tooling rather than just the base data services provided by DB2.
The mainstream DB2 editions are shown in Figure 1-5:
Figure 1-5 The distributed DB2 family
For the most part, each edition builds on its child in this hierarchy. For example, if a feature or functionality is available in DB2 Workgroup Edition, it's likely that it's also a part of a higher-level edition, like DB2 Enterprise Edition.
DB2 Everyplace Edition
DB2 Everyplace (DB2e) is a tiny "fingerprint" database that's about 350K in size. It is designed for low-cost, low-power, small form-factor devices such as personal digital assistants (PDAs), handheld personal computers (HPCs), and embedded devices. DB2e runs on a wide variety of handheld devices, with support for Palm OS 5.x, Windows Mobile 2003 for Pocket PC, Windows Mobile 2005 for Pocket PC, Windows CE.NET, traditional Windows desktop platforms, Symbian OS Version 7/7s, QNX® Neutrino® 6.2, Linux distributions running with the 2.4 or 2.6 kernel, embedded Linux distributions (like BlueCat) running with the 2.4 or 2.6 kernel, and more.
The SQL API used to develop DB2e applications is a subset of that used for building full-fledged DB2 data server applications. This means that enterprise applications, for the most part, can be easily extended to include mobile devices. More importantly, it means that if you have DB2 skills, you have DB2e skills. In addition, DB2e is extremely flexible for developers, with support for Open Database Connectivity (ODBC), Java Database Connectivity (JDBC), .NET (including the ADO.NET 2.0 API), and the DB2 Call Level Interface (CLI) APIs.
DB2e is a very simple-to-use data server that requires virtually no maintenance. Typical database administrator (DBA) operations like reorganizations and statistics collection are all performed automatically. Another nice thing about developing DB2e applications is that the database engine is platform independent, so it provides flexibility: You can seamlessly move DB2e databases between devices. For example, you could move a DB2e database populated on a Pocket PC device to a Symbian smartphone, or whatever other supported device you have, without the need to do anything. This feature, coupled with the rich support for application development, enables developers to quickly build, deploy, and support mobile applications on all platforms.
DB2e is available in two editions: DB2 Everyplace Database Edition (DB2e DE) and DB2 Everyplace Enterprise Edition (DB2e EE). The database component of DB2e DE is the same as DB2e EE; however, DB2e DE has no synchronization middleware to extend or synchronize data to back-end enterprise data servers (although it does come with command line-based import and export utilities). DB2e DE is primarily used for applications that require an embedded database or a local relational storage facility that is exposed to end users through some sort of application (they never really see the database) yet have stringent footprint requirements because of the device.
DB2e EE distinguishes itself from DB2e DE in that it comes with a data synchronization component called the DB2e Synchronization Server (DB2e Sync Server). The DB2e Sync Server allows you to manage subscriptions and security controls for data that is distributed wirelessly to your hand-held devices and manage data changes on the client devices back into the data center. The DB2e Sync Server also comes with facilities for conflict resolution, application deployment, device identification controls, management policies, and more.
The DB2e Sync Server can synchronize DB2e and Apache Derby/IBM Cloudscape data servers with JDBC-compliant back-end data servers (for example, DB2, Oracle, Informix, SQL Server™, and so on). In addition, there is a special DB2 family synchronization adapter that uses the Data Propagator™ (DPROPR) SQL-based replication technology (which is included in the distributed version of DB2).
The number of concurrent synchronizations that the DB2e Sync Server can support depends on the hardware configuration of that server, the associated workload, and data change rates. If you need to scale to handle very large numbers of concurrent synchronizations, you can install any Java application server (like IBM WebSphere Application Server). DB2e also supports enhanced scalability and high availability through DB2e Sync Server farm configurations, which allow you to cluster a number of DB2e Sync Servers to provide load balancing and high-availability services.
Figure 1-6 A DB2e Enterprise Edition environment
In Figure 1-6 you can see the flow of data in a DB2e EE environment. For example, data is pulled from a database in Tier 3 (the far right of the figure) and placed on a mobile device in Tier 1 (the far left). Tier 1 is typically composed of occasionally connected clients that operate on data and then use the services provided by Tier 2 (the middle of the figure where the DB2e Sync Server resides) to push those changes back to Tier 3. Tier 2 handles issues like conflict remediation and subscription management to ensure that the data quality is maintained throughout its lifecycle until it's at rest.
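The three-tier flow described above can be sketched as a toy model in Python. The class names (`BackendDatabase`, `SyncServer`, `MobileClient`), the per-row version counter, and the policy of rejecting a change made against stale data are all assumptions made for illustration. They are not taken from the book and are not how the DB2e Sync Server is actually implemented.

```python
class BackendDatabase:
    """Tier 3: enterprise data source (hypothetical in-memory model)."""

    def __init__(self):
        # key -> (value, version); the version grows with each accepted write
        self.rows = {}

    def read(self, key):
        return self.rows[key]

    def write(self, key, value):
        _, version = self.rows.get(key, (None, 0))
        self.rows[key] = (value, version + 1)


class SyncServer:
    """Tier 2: mediates between devices and the back end."""

    def __init__(self, backend):
        self.backend = backend

    def pull(self, key):
        # Hand the device the current value and the version it is based on.
        return self.backend.read(key)

    def push(self, key, new_value, base_version):
        # Assumed conflict policy: accept a change only if the back-end row
        # has not changed since the device pulled it.
        _, current_version = self.backend.read(key)
        if current_version != base_version:
            return False
        self.backend.write(key, new_value)
        return True


class MobileClient:
    """Tier 1: occasionally connected device with a local copy of the data."""

    def __init__(self, server):
        self.server = server
        self.local = {}  # key -> (value, version at the time of the pull)

    def sync_down(self, key):
        self.local[key] = self.server.pull(key)

    def edit(self, key, value):
        # Work offline: change the local value, remember the base version.
        _, base_version = self.local[key]
        self.local[key] = (value, base_version)

    def sync_up(self, key):
        value, base_version = self.local[key]
        return self.server.push(key, value, base_version)


if __name__ == "__main__":
    backend = BackendDatabase()
    backend.write("price:widget", 10)
    server = SyncServer(backend)
    first, second = MobileClient(server), MobileClient(server)
    first.sync_down("price:widget")
    second.sync_down("price:widget")
    first.edit("price:widget", 12)
    second.edit("price:widget", 15)
    print(first.sync_up("price:widget"))   # change based on current data
    print(second.sync_up("price:widget"))  # change based on stale data
    print(backend.read("price:widget"))
```

In this sketch the second device's change is refused because the row moved on after that device last synchronized. A real synchronization product would offer richer remediation choices than a simple refusal.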
Apache Derby/IBM Cloudscape
In 2005, IBM donated $85 million worth of relational database management system (RDBMS) code to the open source community, and the Apache Derby database was born. Apache Derby and IBM Cloudscape are the same database; the difference is that IBM Cloudscape is sold by IBM with IBM's award-winning 24x7 support and has some add-on features as well.
If you hadn't heard of IBM Cloudscape before the donation news, you'll probably be surprised to learn how many partners, customers, and software packages use this data server. In fact, more than 80 different IBM products use the IBM Cloudscape data server for its portability, easy deployment, open standards-based Java engine, small footprint, and more. IBM Cloudscape is a component that is transparent to products such as WebSphere Application Server, DB2 Content Manager, WebSphere Portal Server, IBM Director, Lotus Workplace, and many others.
IBM Cloudscape is a Java-based RDBMS that has a 2MB footprint. It's compatible with DB2, supports advanced functions (such as triggers and stored procedures), is easy to deploy, and requires no DBA effort. These same characteristics hold true for the open source Apache Derby as well.
We chose to include the Apache Derby/IBM Cloudscape data servers in this discussion because their SQL API is 100% compatible with the DB2 data server editions in Figure 1-5. This means that you can take any Apache Derby/IBM Cloudscape database and application and move it to a full-fledged DB2 data server if you need more scalability, or you need to take advantage of features that aren't found in these data servers. In fact, a component of DB2 9, called the DB2 Developer Workbench, provides a built-in facility to migrate Apache Derby/IBM Cloudscape schemas and data to a DB2 data server.
DB2 Personal Edition
DB2 Personal Edition (DB2 PE) is a full-function database that enables single users to create databases on their workstations. Since it's limited to single users (it doesn't support inbound client requests), it's generally not referred to as a data server (although the DB2 engine behind DB2 PE is the same DB2 engine used by all editions in Figure 1-5). This product is only available on Linux and Windows. DB2 PE can also be used as a remote client to a DB2 data server. Applications written to execute on DB2 PE are fully portable to the higher-level editions of the DB2 family in Figure 1-5.
DB2 PE is often used by end users requiring access to local and remote DB2 databases, or developers prototyping applications that will be accessing other DB2 databases. In addition, since it includes the pureXML™ technology free of charge, DB2 PE is also a good choice for those looking to acquire DB2 9 pureXML skills. In many cases, because it includes replication features, DB2 PE is used for occasionally connected applications (like field research, sales force automation, and so on) where a richer feature set is required than what's offered by DB2e or Apache Derby/IBM Cloudscape.