Home > Improving ODBC application performance and coding
Chapter Download:
EMAIL THIS

Improving ODBC application performance and coding

17 Sep 2009 | Written by: John Goodson and Robert A. Steward

Tips, expert advice and sample chapters
Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google

Data Access Handbook cover
This chapter from The Data Access Handbook discusses how to create performance-optimized ODBC application code and includes general coding guidelines, best practices and several coding examples. In this section, learn how to optimize your ODBC applications by correctly managing your database connections and how to manage your transactions more efficiently.

Table of Contents

Designing for performance: Strategic database application deployments
An introduction to database transaction management
Executing SQL statements using prepared statements and statement pooling
Database access security: network authentication or data encryption?
Static SQL vs. dynamic SQL for database application performance
Improving ODBC application performance and coding
Selecting ODBC functions for optimized SQL statements
Guidelines for managing data updates to optimize ODBC performance

Chapter Five

ODBC Applications: Writing Good Code

Developing performance-optimized ODBC applications is not easy. Microsoft's ODBC Programmer's Reference does not provide information about performance. In addition, ODBC drivers and the ODBC Driver Manager don't return warnings when applications run inefficiently. This chapter describes some general guidelines for coding practices that improve ODBC application performance. These guidelines have been compiled by examining the ODBC implementations of numerous shipping ODBC applications. In general, the guidelines described in this chapter improve performance because they accomplish one or more of the following goals:

  • Reduce network traffic
  • Limit disk I/O
  • Optimize application-to-driver interaction
  • Simplify queries

If you've read the other coding chapters (Chapters 6 and 7), you'll notice that some of the information here resembles those chapters. While there are some similarities, this chapter focuses on specific information about coding for ODBC.

Copyright Info
The Data Access Handbook by John Goodson and Robert A. Steward
ISBN 0137143931
First Printing March 2009
Prentice Hall Professional
Managing Connections

Typically, creating a connection is one of the most performance-expensive operations that an application performs. Developers often assume that establishing a connection is a simple request that results in the driver making a single network round trip to the database server to validate a user's credentials. In reality, a connection involves many network round trips between the driver and the database server. For example, when a driver connects to Oracle or Sybase ASE, that connection may require seven to ten network round trips. In addition, the database establishes resources on behalf of the connection, which involves performance expensive disk I/O and memory allocation.

Your time will be well spent if you sit down and design how to handle connections before implementing them in your application. Use the guidelines in this section to manage your connections more efficiently.

Connecting Efficiently
Database applications use either of the following methods to manage connections:

  • Obtain a connection from a connection pool.
  • Create a new connection one at a time as needed.

When choosing a method to manage connections, remember the following facts about connections and performance:

  • Creating a connection is performance expensive.
  • Open connections use a substantial amount of memory on both the database server and the database client.
  • Opening numerous connections can contribute to an out-of-memory condition, which causes paging of memory to disk and, thus, overall performance degradation.

Using Connection Pooling
If your application has multiple users and your database server provides sufficient database resources, using connection pooling can provide significant performance gains. Reusing a connection reduces the number of network round trips needed to establish a physical connection between the driver and the database. The performance penalty is paid up front at the time the connection pool is populated with connections. As the connections in the pool are actually used by the application, performance improves significantly. Obtaining a connection becomes one of the fastest operations an application performs instead of one of the slowest.

Although obtaining connections from a pool is efficient, when your application opens and closes connections impacts the scalability of your application. Open connections just before the user needs them, not sooner, to minimize the time that the user owns the physical connection. Similarly, close connections as soon as the user no longer needs them.

To minimize the number of connections required in a connection pool to service users, you can switch a user associated with a connection to another user if your database driver supports a feature known as reauthentication. Minimizing the number of connections conserves memory and can improve performance. See "Using Reauthentication with Connection Pooling," page 232. See Chapter 8, "Connection Pooling and Statement Pooling," for details about connection pooling.

Performance Tip
If your application does not use connection pooling, avoid connecting and disconnecting multiple times throughout your application to execute SQL statements because of the performance hit your application pays for opening connections. You don't need to open a new connection for each SQL statement your application executes.
Establishing Connections One at a Time
Some applications are not good candidates for using connection pooling, particularly if connection reuse is limited. See "When Not to Use Connection Pooling," page 15, for examples.

Using One Connection for Multiple Statements
When you're using a single connection for multiple statements, your application may have to wait for a connection if it connects to a streaming protocol database. In streaming protocol databases, only one request can be processed at a time over a single connection; other requests on the same connection must wait for the preceding request to complete. Sybase ASE, Microsoft SQL Server, and MySQL are examples of streaming protocol databases.

In contrast, when connecting to cursor-based protocol databases, the driver tells the database server when to work and how much data to retrieve. Several cursors can use the network, each working in small slices of time. Oracle and DB2 are examples of cursor-based protocol databases. For a more detailed explanation of streaming versus cursor-based protocol databases, see "One Connection for Multiple Statements," page 16.

The advantage of using one connection for multiple statements is that it reduces the overhead of establishing multiple connections, while allowing multiple statements to access the database. The overhead is reduced on both the database server and client machines. The disadvantage is that the application may have to wait to execute a statement until the single connection is available. See "One Connection for Multiple Statements," page 16, for guidelines on using this model of connection management.

Performance Tip
Because of the performance hit your application pays for opening connections, once your application is connected, you should avoid establishing additional connections to gather information about the driver and the database, such as supported data types or database versions, using SQLGetInfo and SQLGetTypeInfo. For example, some applications establish a connection and then call a routine in a separate DLL or shared library that reconnects and gathers information about the driver and the database.
Obtaining Database and Driver Information Efficiently
Remember that creating a connection is one of the most performance-expensive operations that an application performs.

How often do databases change their supported data types or database version between connections? Because this type of information typically doesn't change between connections and isn't a large amount of information to store, you may want to retrieve and cache the information so the application can access it later.

Managing Transactions

To ensure data integrity, all statements in a transaction are committed or rolled back as a unit. For example, when you use a computer to transfer money from one bank account to another, the request involves a transaction—updating values stored in the database for both accounts. If all parts of that unit of work succeed, the transaction is committed. If any part of that unit of work fails, the transaction is rolled back.

Use the guidelines in this section to help you manage transactions more efficiently.

Managing Commits in Transactions

Committing (and rolling back) transactions is slow because of the disk I/O and, potentially, the number of network round trips required. What does a commit actually involve? The database must write to disk every modification made by a transaction to the database. This is usually a sequential write to a journal file (or log); nevertheless, it involves expensive disk I/O.

In ODBC, the default transaction commit mode is auto-commit. In autocommit mode, a commit is performed for every SQL statement that requires a request to the database (Insert, Update, Delete, and Select statements).When auto-commit mode is used, your application doesn't control when database work is committed. In fact, commits commonly occur when there's actually no real work to commit.

Some databases, such as DB2, don't support auto-commit mode. For these databases, the database driver, by default, sends a commit request to the database after every successful operation (SQL statement). The commit request equates to a network round trip between the driver and the database. The round trip to the database occurs even though the application didn't request the commit and even if the operation made no changes to the database. For example, the driver makes a network round trip even when a Select statement is executed.

Let's look at the following ODBC code, which doesn't turn off auto-commit mode. Comments in the code show when commits occur if the driver or the database performs commits automatically:

/* For conciseness, this code omits error checking */

/* Allocate a statement handle */
rc = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt);
/* Prepare an INSERT statement for multiple executions */
strcpy (sqlStatement, "INSERT INTO employees " +
      "VALUES (?, ?, ?)");
rc = SQLPrepare((SQLHSTMT)hstmt, sqlStatement, SQL_NTS);

/* Bind parameters */
rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT,
          SQL_C_SLONG, SQL_INTEGER, 10, 0,
          &id, sizeof(id), NULL);
rc = SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT,
          SQL_C_CHAR, SQL_CHAR, 20, 0,
          name, sizeof(name), NULL);
rc = SQLBindParameter(hstmt, 3, SQL_PARAM_INPUT,
          SQL_C_SLONG, SQL_INTEGER, 10, 0,
          &salary, sizeof(salary), NULL);

/* Set parameter values before execution */
id = 20;
strcpy(name, "Employee20");
salary = 100000;
rc = SQLExecute(hstmt);

/* A commit occurs because auto-commit is on */

/* Change parameter values for the next execution */
id = 21;
strcpy(name, "Employee21");
salary = 150000;
rc = SQLExecute(hstmt);

/* A commit occurs because auto-commit is on */

/* Reset parameter bindings */
rc = SQLFreeStmt((SQLHSTMT)hstmt, SQL_RESET_PARAMS);
strcpy(sqlStatement, "SELECT id, name, salary " +
      "FROM employees");

/* Execute a SELECT statement. A prepare is unnecessary
      because it's executed only once. */
rc = SQLExecDirect((SQLHSTMT)hstmt, sqlStatement, SQL_NTS);

/* Fetch the first row */
rc = SQLFetch(hstmt);
while (rc != SQL_NO_DATA_FOUND) {

/* All rows are returned when fetch
      returns SQL_NO_DATA_FOUND */

      /* Get the data for each column in the result set row */
      rc = SQLGetData (hstmt, 1, SQL_INTEGER, &id,
        v sizeof(id), NULL);
      rc = SQLGetData (hstmt, 2, SQL_VARCHAR, &name,
          sizeof(name), NULL);
      rc = SQLGetData (hstmt, 3, SQL_INTEGER, &salary,
        vsizeof(salary), NULL);
      printf("nID: %d Name: %s Salary: %d", id, name, salary);

      /* Fetch the next row of data */
      rc = SQLFetch(hstmt);
      }

/* Close the cursor */
rc = SQLFreeStmt ((SQLHSTMT)hstmt, SQL_CLOSE);

/* Whether a commit occurs after a SELECT statement
      because auto-commit is on depends on the driver.
      It's safest to assume a commit occurs here. */

/* Prepare the UPDATE statement for multiple executions */
strcpy (sqlStatement,
      "UPDATE employees SET salary = salary * 1.05" +
      "WHERE id = ?");

rc = SQLPrepare ((SQLHSTMT)hstmt, sqlStatement, SQL_NTS);

/* Bind parameter */
rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT,
      SQL_C_LONG, SQL_INTEGER, 10, 0,
      &index, sizeof(index), NULL);

for (index = 0; index < 10; index++) {

      /* Execute the UPDATE statement for each
        value of index between 0 and 9 */
      rc = SQLExecute (hstmt);

/* Because auto-commit is on, a commit occurs each time
      through loop for a total of 10 commits */
      }

/* Reset parameter bindings */
rc = SQLFreeStmt ((SQLHSTMT)hstmt, SQL_RESET_PARAMS);

/* Execute a SELECT statement. A prepare is unnecessary
      because it's only executed once. */

strcpy(sqlStatement, "SELECT id, name, salary" +
      "FROM employees");
rc = SQLExecDirect ((SQLHSTMT)hstmt, sqlStatement, SQL_NTS);

/* Fetch the first row */
rc = SQLFetch(hstmt);
while (rc != SQL_NO_DATA_FOUND) {

/* All rows are returned when fetch
        returns SQL_NO_DATA_FOUND */
/* Get the data for each column in the result set row */
rc = SQLGetData (hstmt, 1, SQL_INTEGER, &id,
        sizeof(id), NULL);
rc = SQLGetData (hstmt, 2, SQL_VARCHAR, &name,
        sizeof(name), NULL);
rc = SQLGetData (hstmt,3,SQL_INTEGER,&salary,
        sizeof(salary), NULL);
printf("nID: %d Name: %s Salary: %d", id, name, salary);

/* Fetch the next row of data */
rc = SQLFetch(hstmt);
}

/* Close the cursor */
rc = SQLFreeStmt ((SQLHSTMT)hstmt, SQL_CLOSE);

/* Whether a commit occurs after a SELECT statement
      because auto-commit is on depends on the driver.
      It's safest to assume a commit occurs here. */

Performance Tip
Because of the significant amount of disk I/O on the database server required to commit every operation and the extra network round trips that occur between the driver and the database server, it's a good idea to turn off auto-commit mode in your application and use manual commits instead. Using manual commits allows your application to control when database work is committed, which provides dramatically better performance. To turn off auto-commit mode, use the SQLSetConnectAttr function, for example, SQLSetConnectAttr(hstmt, SQL_ATTR_AUTOCOMMIT, SQL_AUTOCOMMIT_OFF).
For example, let's look at the following ODBC code. It's identical to the previous ODBC code except that it turns off auto-commit mode and uses manual commits:

/* For conciseness, this code omits error checking */

/* Allocate a statement handle */
rc = SQLAllocStmt((SQLHDBC)hdbc, (SQLHSTMT *)&hstmt);

/* Turn auto-commit off */
rc = SQLSetConnectAttr (hdbc, SQL_AUTOCOMMIT,
            SQL_AUTOCOMMIT_OFF);

/* Prepare an INSERT statement for multiple executions */
strcpy (sqlStatement, "INSERT INTO employees" +
      "VALUES (?, ?, ?)");
rc = SQLPrepare((SQLHSTMT)hstmt, sqlStatement, SQL_NTS);

/* Bind parameters */
rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT,
            SQL_C_SLONG, SQL_INTEGER, 10, 0,
            &id, sizeof(id), NULL);
rc = SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT,
            SQL_C_CHAR, SQL_CHAR, 20, 0,
            name, sizeof(name), NULL);
rc = SQLBindParameter(hstmt, 3, SQL_PARAM_INPUT,
            SQL_C_SLONG, SQL_INTEGER, 10, 0,
            &salary, sizeof(salary), NULL);

/* Set parameter values before execution */
id = 20;
strcpy(name,"Employee20");
salary = 100000;
rc = SQLExecute(hstmt);

/* Change parameter values for the next execution */
id = 21;
strcpy(name,"Employee21");
salary = 150000;
rc = SQLExecute(hstmt);

/* Reset parameter bindings */
rc = SQLFreeStmt(hstmt, SQL_RESET_PARAMS);

/* Manual commit */
rc = SQLEndTran(SQL_HANDLE_DBC, hdbc, SQL_COMMIT);

/* Execute a SELECT statement. A prepare is unnecessary
      because it's only executed once. */
strcpy(sqlStatement, "SELECT id, name, salary" +
      "FROM employees");
rc = SQLExecDirect((SQLHSTMT)hstmt, sqlStatement, SQL_NTS);

/* Fetch the first row */
rc = SQLFetch(hstmt);
while (rc != SQL_NO_DATA_FOUND) {

/* All rows are returned when fetch
      returns SQL_NO_DATA_FOUND */

/* Get the data for each column in the result set row */
rc = SQLGetData (hstmt, 1, SQL_INTEGER, &id,
            sizeof(id), NULL);
rc = SQLGetData (hstmt, 2, SQL_VARCHAR, &name,
            sizeof(name), NULL);
rc = SQLGetData (hstmt, 3, SQL_INTEGER, &salary,
            sizeof(salary), NULL);
printf("nID: %d Name: %s Salary: %d", id, name, salary);

/* Fetch the next row of data */
rc = SQLFetch(hstmt);
}

/* Close the cursor */
rc = SQLFreeStmt ((SQLHSTMT)hstmt, SQL_CLOSE);

strcpy (sqlStatement,
      "UPDATE employees SET salary = salary * 1.05" +
      "WHERE id = ?");

/* Prepare the UPDATE statement for multiple executions */
rc = SQLPrepare ((SQLHSTMT)hstmt, sqlStatement, SQL_NTS);

/* Bind parameter */
rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT,
            SQL_C_SLONG, SQL_INTEGER, 10, 0,
            &index, sizeof(index), NULL);

for (index = 0; index < 10; index++) {

      /* Execute the UPDATE statement for each
        value of index between 0 and 9 */
      rc = SQLExecute (hstmt);
      }

/* Manual commit */
rc = SQLEndTran(SQL_HANDLE_DBC, hdbc, SQL_COMMIT);

/* Reset parameter bindings */
rc = SQLFreeStmt ((SQLHSTMT)hstmt, SQL_RESET_PARAMS);

/* Execute a SELECT statement. A prepare is unnecessary
      because it's only executed once. */
strcpy(sqlStatement, "SELECT id, name, salary" +
      "FROM employees");
rc = SQLExecDirect ((SQLHSTMT)hstmt, sqlStatement, SQL_NTS);

/* Fetch the first row */
rc = SQLFetch(hstmt);
while (rc != SQL_NO_DATA_FOUND) {

/* All rows are returned when fetch
      returns SQL_NO_DATA_FOUND */

  /* Get the data for each column in the result set row */
  rc = SQLGetData (hstmt, 1, SQL_INTEGER, &id,
      sizeof(id), NULL);
  rc = SQLGetData (hstmt, 2, SQL_VARCHAR, &name,
           sizeof(name), NULL);
  rc = SQLGetData (hstmt,3,SQL_INTEGER,&salary,
           sizeof(salary), NULL);
  printf("nID: %d Name: %s Salary: %d", id, name, salary);

  /* Fetch the next row of data */
  rc = SQLFetch(hstmt);
  }

/* Close the cursor */
rc = SQLFreeStmt ((SQLHSTMT)hstmt, SQL_CLOSE);

/* Manual commit */
rc = SQLEndTran (SQL_HANDLE_DBC, hdbc, SQL_COMMIT);

See "Managing Commits in Transactions," page 22, for information on when to commit work if you've turned off auto-commit mode.

Performance Tip
Distributed transactions, as defined by ODBC and the Microsoft Distributed Transaction Coordinator (DTC), are substantially slower than local transactions because of the logging and network round trips needed to communicate between all the components involved in the distributed transaction. Unless distributed transactions are required, you should use local transactions.
Choosing the Right Transaction Model
Which type of transaction should you use: local or distributed? A local transaction accesses and updates data on a single database. A distributed transaction accesses and updates data on multiple databases; therefore, it must be coordinated among those databases.

Be aware that the default transaction behavior of many COM+ components uses distributed transactions, so changing that default transaction behavior to local transactions as shown can improve performance.

   // Disable MTS Transactions.
   XACTOPT options[1] = {XACTSTAT_NONE,"NOT SUPPORTED"};
   hr = Itxoptions->SetOptions(options);

See "Transaction Management," page 21, for more information about performance and transactions.

More on accessing data:



Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google



RELATED CONTENT
Data warehouse strategy
Advantages and disadvantages of XML shredding
How to shred XML with the DB2 XMLTABLE function
Shredding XML docs into relational tables with annotated XML schemas
Examples of single and bulk XML shredding of XML documents
Definition of primary, super, foreign and candidate key in the DBMS
What are some emerging data warehouse and DBMS trends?
Selecting ODBC functions for optimized SQL statements
Guidelines for managing data updates to optimize ODBC performance
How to capture metadata information, ETL rules with CA Erwin Data Modeler
Designing for performance: Strategic database application deployments

Data management tutorials
Seven secrets to business intelligence (BI) success
How to design project management plan templates for large projects
Handling project management issues for a proprietary ownership conflict
Should we use traditional or agile software development processes?
Next steps for business intelligence: Video
Selecting ODBC functions for optimized SQL statements
Guidelines for managing data updates to optimize ODBC performance
Is it better to have a centralized or decentralized master data structure?
Can we leverage existing data quality tools for an MDM program?
Do we need business intelligence (BI) tools to be successful?

Data modeling tools and techniques
Understanding five major enterprise information management benefits
Advantages and disadvantages of XML shredding
How to shred XML with the DB2 XMLTABLE function
Shredding XML docs into relational tables with annotated XML schemas
Examples of single and bulk XML shredding of XML documents
How to capture metadata information, ETL rules with CA Erwin Data Modeler
Data Warehouse Platforms Product Directory
Data models serve as blueprint for business intelligence, master data management projects
Similarities and differences between ROLAP, MOLAP and HOLAP
Data modeling for the business: What is a data model?

RELATED GLOSSARY TERMS
Terms from Whatis.com − the technology online dictionary
data modeling  (SearchDataManagement.com)
predictive modeling  (SearchDataManagement.com)

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary




Data Compliance Articles and Research: Data Privacy, Financial Data Management, Healthcare Data
About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
SEARCH 
TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

TechTarget Corporate Web Site  |  Media Kits  |  Site Map




All Rights Reserved, Copyright 2005 - 2009, TechTarget | Read our Privacy Policy
  TechTarget - The IT Media ROI Experts