Guidelines for managing data updates to optimize ODBC performance

Looking to optimize your ODBC applications? This section offers guidelines for updating data efficiently, managing catalog function calls, and improving overall ODBC performance.


This section from The Data Access Handbook provides best practices and guidelines for updating data and managing your updates efficiently to optimize ODBC performance. Read a definition of pseudo-columns and learn about their role in SQL and ODBC functions. Find out how to minimize the use of slow catalog functions in your ODBC application and how to use a dummy query to determine table characteristics.


Table of Contents

Designing for performance: Strategic database application deployments 
An introduction to database transaction management
Executing SQL statements using prepared statements and statement pooling
Database access security: network authentication or data encryption?
Static SQL vs. dynamic SQL for database application performance 
Improving ODBC application performance and coding
Selecting ODBC functions for optimized SQL statements
Guidelines for managing data updates to optimize ODBC performance

Updating Data

Use the guidelines in this section to manage your updates more efficiently.

Using SQLSpecialColumns to Optimize Updates and Deletes
Many databases have hidden columns, called pseudo-columns, that represent a unique key associated with every row in a table. Typically, pseudo-columns in a SQL statement provide the fastest way to access a row because they usually point to the exact location of the physical record.

Some applications, such as an application that forms a Where clause consisting of a subset of the column values retrieved in the result set, cannot be designed to take advantage of positioned updates and deletes. Some applications may formulate the Where clause by using searchable result columns or by calling SQLStatistics to find columns that may be part of a unique index. These methods usually work but can result in fairly complex queries. For example:

rc = SQLExecDirect (hstmt, "SELECT first_name, last_name, "
           "ssn, address, city, state, zip "
           "FROM employees", SQL_NTS);
// fetch data using complex query

rc = SQLExecDirect (hstmt, "UPDATE employees SET address = ? "
           "WHERE first_name = ? AND last_name = ? "
           "AND ssn = ? AND address = ? AND city = ? AND "
           "state = ? AND zip = ?", SQL_NTS);

Many databases support pseudo-columns that are not explicitly defined in the table definition but are hidden columns of every table (for example, ROWID for Oracle). Because pseudo-columns are not part of the explicit table definition, they're not retrieved when SQLColumns is called. To determine if pseudo-columns exist, your application must call SQLSpecialColumns. For example:

rc = SQLSpecialColumns (hstmt, SQL_BEST_ROWID, ...);
rc = SQLExecDirect (hstmt, "SELECT first_name, last_name, "
           "ssn, address, city, state, zip, "
           "ROWID FROM employees", SQL_NTS);
// fetch data and probably "hide" ROWID from the user
rc = SQLExecDirect (hstmt, "UPDATE employees SET address = ? "
           "WHERE ROWID = ?", SQL_NTS);
// fastest access to the data!

If your data source doesn't contain pseudo-columns, the result set of SQLSpecialColumns consists of the columns of the most optimal unique index on the specified table (if a unique index exists). Therefore, your application doesn't need to call SQLStatistics to find the smallest unique index.
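As a sketch of this lookup (error handling is elided, and the `employees` table name and buffer size are illustrative assumptions), the application might read the row-identifier columns like this:

```c
#include <sql.h>
#include <sqlext.h>

/* Sketch: ask the driver for the optimal set of row-identifying
   columns for a table. Column 2 of the result set is COLUMN_NAME:
   either a pseudo-column such as ROWID, or the columns of the most
   optimal unique index if no pseudo-column exists. */
SQLRETURN get_rowid_column(SQLHSTMT hstmt, SQLCHAR *colName, SQLLEN bufLen)
{
    SQLLEN nameLen;
    SQLRETURN rc = SQLSpecialColumns(hstmt, SQL_BEST_ROWID,
                                     NULL, 0,        /* catalog */
                                     NULL, 0,        /* schema  */
                                     (SQLCHAR *)"employees", SQL_NTS,
                                     SQL_SCOPE_SESSION, SQL_NULLABLE);
    if (rc != SQL_SUCCESS)
        return rc;
    SQLBindCol(hstmt, 2, SQL_C_CHAR, colName, bufLen, &nameLen);
    return SQLFetch(hstmt);   /* first identifier column name */
}
```

Whatever column names come back, the application can use them in the Where clause of its Update and Delete statements.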

Using Catalog Functions

Catalog functions retrieve information about a result set, such as the number and type of columns. Because catalog functions are slow compared to other ODBC functions, using them frequently can impair performance. Use the guidelines in this section to optimize performance when selecting and using catalog functions.

Minimizing the Use of Catalog Functions
Compared to other ODBC functions, catalog functions that generate result sets are slow. To retrieve all result column information mandated by the ODBC specification, an ODBC driver often must perform multiple or complex queries to retrieve the result set for a single call to a catalog function.

In addition to avoiding multiple executions of catalog functions, you should cache information retrieved from the result sets they generate. For example, call SQLGetTypeInfo once, and cache the elements of the result set that your application depends on. It's unlikely that any application will use all elements of the result set generated by a catalog function, so the cache of information shouldn't be difficult to maintain.
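A minimal sketch of such a cache, assuming the application only needs the local type name for each SQL data type (the cache structure and its size are illustrative assumptions):

```c
#include <sql.h>
#include <sqlext.h>
#include <string.h>

#define TYPE_CACHE_SIZE 64

/* Sketch: call SQLGetTypeInfo once and keep only the elements the
   application depends on - here, the SQL type code and its name. */
struct TypeInfo {
    SQLSMALLINT dataType;       /* e.g. SQL_VARCHAR    */
    SQLCHAR     typeName[129];  /* e.g. "VARCHAR2"     */
};

static struct TypeInfo typeCache[TYPE_CACHE_SIZE];
static int typeCacheCount = 0;

void cache_type_info(SQLHSTMT hstmt)
{
    SQLCHAR     name[129];
    SQLSMALLINT dataType;
    SQLLEN      nameLen, typeLen;

    SQLGetTypeInfo(hstmt, SQL_ALL_TYPES);
    SQLBindCol(hstmt, 1, SQL_C_CHAR,   name,      sizeof(name), &nameLen);
    SQLBindCol(hstmt, 2, SQL_C_SSHORT, &dataType, 0,            &typeLen);
    while (SQLFetch(hstmt) == SQL_SUCCESS && typeCacheCount < TYPE_CACHE_SIZE) {
        typeCache[typeCacheCount].dataType = dataType;
        strcpy((char *)typeCache[typeCacheCount].typeName, (char *)name);
        typeCacheCount++;
    }
    /* later lookups read typeCache instead of calling SQLGetTypeInfo again */
}
```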

Avoiding Search Patterns
Catalog functions support arguments that can limit the amount of data retrieved. Using null values or search patterns, such as %A%, for these arguments often generates time-consuming queries. In addition, network traffic can increase because of unnecessary results.

In the following example, an application uses the SQLTables function to determine whether the table named WSTable exists and provides null values for most of the arguments:

rc = SQLTables(hstmt, NULL, 0, NULL, 0, "WSTable",
           SQL_NTS, NULL, 0);

The driver interprets the request as follows: Retrieve all tables, views, system tables, synonyms, temporary tables, and aliases named WSTable that exist in any database schema in the database catalog.

In contrast, the following request provides non-null values for all arguments, allowing the driver to process the request more efficiently:

rc = SQLTables(hstmt, "cat1", SQL_NTS, "johng", SQL_NTS,
           "WSTable", SQL_NTS, "TABLE", SQL_NTS);

The driver interprets the request as follows: Retrieve all tables in catalog "cat1" that are named "WSTable" and owned by "johng." No synonyms, views, system tables, aliases, or temporary tables are retrieved.

Sometimes little is known about the object that you are requesting information for. Any information that the application can provide the driver when calling catalog functions can result in improved performance and reliability.

Using a Dummy Query to Determine Table Characteristics
Sometimes you need information about columns in the database table, such as column names, column data types, and column precision and scale. For example, an application that allows users to choose which columns to select may need to request the names of each column in the database table.

The following examples show the benefit of using the SQLDescribeCol function over the SQLColumns function.


Example A: SQLColumns Function

A potentially complex query is prepared and executed, the result description information is formulated, the driver retrieves the result rows, and the application fetches the result. This method results in increased CPU use and network communication.

 rc = SQLColumns (... "UnknownTable" ...);
// This call to SQLColumns will generate a query to the
// system catalogs... possibly a join which must be
// prepared, executed, and produce a result set.
rc = SQLBindCol (...);
rc = SQLExtendedFetch (...);
// user must retrieve N rows from the server
// N = # result columns of UnknownTable
// result column information has now been obtained

Example B: SQLDescribeCol Function

A simple query that retrieves result set information is prepared, but the query is not executed and result rows are not retrieved by the driver. Only information about the result set is retrieved (the same information retrieved by SQLColumns in Example A).

// prepare dummy query
rc = SQLPrepare (... "SELECT * FROM UnknownTable "
     "WHERE 1 = 0" ...);
// query is never executed on the server - only prepared
rc = SQLNumResultCols (...);
for (icol = 1; icol <= NumColumns; icol++) {
     rc = SQLDescribeCol (...);
     // + optional calls to SQLColAttributes
}
// result column information has now been obtained
// Note we also know the column ordering within the table!
// This information cannot be assumed from the SQLColumns example.

What if the database server (such as Microsoft SQL Server) does not support prepared statements by default? The performance of Example A wouldn't change, but the performance of Example B would decrease slightly because the dummy query is evaluated in addition to being prepared. Because the Where clause of the query always evaluates to FALSE, the query executes without generating or retrieving any result rows. So, even with a slight decrease in performance, Example B still outperforms Example A.


The performance of ODBC applications can suffer if they fail to reduce network traffic, limit disk I/O, simplify queries, and optimize the interaction between the application and driver. Reducing network communication probably is the most important technique for improving performance. For example, when you need to update large amounts of data, using arrays of parameters rather than executing an Insert statement multiple times reduces the number of network round trips required to complete the operation.
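A sketch of the parameter-array technique (the `employees` table, its `emp_id` column, and the row count are illustrative assumptions; error handling is elided):

```c
#include <sql.h>
#include <sqlext.h>

#define ROWS 100

/* Sketch: insert ROWS rows in a single execution - and typically a
   single network round trip - by binding an array of parameter
   values instead of executing the Insert statement ROWS times. */
void bulk_insert(SQLHSTMT hstmt)
{
    static SQLINTEGER ids[ROWS];
    static SQLLEN     idInd[ROWS];
    SQLULEN           rowsProcessed;
    int               i;

    for (i = 0; i < ROWS; i++) {
        ids[i]   = i;
        idInd[i] = 0;   /* length/indicator: value is not NULL */
    }

    /* tell the driver the bound arrays hold ROWS sets of parameters */
    SQLSetStmtAttr(hstmt, SQL_ATTR_PARAMSET_SIZE,
                   (SQLPOINTER)(SQLULEN)ROWS, 0);
    SQLSetStmtAttr(hstmt, SQL_ATTR_PARAMS_PROCESSED_PTR,
                   &rowsProcessed, 0);

    SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_SLONG, SQL_INTEGER,
                     0, 0, ids, 0, idInd);

    /* one execution inserts all ROWS rows */
    SQLExecDirect(hstmt,
                  (SQLCHAR *)"INSERT INTO employees (emp_id) VALUES (?)",
                  SQL_NTS);
}
```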

Typically, creating a connection is the most performance-expensive task your application performs. Connection pooling can help you manage your connections efficiently, particularly if your application has numerous users. Regardless of whether your application uses connection pooling, make sure that your application closes connections immediately after the user is finished with them.
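As a sketch, driver-manager connection pooling can be enabled process-wide before any environment handle is allocated, and each connection released as soon as the user is done so it returns to the pool:

```c
#include <sql.h>
#include <sqlext.h>

/* Sketch: enable ODBC driver-manager connection pooling. This must
   be called before SQLAllocHandle(SQL_HANDLE_ENV, ...). */
void enable_pooling(void)
{
    SQLSetEnvAttr(NULL, SQL_ATTR_CONNECTION_POOLING,
                  (SQLPOINTER)SQL_CP_ONE_PER_DRIVER, 0);
}

/* Release a connection immediately after the user is finished with
   it; with pooling enabled, SQLDisconnect returns the connection to
   the pool instead of tearing it down. */
void release_connection(SQLHDBC hdbc)
{
    SQLDisconnect(hdbc);
    SQLFreeHandle(SQL_HANDLE_DBC, hdbc);
}
```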

Making smart choices about how to handle transactions can also improve performance. For example, using manual commits instead of auto-commit mode provides better control over when work is committed. Similarly, if you don't need the protection of distributed transactions, using local transactions can improve performance.
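A minimal sketch of manual-commit mode (the Update and Delete statements and their table names are illustrative assumptions):

```c
#include <sql.h>
#include <sqlext.h>

/* Sketch: switch off auto-commit so several statements are committed
   as one unit of work, avoiding a commit after every statement. */
void run_in_one_transaction(SQLHDBC hdbc, SQLHSTMT hstmt)
{
    SQLSetConnectAttr(hdbc, SQL_ATTR_AUTOCOMMIT,
                      (SQLPOINTER)SQL_AUTOCOMMIT_OFF, 0);

    SQLExecDirect(hstmt, (SQLCHAR *)"UPDATE employees SET dept = 24 "
                                    "WHERE dept = 12", SQL_NTS);
    SQLExecDirect(hstmt, (SQLCHAR *)"DELETE FROM departments "
                                    "WHERE dept_id = 12", SQL_NTS);

    /* commit both statements together */
    SQLEndTran(SQL_HANDLE_DBC, hdbc, SQL_COMMIT);
}
```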

Inefficient SQL queries slow the performance of ODBC applications. Some SQL queries don't filter data, causing the driver to retrieve unnecessary data. Your application pays a huge penalty in performance when that unnecessary data is long data, such as data stored as a Blob or Clob. Even well-formed SQL queries can be more or less effective depending on how they are executed. For example, using SQLExtendedFetch instead of SQLFetch and using SQLBindCol instead of SQLGetData reduces ODBC calls and improves performance.
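To illustrate the SQLBindCol point, a sketch of fetching with bound columns (the query and buffer size are illustrative assumptions): each SQLFetch delivers data directly into application buffers, so no per-column SQLGetData call is needed.

```c
#include <sql.h>
#include <sqlext.h>

/* Sketch: bind the result column once, then fetch. */
void fetch_bound(SQLHSTMT hstmt)
{
    SQLCHAR lastName[51];
    SQLLEN  lastNameLen;

    SQLExecDirect(hstmt, (SQLCHAR *)"SELECT last_name FROM employees",
                  SQL_NTS);
    SQLBindCol(hstmt, 1, SQL_C_CHAR, lastName, sizeof(lastName),
               &lastNameLen);
    while (SQLFetch(hstmt) == SQL_SUCCESS) {
        /* lastName already holds the current row's value -
           no SQLGetData call per column is required */
    }
}
```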
