The ODBC API - HKT SOFT

Microsoft originally developed the Open Database Connectivity (ODBC) API to provide a database-brand-independent API for database access on its Windows operating systems. The early ODBC API became the foundation for the SQL/CLI standard, which is now the official ANSI/ISO standard for a SQL call-level interface. The original ODBC API was extended and modified during the standardization process to create the SQL/CLI specification. With the introduction of ODBC release 3.0, Microsoft brought ODBC into conformance with the SQL/CLI standard. With this revision, ODBC becomes a superset of the SQL/CLI specification.

ODBC goes beyond the SQL/CLI capabilities in several areas, in part because Microsoft’s goals for ODBC were broader than simply creating a standardized database access API. Microsoft also wanted to allow a single Windows application program to be able to concurrently access several different databases using the ODBC API. It also wanted to provide a structure where database vendors could support ODBC without giving up their proprietary APIs, and where the software that provided ODBC support for a particular brand of DBMS could be distributed by the database vendor and installed on Windows-based client systems as needed. The layered structure of ODBC and special ODBC management calls provide these capabilities.

1. The Structure of ODBC

The structure of ODBC as it is provided on Windows-based or other operating systems is shown in Figure 19-24. There are three basic layers to the ODBC software:

Callable API. At the top layer, ODBC provides a single callable database access API that can be used by all application programs. The API is packaged as a dynamic-linked library (DLL), which is an integral part of the various Windows operating systems.
ODBC drivers. At the bottom layer of the ODBC structure is a collection of ODBC drivers. There is a separate driver for each of the DBMS brands. The purpose of the driver is to translate the standardized ODBC calls into the appropriate call(s) for the specific DBMS that it supports. Each driver can be independently installed on a particular computer system. This allows the DBMS vendors to provide an ODBC driver for their particular brand of DBMS and distribute the driver independent of the Windows operating system software.

If the database resides on the same system as the ODBC driver, the driver is usually linked directly to the database’s native API code. If the database is to be accessed over a network, the driver may call a native DBMS client to handle the client/server connection, or the driver might handle the network connection itself.

Driver manager. In the middle layer of the ODBC structure is the ODBC driver manager. Its role is to load and unload the various ODBC drivers, on request from application programs. The driver manager is also responsible for routing the API calls made by application programs to the appropriate driver for execution.

When an application program wants to access a database via ODBC, it goes through the same initiation sequence specified by the SQL/CLI standard. The program allocates an environment handle, then a connection handle, and then calls SQLConnect(), specifying the particular data source to be accessed. When it receives the SQLConnect() call, the ODBC driver manager examines the connection information provided and determines the appropriate ODBC driver that is needed. The driver manager loads the driver into memory if it’s not already being used by another application program.

Subsequent calls by the application program on this particular CLI/ODBC connection are routed to this driver. The application program can, if appropriate, make other SQLConnect() calls for other data sources that will cause the driver manager to concurrently load other drivers for other DBMS brands. The application program can then use ODBC to communicate with two or more different databases, of different brands, using a uniform API.

2. ODBC and DBMS Independence

By providing a uniform API and its driver manager architecture, ODBC goes a long way toward providing a cross-vendor API for database access, but it’s impossible to provide fully transparent access. The ODBC drivers for the various database systems can easily mask cosmetic differences in their SQL dialects and API suites, but more fundamental differences are difficult or impossible to mask. ODBC provides a partial solution to this problem by providing several different levels of ODBC capability, and by making each ODBC driver self-describing through the ODBC/CLI calls that return information about general functionality, supported functions, and supported data types. However, the existence of different capability levels and profiles effectively pushes the DBMS differences right back into the application program, which must deal with this nonuniformity of ODBC drivers. In practice, the vast majority of application programs rely on only the basic, core set of ODBC functionality and don’t bother with more advanced features or profiles.

3. ODBC Catalog Functions

One of the areas where ODBC offers capability beyond the SQL/CLI standard is the retrieval of information about the structure of a database from its system catalog. As a part of the ANSI/ISO SQL standard, the CLI assumes that this information (about tables, columns, privileges, and so forth) is available through the SQL2 Information Schema, as described in Chapter 16. ODBC doesn’t assume the presence of an Information Schema. Instead, it provides a set of specialized functions, shown in Table 19-4, that provide information about the structure of a data source. By calling these functions and processing their results, an application program can determine, at runtime, information about the tables, columns, privileges, primary keys, foreign keys, and stored procedures that form the structure of a data source.

The ODBC catalog functions typically aren’t needed by an application program that is written for a specific purpose. However, they are essential for a general-purpose program, such as a query program, report generator, or data analysis tool. The catalog functions can be called any time after a connection to a data source has been made. For example, a report writing program might call SQLConnect() and then immediately call SQLTables() to determine which tables are available in the target data source. The tables could then be presented in a list on the screen, allowing the user to select which table should be used to generate a report.

All of the catalog functions return their information as if they were a set of query results. The application program uses the techniques already described for CLI query processing to bind the columns of returned information to program variable areas. The program then calls SQLFetch() to work its way through the returned information. For example, in the results returned by the SQLTables() call, each SQLFetch() retrieves information about one table in the data source.

4. Extended ODBC Capabilities

ODBC provides a set of extended capabilities beyond those specified in the SQL/CLI standard. Many of the capabilities are designed to improve the performance of ODBC-based applications by minimizing the number of ODBC function calls an application program must make and/or the amount of network traffic generated by the ODBC calls. Other capabilities provide useful features for maintaining database independence or aid an application program in the database connection process.

Some of the capabilities are provided through the additional set of ODBC function calls shown in Table 19-5. Others are provided through statement or connection attributes. Many of these additional capabilities were introduced in the 3.0 revision of ODBC and are not yet supported by most ODBC drivers or ODBC-based applications.

4.1. Extended Connection Capabilities

Two of the extended ODBC features are focused on the connection process. Connection browsing is designed to simplify the data source connection process and make it more database independent. SQLBrowseConnect() supports an iterative style of connection for access to ODBC data sources. An application program first calls the function with basic information about the target data source, and the function returns additional connection attributes needed (such as a user name or password). The application program can obtain this information (for example, by prompting the user) and then recalls SQLBrowseConnect() with the additional information. The cycle continues until the application has determined all of the information required for a successful SQLConnect() call.

The connection pooling capability is designed to improve the efficiency of ODBC connect/disconnect processing in a client/server environment. When connection pooling is activated, ODBC does not actually terminate network connections upon receiving a SQLDisconnect() call. Instead, the connections are held open in an idle state for some period of time and reused if a SQLConnect() call is made for the same data source. This reuse of connections can significantly cut down the network and login/logout overhead in client/server applications that involve short transactions and high transaction rates.

4.2. SQL Dialect Translation

ODBC specifies not just a set of API calls, but also a standard SQL language dialect that is a subset of the SQL2 standard. It is the responsibility of ODBC drivers to translate the ODBC dialect into statements appropriate for the target data source (for example, modifying date/time literals, quote conventions, keywords, and so on). The SQLNativeSQL() call allows the application program to see the effect of this translation. ODBC also supports escape sequences that allow an application program to more explicitly direct the translation of SQL features that tend to be less consistent across SQL dialects, such as outer joins and pattern-matching search conditions.

4.3. Asynchronous Execution

An ODBC driver may support asynchronous execution of ODBC functions. When an application program makes an asynchronous mode ODBC call, ODBC initiates the required processing (usually statement preparation or execution) and then immediately returns control to the application program. The application program can proceed with other work and later resynchronize with the ODBC function to determine its completion status. Asynchronous execution can be requested on a per-connection or a per-statement basis. In some cases, asynchronously executing functions can be terminated with a SQLCancel() call, giving the application program a method for aborting long-running ODBC operations.

4.4. Statement-Processing Efficiency

Each ODBC call to execute a SQL statement can involve a significant amount of overhead, especially if the data source involves a client/server network connection.

To reduce this overhead, an ODBC driver may support statement batches. With this capability, an application program can pass a sequence of two or more SQL statements as a batch to be executed in a single SQLExecDirect() or SQLExecute() call.

For example, a series of a dozen INSERT or UPDATE statements could be executed as a batch in this way. It can significantly reduce network traffic in a client/server environment, but it complicates error detection and recovery, which tend to become driver-specific when statement batches are used.

Many DBMS products address the efficiency of multistatement transactions in a different way. They support stored procedures within the database itself, which can collect a sequence of SQL operations, together with the associated flow-control logic, and allow the statements to be invoked with a single call to the procedure. ODBC provides a set of capabilities that allow an application program to directly call a stored procedure in the target data source. For databases that allow stored procedure parameters to be passed by name, ODBC allows parameters to be bound by name instead of by position. For data sources that provide metadata information about stored procedure parameters, the SQLDescribeParam() call allows the application program to determine, at runtime, the required parameter data type. Output parameters of a stored procedure are supported either through SQLBindParam() (in which case, the application program’s data buffer is modified upon return from the SQLExecute() or SQLExecDirect() call) or through SQLGetData(), which allows retrieval of long returned data.

Two other extended ODBC capabilities provide efficiency when a single SQL statement (such as an INSERT or UPDATE statement) is to be executed repeatedly. Both address the binding of parameters for this situation. With the binding offset feature, once a statement parameter has been bound and the statement has been executed, ODBC allows the application program to change its binding for the next statement execution by specifying a new memory location as an offset from the original location. This is an effective way of binding a parameter to individual items in an array for repeated statement execution. In general, modifying an offset value is much more efficient than rebinding the parameter with repeated calls to SQLBindParam().

ODBC parameter arrays provide an alternative mechanism for an application program to pass multiple sets of parameter values in a single call. For example, if an application program needs to insert multiple rows into a table, it can request execution of a parameterized INSERT statement and bind the parameters to arrays of data values. The effective result is as if multiple INSERT statements are performed—one for each set of parameter values. ODBC supports both row-wise parameter arrays (each array element holds one set of parameter values) or columnwise parameter arrays (each parameter value is bound to its own individual array, which holds its values).

4.5. Query-Processing Efficiency

In a client/server environment, the network overhead involved in fetching many rows of query results can be very substantial. To cut this overhead, an ODBC driver may support multirow fetches through the ODBC block cursor capability. With a block cursor, each SQLFetch() or SQLFetchScroll() call retrieves multiple rows (termed the current rowset of the cursor) from the data source. The application must bind the returned columns to arrays to hold the multiple rows of fetched data. Either row-wise or columnwise binding of the rowset data is supported, using the same techniques as those used for parameter arrays. In addition, the SQLSetPos() function may be used to establish one of the rows of the rowset as the current row for positioned update and delete operations.

ODBC bookmarks provide a different efficiency boost for an application program that needs to operate on retrieved rows of data. An ODBC bookmark is a database- independent unique row-id for SQL operations. (A driver may actually use primary keys or DBMS-specific row-ids or other methods to support bookmarks, but it is transparent to the application program.) When bookmarks are enabled, the bookmark (row-id) is returned for each row of query results. The bookmark can be used with scrolling cursors to return to a particular row. Additionally, it can be used to perform a positioned update or delete based on a bookmark.

Bookmarks can also be used to determine if a particular row retrieved by two different queries is, in fact, the same row or two different rows with the same data values. Bookmarks can make some operations much more efficient (for example, performing positioned updates via a bookmark rather than respecifying a complex search condition to identify the row). However, there can be substantial overhead for some DBMS brands and ODBC drivers in maintaining the bookmark information, so this trade-off must be considered carefully.

ODBC bookmarks form the basis for ODBC bulk operations, another efficiency-related feature. The SQLBulkOperations() call allows an application program to efficiently update, insert, delete or refetch multiple rows based on their bookmarks. It operates in conjunction with block cursors and works on the rows in the current rowset. The application program places the bookmarks for the rows to be affected into an array, and places into other arrays the values to be inserted or deleted. It then calls SQLBulkOperations() with a function code indicating whether the identified rows are to be updated, deleted, or refetched, or whether a set of new rows is to be added. This call completely bypasses the normal SQL statement syntax for these operations, and because it can operate on multiple rows in a single call, can be a very efficient mechanism for bulk insertion, deletion, or update of data.

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.