The first major DBMS product to emphasize its callable API was SQL Server, in versions from both Sybase and Microsoft. For many years, the SQL Server callable API was the only interface offered by these products. Both Microsoft and Sybase now offer embedded SQL capabilities and have added newer or higher-level callable APIs, but the original SQL Server API remains a very popular way to access these DBMS brands. The SQL Server API also provided the model for much of Microsoft’s ODBC API. SQL Server and its API are also an excellent example of a DBMS designed from the ground up around a client/server architecture. For all of these reasons, it’s useful to begin our exploration of SQL APIs by examining the basic SQL Server API.
The original SQL Server API, which is called the database library or dblib, consists of about 100 functions available to an application program. The API is very comprehensive, but a typical program uses only about a dozen of the function calls, which are summarized in Table 19-1. The other calls provide advanced features, alternative methods of interacting with the DBMS, or single-call versions of features that otherwise would require multiple calls.
1. Basic SQL Server Techniques
A simple SQL Server program that updates a database can use a very small set of dblib calls to do its work. The program in Figure 19-3 implements a simple quota update application for the SALESREPS table in the sample database. It is identical to the program in Figure 17-17, but uses the SQL Server API instead of embedded SQL. The figure illustrates the basic interaction between a program and SQL Server:
- The program prepares a login record, filling in the user name, password, and any other information required to connect to the DBMS.
- The program calls dbopen() to establish a connection to the DBMS. A connection must exist before the program can send SQL statements to SQL Server.
- The program builds a SQL statement in a buffer and calls dbcmd() to pass the SQL text to dblib. Successive calls to dbcmd() add to the previously passed text; there is no requirement that a complete SQL statement be sent in a single dbcmd() call.
- The program calls dbsqlexec(), instructing SQL Server to execute the statement previously passed with dbcmd().
- The program calls dbresults() to determine the success or failure of the statement.
- The program calls dbexit() to close down the connection to SQL Server.
It’s instructive to compare the programs in Figure 19-3 and Figure 17-17 to see the differences between the embedded SQL and the dblib approach:
- The embedded SQL program either implicitly connects to the only available database (as in DB2), or it includes an embedded SQL statement for connection (such as the CONNECT statement specified by the SQL2 standard). The dblib program connects to a particular SQL Server with the dbopen() call.
- The actual SQL UPDATE statement processed by the DBMS is identical in both programs. With embedded SQL, the statement is part of the program’s source code. With dblib, the statement is passed to the API as a sequence of one or more character strings. In fact, the dblib approach more closely resembles the dynamic SQL EXECUTE IMMEDIATE statement than static SQL.
- In the embedded SQL program, host variables provide the link between the SQL statements and the values of program variables. With dblib, the program passes variable values to the DBMS in the same way that it passes program text—as part of a SQL statement string.
- With embedded SQL, errors are returned in the SQLCODE or SQLSTATE field of the SQLCAstructure. With dblib, the dbresults() call retrieves the status of each SQL statement.
Overall, the embedded SQL program in Figure 17-17 is shorter and probably easier to read. However, the program is neither purely C nor purely SQL, and a programmer must be trained in the use of embedded SQL to understand it. The use of host variables means that the interactive and embedded forms of the SQL statement are different. In addition, the embedded SQL program must be processed both by the SQL precompiler and by the C compiler, lengthening the compilation cycle. In contrast, the SQL Server program is a plain vanilla C program, directly acceptable to the C compiler, and does not require special coding techniques.
1.1. Statement Batches
The program in Figure 19-3 sends a single SQL statement to SQL Server and checks its status. If an application program must execute several SQL statements, it can repeat the dbcmd() / dbsqlexec() / dbresults() cycle for each statement. Alternatively, the program can send several statements as a single statement batch to be executed by SQL Server.
Figure 19-4 shows a program that uses a batch of three SQL statements. As in Figure 19-3, the program calls dbcmd() to pass SQL text to dblib. The API simply concatenates the text from each call. Note that it’s the program’s responsibility to include any required spaces or punctuation in the passed text. SQL Server does not begin executing the statements until the program calls dbsqlexec(). In this example, three statements have been sent to SQL Server, so the program calls dbresults() three times in succession. Each call to dbresults() advances the API to the results of the next statement in the batch and tells the program whether the statement succeeded or failed.
In the program shown in Figure 19-4, the programmer knows in advance that three statements are in the batch, and the programmer can code three corresponding calls to dbresults(). If the number of statements in the batch is not known in advance, the program can call dbresults() repeatedly until it receives the error code NO_MORE_ RESULTS. The program excerpt in Figure 19-5 illustrates this technique.
1.2. Error Handling
The value returned by the dbresults() function tells the program whether the corresponding statement in the statement batch succeeded or failed. To get more detailed information about a failure, your program must provide its own message-handling function. The dblib software automatically calls the message-handling function when SQL Server encounters an error while executing SQL statements. Note that dblib calls the message-handling function during its processing of the dbsqlexec() or dbresults() function calls, before it returns to your program (i.e., it is a callback function, called back by the SQL Server software). This allows the message-handling function to do its own error processing.
Figure 19-6 shows an excerpt from a SQL Server program that includes a messagehandling function called msg_rtn(). When the program begins, it activates the message-handling function by calling msghandle(). Suppose an error occurs later, while SQL Server is processing the DELETE statement. When the program calls dbsqlexec() or dbresults() and dblib receives the error message from SQL Server, it upcalls the msg_rtn() routine in the program, passing it five parameters:
- dbproc. The connection on which the error occurred
- msgno. The SQL Server error number identifying the error
- msgstate. A parameter providing information about the error context
- severity. A number indicating the seriousness of the error
- msgtext. An error message corresponding to msgno
The msg_rtn() function in this program handles the message by printing it and saving the error number in a program variable for use later in the program. When the message-handling function returns to dblib (which called it), dblib completes its own processing and then returns to the program with a FAIL status. The program can detect this return value and perform further error processing, if appropriate.
The program excerpt in the figure actually presents a simplified view of SQL Server error handling. In addition to SQL statement errors detected by SQL Server, errors can also occur within the dblib API itself. For example, if the network connection to the SQL Server is lost, a dblib call may time out waiting for a response from SQL Server, resulting in an error. The API handles these errors by upcalling a separate error-handling function, which operates much like the message-handling function described here.
A comparison of Figure 19-6 with Figures 17-10 and 17-13 illustrates the differences in error-handling techniques between dblib and embedded SQL:
- In embedded SQL, the SQLCA structure is used to signal errors and warnings to the program. SQL Server communicates errors and warnings by upcalling special functions within the application program and returning a failure status for the API function that encountered the error.
- In embedded SQL, error processing is synchronous. The embedded SQL statement fails, control returns to the program, and the SQLCODE or SQLSTATE value is tested. SQL Server error processing is asynchronous. When an API call fails, SQL Server calls the application program’s error-handling or messagehandling function during the API call. It returns to the application program with an error status later.
- Embedded SQL has only a single type of error and a single mechanism for reporting it. The SQL Server scheme has two types of errors and two parallel mechanisms.
In summary, error handling in embedded SQL is simple and straightforward, but the application program can make only a limited number of responses when an error occurs. A SQL Server program has more flexibility in handling errors. However, the upcall scheme used by dblib is more sophisticated, and while it is familiar to systems programmers, it may be unfamiliar to application programmers.
2. SQL Server Queries
The SQL Server technique for handling programmatic queries is very similar to its technique for handling other SQL statements. To perform a query, a program sends a SELECT statement to SQL Server and uses dblib to retrieve the query results row by row. The program in Figure 19-7 illustrates the SQL Server query-processing technique.
- The program uses the dbcmd() and dbsqlexec() calls to pass a SELECT statement to SQL Server and request its execution.
- When the program calls dbresults() for the SELECT statement, dblib returns the completion status for the query and also makes the query results available for processing.
- The program calls dbbind() once for each column of query results, telling dblib where it should return the data for that particular column. The arguments to dbbind() indicate the column number, the buffer to receive its data, the size of the buffer, and the expected data type.
- The program loops, calling dbnextrow() repeatedly to obtain the rows of query results. The API places the returned data into the data areas indicated in the previous dbbind() calls.
- When no more rows of query results are available, the dbnextrow() call returns the value NO_MORE_ROWS. If more statements were in the statement batch following the SELECT statement, the program could call dbresults() to advance to the next statement.
Two of the dblib calls in Figure 19-7, dbbind() and dbnextrow(), support processing of the SQL Server query results. The dbbind() call sets up a one-to-one correspondence between each column of query results and the program variable that is to receive the retrieved data. This process is called binding the column. In the figure, the first column (NAME) is bound to a 16-byte character array and will be returned as a NULL-terminated string. The second and third columns, QUOTA and SALES, are both bound to floating point numbers. It is the programmer’s responsibility to make sure that the data type of each column of query results is compatible with the data type of the program variable to which it is bound.
Once again, it is useful to compare the SQL Server query processing in Figure 19-7 with the embedded SQL queries in Figure 17-20 and Figure 17-23:
- Embedded SQL has two different query-processing techniques—one for single-row queries (singleton SELECT) and one for multirow queries (cursors). The dblib API uses a single technique, regardless of the number of rows of query results.
- To specify the query, embedded SQL replaces the interactive SELECT statement with the singleton SELECT statement or the DECLARE CURSOR statement. With SQL Server, the SELECT statement sent by the program is identical to the interactive SELECT statement for the query.
- With embedded SQL, the host variables that receive the query results are named in the INTO clause of the singleton SELECT or the FETCH statement. With SQL Server, the variables to receive query results are specified in the dbbind() calls.
- With embedded SQL, row-by-row access to query results is provided by special-purpose embedded SQL statements (OPEN, FETCH, and CLOSE).
With SQL Server, access to query results is through dblib function calls (dbresults() and dbnextrow()), which keep the SQL language itself more streamlined.
Because of its relative simplicity and its similarity to the interactive SQL interface, many programmers find the SQL Server interface easier to use for query processing than the embedded SQL interface.
3. Retrieving null Values
The dbnextrow() and dbbind() calls shown in Figure 19-7 provide a simple way to retrieve query results, but they do not support NULL values. When a row retrieved by dbnextrow() includes a column with a NULL value, SQL Server replaces the NULL with a NULL substitution value. By default, SQL Server uses zero as a substitution value for numeric data types, a string of blanks for fixed-length strings, and an empty string for variable-length strings. The application program can change the default value for any data type by calling the API function dbsetnull().
In the program shown in Figure 19-7, if one of the offices had a NULL value in its QUOTA column, the dbnextrow() call for that office would retrieve a zero into the quota_value variable. Note that the program cannot tell from the retrieved data whether the QUOTA column for the row really has a zero value, or whether it is NULL.
In some applications, the use of substitution values is acceptable, but in others, it is important to be able to detect NULL values. These latter applications must use an alternative scheme for retrieving query results, described in the next section.
4. Retrieval Using Pointers
With the basic SQL Server data retrieval technique, the dbnextrow() call copies the data value for each column into one of your program’s variables. If there are many rows of query results or many long columns of text data, copying the data into your program’s data areas can create a significant overhead. In addition, the dbnextrow() call lacks a mechanism for returning NULL values to your program.
To solve these two problems, dblib offers an alternate method of retrieving query results. Figure 19-8 shows the program excerpt from Figure 19-7, rewritten to use this alternate method:
- The program sends the query to SQL Server and uses dbresults() to access the results, as it does for any SQL statement. However, the program does not call dbbind() to bind the columns of query results to program variables.
- The program calls dbnextrow() to advance, row by row, through the query results.
- For each column of each row, the program calls dbdata() to obtain a pointer to the data value for the column. The pointer points to a location within dblib’s internal buffers.
- If a column contains variable-length data, such as a VARCHAR data item, the program calls dbdatlen() to find out the length of the data item.
- If a column has a NULL value, the dbdata() function returns a null pointer (0), and dbdatlen() returns 0 as the length of the item. These return values give the program a way to detect and respond to NULL values in the query results.
The program in Figure 19-8 is more cumbersome than the one in Figure 19-7. In general, it’s easier to use the dbbind() function than the dbdata() approach, unless your program needs to handle NULL values or will be handling a large volume of query results.
5. Random Row Retrieval
A program normally processes SQL Server query results by moving through them sequentially using the dbnextrow() call. For browsing applications, dblib also provides limited random access to the rows of query results. Your program must explicitly enable random row access by turning on a dblib option. The dbgetrow() call can then be used to retrieve a row by its row number.
To support random row retrieval, dblib stores the rows of query results in an internal buffer. If the query results fit entirely within the dblib buffer, dbgetrow() supports random retrieval of any row. If the query results exceed the size of the buffer, only the initial rows of query results are stored. The program can randomly retrieve these rows, but a dbnextrow() call that attempts to retrieve a row past the end of the buffer returns the special BUF_FULL error condition. The program must then discard some of the saved rows from the buffer, using the dbclrbuf() call, to make room for the new row. Once the rows are discarded, they cannot be reretrieved with the dbgetrow() function. Thus, dblib supports random retrieval of query results within a limited window, dictated by the size of the row buffer, as shown in Figure 19-9. Your program can specify the size of the dblib row buffer by calling the dblib routine dbsetopt().
The random access provided by dbgetrow() is similar to the scroll cursors supported by several DBMS products and specified by the SQL2 standard. In both cases, random retrieval by row number is supported. However, a scroll cursor is a true pointer into the entire set of query results; it can range from the first to the last row, even if the query results contain thousands of rows. By contrast, the dbgetrow() function provides random access only within a limited window. This is adequate for limited browsing applications but cannot easily be extended to large queries.
6. Positioned Updates
In an embedded SQL program, a cursor provides a direct, intimate link between the program and the DBMS query processing. The program communicates with the DBMS row by row as it uses the FETCH statement to retrieve query results. If the query is a simple single-table query, the DBMS can maintain a direct correspondence between the current row of query results and the corresponding row within the database. Using this correspondence, the program can use the positioned update statements (UPDATE… WHERE CURRENT OF and DELETE. WHERE CURRENT OF) to modify or delete the current row of query results.
SQL Server query processing uses a much more detached, asynchronous connection between the program and the DBMS. In response to a statement batch containing one or more SELECT statements, SQL Server sends the query results back to the dblib software, which manages them. Row-by-row retrieval is handled by the dblib API calls, not by SQL language statements. As a result, early versions of SQL Server could not support positioned updates because its notion of a current row applied to query results within the dblib API, not to rows of the actual database tables.
Later versions of SQL Server (and Sybase) added complete support for standard SQL cursors, with their associated DECLARE/OPEN/FETCH/CLOSE SQL statements. Cursors actually operate within Transact-SQL stored procedures, and the action of the FETCH statement is to fetch data from the database into the stored procedure for processing—not to actually retrieve it into the application program that called the stored procedure. Stored procedures and their operation within various popular SQL DBMS products are discussed in Chapter 20.
7. Dynamic Queries
In the program examples thus far in this chapter, the queries to be performed were known in advance. The columns of query results could be bound to program variables by explicit dbbind() calls hard-coded in the program. Most programs that use SQL Server can be written using this technique. (This static column binding corresponds to the fixed list of host variables used in the static SQL FETCH statement in standard embedded SQL, described in Chapter 17.)
If the query to be carried out by a program is not known at the time the program is written, the program cannot include hard-coded dbbind() calls. Instead, the program must ask dblib for a description of each column of query results, using special API functions. The program can then bind the columns on the fly to data areas that it allocates at runtime. (This dynamic column binding corresponds to the use of the dynamic SQL DBNUMCOLS() statement and SQLDA, in dynamic embedded SQL, as described in Chapter 18.)
Figure 19-10 shows an interactive query program that illustrates the dblib technique for handling dynamic queries. The program accepts a table name entered by the user and then prompts the user to choose which columns are to be retrieved from the table. As the user selects the columns, the program constructs a SELECT statement and then uses these steps to execute the SELECT statement and display the data from the selected columns:
- The program passes the generated SELECT statement to SQL Server using the dbcmd() call, requests its execution with the dbsqlexec() call, and calls dbresults() to advance the API to the query results, as it does for all queries.
- The program calls dbnumcols() to find out how many columns of query results were produced by the SELECT statement.
- For each column, the program calls dbcolname() to find out the name of the column, calls dbcoltype() to find out its data type, and calls dbcollen() to find out its maximum length.
- The program allocates a buffer to receive each column of query results and calls dbbind() to bind each column to its buffer.
- When all columns have been bound, the program calls dbnextrow() repeatedly to retrieve each row of query results.
The dblib-based program in Figure 19-10 performs exactly the same function as the dynamic embedded SQL program in Figure 18-9. It’s instructive to compare the two programs and the techniques they use:
- For both embedded SQL and dblib, the program builds a SELECT statement in its buffers and submits it to the DBMS for processing. With dynamic SQL, the special PREPARE statement handles this task; with the SQL Server API, the standard dbcmd() and dbsqlexec() functions are used.
- For both interfaces, the program must request a description of the columns of query results from the DBMS. With dynamic SQL, the special DBNUMCOLS() statement handles this task, and the description is returned in a SQLDA data structure. With dblib, the description is obtained by calling API functions. Note that the program in Figure 19-10 maintains its own arrays to keep track of the column information.
- For both interfaces, the program must allocate buffers to receive the query results and must bind individual columns to those buffer locations. With dynamic SQL, the program binds columns by placing the buffer addresses into the SQLVAR structures in the SQLDA. With SQL Server, the program uses the dbbind() function to bind the columns.
- For both interfaces, the query results are returned into the program’s buffers, row by row. With dynamic SQL, the program retrieves a row of query results using a special version of the FETCH statement that specifies the SQLDA. With SQL Server, the program calls dbnextrow() to retrieve a row.
Overall, the strategy used to handle dynamic queries is very similar for both interfaces. The dynamic SQL technique uses special statements and data structures that are unique to dynamic SQL; they are quite different from the techniques used for static SQL queries. In contrast, the dblib techniques for dynamic queries are basically the same as those used for all other queries. The only added features are the dblib functions that return information about the columns of query results. This tends to make the callable API approach easier to understand for the less-experienced SQL programmer.
Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.