Simple Embedded SQL Statements

The simplest SQL statements to embed in a program are those that are self-contained and do not produce any query results. For example, consider this interactive SQL statement:

Delete all salespeople with sales under $150,000.

 DELETE FROM SALESREPS

   WHERE SALES < 150000.00

Figures 17-4, 17-5, and 17-6 show three programs that perform the same task as this interactive SQL statement, using embedded SQL. The program in Figure 17-4 is written in C, the program in Figure 17-5 is written in COBOL, and the program in Figure 17-6 is written in FORTRAN. Although the programs are extremely simple, they illustrate the most basic features of embedded SQL:

  • The embedded SQL statements appear in the midst of host programming language statements. It usually doesn’t matter whether the SQL statement is written in uppercase or lowercase; the most common practice is to follow the style of the host language.
  • Every embedded SQL statement begins with an introducer that flags it as a SQL statement. The IBM SQL products use the introducer EXEC SQL for most host languages, and the ANSI/ISO SQL2 standard specifies it as well. Some embedded SQL products still support other introducers, for backward compatibility with their earlier versions.
  • If an embedded SQL statement extends over multiple lines, the host language strategy for statement continuation is used. For COBOL, PL/I, and C programs, no special continuation character is required. For FORTRAN programs, the second and subsequent lines of the statement must have a continuation character in column 6.
  • Every embedded SQL statement ends with a terminator that signals the end of the SQL statement. The terminator varies with the style of the host language. In COBOL, the terminator is the string END-EXEC., which ends in a period like other COBOL statements. For PL/I and C, the terminator is a semicolon (;), which is also the statement termination character in those languages. In FORTRAN, the embedded SQL statement ends when no more continuation lines are indicated.

The embedding technique shown in the three figures works for any SQL statement that (a) does not depend on the values of host language variables for its execution and (b) does not retrieve data from the database. For example, the C program in Figure 17-7 creates a new REGIONS table and inserts two rows into it, using exactly the same embedded SQL features as the program in Figure 17-4. For consistency, all of the remaining program examples in the book will use the C programming language, except when a particular host language feature is being illustrated.

1. Declaring Tables

In the IBM SQL products, the embedded DECLARE TABLE statement, shown in Figure 17-8, declares a table that will be referenced by one or more embedded SQL statements in your program. This is an optional statement that aids the precompiler in its task of parsing and validating the embedded SQL statements. By using the DECLARE TABLE statement, your program explicitly specifies its assumptions about the columns in the table and their data types and sizes. The precompiler checks the table and column references in your program to make sure they conform to your table declaration.

The programs in Figures 17-4, 17-5, and 17-6 all use the DECLARE TABLE statement. It’s important to note that the statement appears purely for documentation purposes and for the use of the precompiler. It is not an executable statement, and you do not need to explicitly declare tables before referring to them in embedded DML or DDL statements. However, using the DECLARE TABLE statement does make your program more self-documenting and simpler to maintain. The IBM-developed SQL products all support the DECLARE TABLE statement, but most other SQL products do not support it, and their precompilers will generate an error message if you use it.

2. Error Handling

When you type an interactive SQL statement that causes an error, the interactive SQL program displays an error message, aborts the statement, and prompts you to type a new statement. In embedded SQL, error handling becomes the responsibility of the application program. Actually, embedded SQL statements can produce two distinct types of errors:

  • Compile-time errors. Misplaced commas, misspelled SQL keywords, and similar errors in embedded SQL statements are detected by the SQL precompiler and reported to the programmer. The programmer can fix the errors and recompile the application program.
  • Runtime errors. An attempt to insert an invalid data value or lack of permission to update a table can be detected only at runtime. Errors such as these must be detected and handled by the application program.

In embedded SQL programs, the DBMS reports runtime errors to the application program through a returned error code. If an error is detected, a further description of the error and other information about the statement just executed is available through additional diagnostic information. The earliest IBM-embedded SQL implementations defined an error-reporting mechanism that was adopted, with variations, by most of the major DBMS vendors. The central part of this scheme—an error status variable named SQLCODE—was also defined in the original ANSI/ISO SQL standard. The SQL2 standard, published in 1992, defined an entirely new, parallel error-reporting mechanism, built around an error status variable named SQLSTATE. These mechanisms are described in the next two sections.

3. Error Handling with sqlcode

Under this scheme, pioneered by the earliest IBM products, the DBMS communicates status information to the embedded SQL program through an area of program storage called the SQL Communications Area, or SQLCA. The SQLCA is a data structure that contains error variables and status indicators. By examining the SQLCA, the application program can determine the success or failure of its embedded SQL statements and act accordingly.

Notice in Figures 17-4, 17-5, 17-6, and 17-7 that the first embedded SQL statement in the program is INCLUDE SQLCA. This statement tells the SQL precompiler to include a SQL Communications Area in this program. The specific contents of the SQLCA vary slightly from one brand of DBMS to another, but the SQLCA always provides the same type of information. Figure 17-9 shows the definition of the SQLCA used by the IBM databases. The most important part of the SQLCA, the SQLCODE variable, is supported by all of the major embedded SQL products and was specified by the ANSI/ISO SQL1 standard.

As the DBMS executes each embedded SQL statement, it sets the value of the variable SQLCODE in the SQLCA to indicate the completion status of the statement:

  • A SQLCODE of zero indicates successful completion of the statement, without any errors or warnings.
  • A negative SQLCODE value indicates a serious error that prevented the statement from executing correctly. For example, an attempt to update a read-only view would produce a negative SQLCODE value. A separate negative value is assigned to each runtime error that can occur.
  • A positive SQLCODE value indicates a warning condition. For example, truncation or rounding of a data item retrieved by the program would produce a warning. A separate positive value is assigned to each runtime warning that can occur. The most common warning, with a value of +100 in most implementations and in the SQL1 standard, is the out-of-data warning returned when a program tries to retrieve the next row of query results and no more rows are left to retrieve.

Because every executable embedded SQL statement can potentially generate an error, a well-written program will check the SQLCODE value after every executable embedded SQL statement. Figure 17-10 shows a C program excerpt that checks the SQLCODE value. Figure 17-11 shows a similar excerpt from a COBOL program.

4. Error Handling with sqlstate

By the time the SQL2 standard was being written, virtually all commercial SQL products were using the SQLCODE variable to report error conditions in an embedded SQL program. However, there was no standardization of the error numbers used by the different products to report the same or similar error conditions. Further, because of the significant differences among SQL implementations permitted by the SQL1 standard, considerable differences in the errors could occur from one implementation to another. Finally, the definition of the SQLCA varied in significant ways from one DBMS brand to another, and all of the major brands had a large installed base of applications that would be broken by any change to their SQLCA structure.

Instead of tackling the impossible task of getting all of the DBMS vendors to agree to change their SQLCODE values to some standard, the writers of the SQL2 standard took a different approach. They included the SQLCODE error value, but identified it as a deprecated feature, meaning that it was considered obsolete and would be removed from the standard at some future time. To take its place, they introduced a new error variable, called SQLSTATE. The standard also specifies, in detail, the error conditions that can be reported through the SQLSTATE variable, and the error code assigned to each error. To conform to the SQL2 standard, a SQL product must report errors using both the SQLCODE and SQLSTATE error variables. In this way, existing programs that use SQLCODE will still function, but new programs can be written to use the standardized SQLSTATE error codes.

The SQLSTATE variable consists of two parts:

  • A two-character error class that identifies the general classification of the error (such as a connection error, an invalid data error, or a warning).
  • A three-character error subclass that identifies a specific type of error within a general error class. For example, within the invalid data class, the error subclass might identify a divide by zero error, an invalid numeric value error, or invalid datetime data.

Errors specified in the SQL2 standard have an error class code that begins with a digit from zero to four (inclusive) or a letter between A and H (inclusive). For example, data errors are indicated by error class 22. A violation of an integrity constraint (such as a foreign key definition) is indicated by error class 23. A transaction rollback is indicated by error class 40. Within each error class, the standard subclass codes also follow the same initial number/letter restrictions. For example, within error class 40 (transaction rollback), the subclass codes are 001 for serialization failure (that is, your program was chosen as the loser in a deadlock), 002 for an integrity constraint violation, and 003 for errors where the completion status of the SQL statement is unknown (for example, when a network connection breaks or a server crashes before the statement completes). Figure 17-12 shows the same C program as Figure 17-10, but uses the SQLSTATE variable for error checking instead of SQLCODE.

The standard specifically reserves error class codes that begin with digits from five to nine (inclusive) and letters between I and Z (inclusive) as implementation-specific errors that are not standardized. While this allows differences among DBMS brands to continue, all of the most common errors caused by SQL statements are included in the standardized error class codes. As commercial DBMS implementations move to support the SQLSTATE variable, one of the most troublesome incompatibilities between different SQL products is gradually being eliminated.

The SQL2 standard provides additional error and diagnostics information through a new GET DIAGNOSTICS statement, shown in Figure 17-13. The statement allows an embedded SQL program to retrieve one or more items of information about the SQL statement just executed, or about an error condition that was just raised. Support for the GET DIAGNOSTICS statement is required for Intermediate SQL or Full SQL conformance to the standard but is not required or allowed in Entry SQL. Figure 17-14 shows a C program excerpt like that in Figure 17-12, extended to include the GET DIAGNOSTICS statement.

5. The whenever Statement

It quickly becomes tedious for a programmer to write programs that explicitly check the SQLCODE value after each embedded SQL statement. To simplify error handling, embedded SQL supports the WHENEVER statement, shown in Figure 17-15. The WHENEVER statement is a directive to the SQL precompiler, not an executable statement. It tells the precompiler to automatically generate error-handling code following every executable embedded SQL statement and specifies what the generated code should do.

You can use the WHENEVER statement to tell the precompiler how to handle three different exception conditions:

  • WHENEVER SQLERROR tells the precompiler to generate code to handle errors (negative SQLCODEs).
  • WHENEVER SQLWARNING tells the precompiler to generate code to handle warnings (positive SQLCODEs).
  • WHENEVER NOT FOUND tells the precompiler to generate code that handles a particular warning—the warning generated by the DBMS when your program tries to retrieve query results when no more are remaining. This use of the WHENEVER statement is specific to the singleton SELECT and the FETCH statements, and is described in the section “Single-Row Queries.”

Note that the SQL2 standard does not specify the SQLWARNING form of the WHENEVER statement, but most commercial SQL products support it.

For any of these three conditions, you can tell the precompiler to generate code that takes one of two actions:

  • WHENEVER/GOTO tells the precompiler to generate a branch to the specified label, which must be a statement label or statement number in the program.
  • WHENEVER/CONTINUE tells the precompiler to let the program’s flow of control proceed to the next host language statement.

The WHENEVER statement is a directive to the precompiler, and its effect can be superseded by another WHENEVER statement appearing later in the program text.

Figure 17-16 shows a program excerpt with three WHENEVER statements and four executable SQL statements. In this program, an error in either of the two DELETE statements results in a branch to errorl because of the first WHENEVER statement. An error in the embedded UPDATE statement flows directly into the following statements of the program. An error in the embedded INSERT statement results in a branch to error2. As this example shows, the main use of the WHENEVER/CONTINUE form of the statement is to cancel the effect of a previous WHENEVER statement.

The WHENEVER statement makes embedded SQL error handling much simpler, and it is more common for an application program to use it than to check SQLCODE or SQLSTATE directly. Remember, however, that after a WHENEVER/GOTO statement appears, the precompiler will generate a test and a branch to the specified label for every embedded SQL statement that follows it. You must arrange your program so that the specified label is a valid target for branching from these embedded SQL statements, or use another WHENEVER statement to specify a different destination or cancel the effects of the WHENEVER/GOTO.

6. Using Host Variables

The embedded SQL programs in the previous figures don’t provide any real interaction between the programming statements and the embedded SQL statements. In most applications, you will want to use the value of one or more program variables in the embedded SQL statements. For example, suppose you wanted to write a program to adjust all sales quotas up or down by some dollar amount. The program should prompt the user for the amount and then use an embedded UPDATE statement to change the QUOTA column in the SALESREPS table.

Embedded SQL supports this capability through the use of host variables. A host variable is a program variable declared in the host language (for example, a COBOL or C variable) that is referenced in an embedded SQL statement. To identify the host variable, the variable name is prefixed by a colon (:) when it appears in an embedded SQL statement. The colon allows the precompiler to distinguish easily between host variables and database objects (such as tables or columns) that may have the same name.

Figure 17-17 shows a C program that implements the quota adjustment application using a host variable. The program prompts the user for the adjustment amount and stores the entered value in the variable named amount. This host variable is referenced in the embedded UPDATE statement. Conceptually, when the UPDATE statement is executed, the value of the amount variable is obtained, and that value is substituted for the host variable in the SQL statement. For example, if you enter the amount 500 in response to the prompt, the DBMS effectively executes this UPDATE statement:

 exec sql update salesreps

set quota = quota + 500;

A host variable can appear in an embedded SQL statement wherever a constant can appear. In particular, a host variable can be used in an assignment expression:

exec sql update salesreps

    set quota = quota + :amount;

A host variable can appear in a search condition:

 exec sql delete from salesreps

where quota < :amount;

A host variable can also be used in the VALUES clause of an INSERT statement:

 exec sql insert into salesreps (empl_num, name, quota)

values (116, ‘Bill Roberts’, :amount);

In each case, note that the host variable is part of the program’s input to the DBMS; it forms part of the SQL statement submitted to the DBMS for execution. Later, in the section “Data Retrieval in Embedded SQL,” you will see how host variables are also used to receive output from the DBMS; they receive query results returned from the DBMS to the program.

Note that a host variable cannot be used instead of a SQL identifier. This attempted use of the host variable colname is illegal:

char *colname = “quota”;

exec sql insert into salesreps (empl_num, name, :colname)

values (116, ‘Bill Roberts’, 0.00);

7. Declaring Host Variables

When you use a host variable in an embedded SQL statement, you must declare the variable using the normal method for declaring variables in the host programming language. For example, in Figure 17-17, the host variable amount is declared using the normal C language syntax (float amount;). When the precompiler processes the source code for the program, it notes the name of each variable it encounters, along with its data type and size. The precompiler uses this information to generate correct code later when it encounters a use of the variable as a host variable in a SQL statement.

The two embedded SQL statements BEGIN DECLARE SECTION and END DECLARE SECTION bracket the host variable declarations, as shown in Figure 17-17. These two statements are unique to embedded SQL, and they are not executable. They are directives to the precompiler, telling it when it must pay attention to variable declarations and when it can ignore them.

In a simple embedded SQL program, it may be possible to gather together all of the host variable declarations in one declare section. Usually, however, the host variables must be declared at various points within the program, especially in block-structured
languages such as C, Pascal, and PL/I. In this case, each declaration of host variables must be bracketed with a BEGIN DECLARE SECTION/END DECLARE SECTION statement pair.

The BEGIN DECLARE SECTION and END DECLARE SECTION statements are relatively new to the embedded SQL language. They are specified in the ANSI/ISO SQL standard, and DB2 requires them in its newer embedded SQL implementations. However, DB2 and many other DBMS brands did not historically require declare sections, and some SQL precompilers do not yet support the BEGIN DECLARE SECTION and END DECLARE SECTION statements. In this case, the precompiler scans and processes all variable declarations in the host program.

When you use a host variable, the precompiler may limit your flexibility in declaring the variable in the host programming language. For example, consider the following C language source code:

#define BIGBUFSIZE 256

.

.

.

exec sql begin declare section;

char bigbuffer[BIGBUFSIZE+1];

exec sql end declare section;

This is a valid C declaration of the variable bigbuffer. However, if you try to use bigbuffer as a host variable in an embedded SQL statement like this:

exec sql update salesreps

set quota = 300000

where name = :bigbuffer;

many precompilers will generate an error message, complaining about an illegal declaration of bigbuffer. The problem is that some precompilers don’t recognize symbolic constants like BIGBUFSIZE. This is just one example of the special considerations that apply when using embedded SQL and a precompiler. Fortunately, the precompilers offered by the major DBMS vendors are being improved steadily, and the number of special case problems like this one is decreasing.

8. Host Variables and Data Types

The data types supported by a SQL-based DBMS and the data types supported by a programming language such as C or FORTRAN are often quite different. These differences impact host variables because they play a dual role. On the one hand, a host variable is a program variable, declared using the data types of the programming language and manipulated by programming language statements. On the other hand, a host variable is used in embedded SQL statements to contain database data.

Consider the four embedded UPDATE statements in Figure 17-18. In the first UPDATE statement, the MANAGER column has an INTEGER data type, so hostvarl should be declared as a C integer variable. In the second statement, the NAME column has a VARCHAR data type, so hostvar2 should contain string data. The program should declare hostvar2 as an array of C character data, and most DBMS products will expect the data in the array to be terminated by a null character (0). In the third UPDATE statement, the QUOTA column has a MONEY data type. There is no corresponding data type in C, and C does not support a packed decimal data type. For most DBMS brands, you can declare hostvar3 as a C floating point variable, and the DBMS will automatically translate the floating point value into the DBMS MONEY format. Finally, in the fourth UPDATE statement, the HIRE_DATE column has a DATE data type in the database. For most DBMS brands, you should declare hostvar4 as an array of C character data and fill the array with a text form of the date acceptable to the DBMS.

As Figure 17-18 shows, the data types of host variables must be chosen carefully to match their intended use in embedded SQL statements. Table 17-1 shows the SQL data types specified in the ANSI/ISO SQL2 standard and the corresponding data types used in four of the most popular embedded SQL programming languages, as specified in the standard. The standard specifies data type correspondences and embedded SQL rules for the Ada, C, COBOL, FORTRAN, MUMPS, Pascal, and PL/I languages.

Note, however, that in many cases, there is not a one-to-one correspondence between data types. In addition, each brand of DBMS has its own data type idiosyncrasies and its own rules for data type conversion when using host variables. Before counting on a specific data conversion behavior, consult the documentation for your particular DBMS brand and carefully read the description for the particular programming language you are using.

9. Host Variables and null Values

Most programming languages do not provide SQL-style support for unknown or missing values. A variable in COBOL, C, or FORTRAN, for example, always has a value. There is no concept of the value being NULL or missing. This causes a problem when you want to store NULL values in the database or retrieve NULL values from the database using programmatic SQL. Embedded SQL solves this problem by allowing each host variable to have a companion host indicator variable. In an embedded SQL statement, the host variable and the indicator variable together specify a single SQL-style value, as follows:

  • An indicator value of zero means that the host variable contains a valid value and that this value is to be used.
  • A negative indicator value means that the host variable should be assumed to have a NULL value; the actual value of the host variable is irrelevant and should be disregarded.

  • A positive indicator value means that the host variable contains a valid value, which may have been rounded off or truncated. This situation occurs only when data is retrieved from the database, and is described later in the section “Retrieving NULL Values.”

When you specify a host variable in an embedded SQL statement, you can follow it immediately with the name of the corresponding indicator variable. Both variable names are preceded by a colon. Here is an embedded UPDATE statement that uses the host variable amount with the companion indicator variable amount_ind:

exec sql update salesreps

  set quota = :amount :amount_ind, sales = :amount2

where quota < 20000.00;

If amount_ind has a nonnegative value when the UPDATE statement is executed, the DBMS treats the statement as if it read:

exec sql update salesreps

   set quota = :amount, sales = :amount2

 where quota < 20000.00;

If amount_ind has a negative value when the UPDATE statement is executed, the DBMS treats the statement as if it read:

exec sql update salesreps

   set quota = NULL, sales = :amount2

 where quota < 20000.00;

A host variable/indicator variable pair can appear in the assignment clause of an embedded UPDATE statement (as shown here) or in the values clause of an embedded INSERT statement. You cannot use an indicator variable in a search condition, so this embedded SQL statement is illegal:

 exec sql delete from salesreps

where quota = :amount :amount_ind;

This prohibition exists for the same reason that the NULL keyword is not allowed in the search condition—it makes no sense to test whether QUOTA and NULL are equal, because the answer will always be NULL (unknown). Instead of using the indicator variable, you must use an explicit IS NULL test. This pair of embedded SQL statements accomplishes the intended task of the preceding illegal statement:

if (amount_ind < 0) {

exec sql delete from salesreps

where quota is null;

}

else {

exec sql delete from salesreps

where quota = :amount;

}

Indicator variables are especially useful when you are retrieving data from the database into your program and the retrieved data values may be NULL. This use of indicator variables is described later in the section “Retrieving NULL Values.”

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.

Leave a Reply

Your email address will not be published. Required fields are marked *