SQL Data Integrity: Advanced Constraint Capabilities (SQL2)

Primary key and foreign key constraints, uniqueness constraints, and restrictions on missing (NULL) values all provide data integrity checking for very specific structures and situations within a database. The SQL2 standard goes beyond these capabilities to include a much more general capability for specifying and enforcing data integrity constraints. The complete scheme includes four types of constraints:

  • Column constraints. Specified as part of a column definition when a table is created. Conceptually, they restrict the legal values that may appear in the column. Column constraints appear in the individual column definitions within the CREATE TABLE
  •  Domains. A specialized form of column constraints. They provide a limited capability to define new data types within a database. In effect, a domain is one of the predefined database data types plus some additional constraints, which are specified as part of the domain definition. Once a domain is defined and named, the domain name can be used in place of a data type to define new columns. The columns inherit the constraints of the domain. Domains are defined outside of the table and column definitions of the database, using the CREATE DOMAIN statement.
  • Table constraints. Specified as part of the table definition when a table is created. Conceptually, they restrict the legal values that may appear in rows of the table. Table constraints are specified in the CREATE TABLE statement that defines a table. Usually, they appear as a group after the column definitions, but the SQL2 standard allows them to be interspersed with the column definitions.
  •  Assertions. The most general type of SQL2 constraint. Like domains, they are specified outside of the table and column structure of the database. Conceptually, an assertion specifies a relationship among data values that crosses multiple tables within the database.

Each of the four different types of constraints has its own conceptual purpose, and each appears in a different part of the SQL2 statement syntax. However, the distinctions between them are somewhat arbitrary. Any column constraint that appears for an individual column definition can just as easily be specified as a table constraint. Similarly, any table constraint can be specified as an assertion. In practice, it’s probably best to specify each database constraint where it seems to most naturally fit, given the real-world situation that the database is trying to model. Constraints that apply globally to the entire situation (business processes, interrelationships among customers and products, and so on) should appear as assertions. Constraints that apply to a specific type of entity (a customer or an order) should appear as table constraints or column constraints within the appropriate table that describes that type of entity. When the same constraint applies to many different columns in the database that all refer to the same type of entity, then a domain is appropriate.

1. Assertions

Examples of the first three types of constraints have previously appeared in earlier sections of this chapter. An assertion is specified using the SQL2 CREATE ASSERTION statement. Here is an assertion that might be useful in the demo database:

Ensure that an office’s quota target does not exceed the sum of the quotas for its salespeople.

 CREATE ASSERTION quota_valid

CHECK ((OFFICES.QUOTA <= SUM(SALESREPS.QUOTA)) AND

   (SALESREPS.REP_OFFICE = OFFICES.OFFICE))

Because it is an object in the database (like a table or a column), the assertion must be given a name (in this case, it’s quota_valid). The name is used in error messages produced by the DBMS when the assertion is violated. The assertion causing an error may be obvious in a small demo database, but in a large database that might contain dozens or hundreds of assertions, it’s critical to know which of the assertions was violated.

Here is another example of an assertion that might be useful in the sample database:

Ensure that the total of the orders for any customer does not exceed their credit limit:

 CREATE ASSERTION credit_orders

CHECK (CUSTOMER.CREDIT_LIMIT <=

  SELECT SUM(ORDERS.AMOUNT)

FROM ORDERS

  WHERE ORDERS.CUST = CUSTOMER.CUST_NUM)

As these examples show, a SQL2 assertion is defined by a search condition, which is enclosed in parentheses and follows the keyword CHECK. Every time an attempt is made to modify the contents of the database, through an INSERT or UPDATE or DELETE statement, the search condition is checked against the (proposed) modified database contents. If the search condition remains TRUE, the modification is allowed. If the search condition would become untrue, the DBMS does not carry out the proposed modification, and an error code is returned, indicating an assertion violation.

In theory, assertions could cause a very large amount of database processing overhead as they are checked for each statement that might modify the database. In practice, the DBMS will analyze the assertion and determine which tables and columns it involves. Only changes that involve those particular tables or columns will actually trigger the search condition. Nonetheless, assertions should be defined with great care to ensure that they impose a reasonable amount of overhead for the benefit they provide.

2. SQL2 Constraint Types

The types of constraints that can be specified in SQL2, and the role played by each, can be summarized as follows:

  • NOT NULL constraint. The NOT NULL constraint can appear only as a column constraint. It prevents the column from being assigned a NULL
  • PRIMARY KEY constraint. A PRIMARY KEY constraint can appear as a column constraint or a table constraint. If the primary key consists of a single column, the column constraint may be more convenient. If it consists of multiple columns, it should be specified as a table constraint.
  • UNIQUE constraint. A UNIQUE constraint can appear as a column constraint or a table constraint. If the unique values restriction is being enforced only for a single column, the column constraint is the easiest way to specify it. If the unique values restriction applies to a set of two or more columns (that is, the combination of values for those columns must be unique for all rows in the table), then the table constraint form should be used.
  • Referential (FOREIGN KEY) constraint. A referential (FOREIGN KEY) constraint can appear as a column constraint or a table constraint. If the foreign key consists of a single column, the column constraint may be more convenient. If it consists of multiple columns, it should be specified as a table constraint. If a table has many foreign key relationships to other tables, it may be most convenient to gather all of its foreign key constraints together at one place in the table definition, rather than having them scattered throughout the column definitions.
  • CHECK constraint. A CHECK constraint can appear as a column constraint or a table constraint. It is also the only kind of constraint that forms part of the definition of a domain or an assertion. The check constraint is specified as a search condition, like the search condition that appears in the WHERE clause of a database query. The constraint is satisfied if the search condition has a TRUE

Each individual constraint within a database (no matter what its type) may be assigned a constraint name to uniquely identify it from the other constraints. It’s probably not necessary to assign constraint names in a simple database where each constraint is clearly associated with a single table, column, or domain, and where there is little potential for confusion. In a more complex database involving multiple constraints on a single table or column, it can be very useful to be able to identify the individual constraints by name (especially when errors start to occur!). Note that the check constraint in an assertion must have a constraint name; this name effectively becomes the name of the assertion containing the constraint.

3. Deferred Constraint Checking

In their simplest form, the various constraints that are specified within a database are checked every time an attempt is made to change the database contents—that is, during the execution of every attempted INSERT, UPDATE, or DELETE statement. For database systems claiming only Intermediate-Level or Entry-Level conformance to the SQL2 standard, this is the only mode of operation allowed for database constraints. The Full-Level SQL2 standard specifies an additional capability for deferred constraint checking.

When constraint checking is deferred, the constraints are not checked for each individual SQL statement. Instead, constraint checking is held in abeyance until the end of a SQL transaction. (Transaction processing and the associated SQL statements are described in detail in Chapter 12.) When the completion of the transaction is signaled by the SQL COMMIT statement, the DBMS checks the deferred constraints. If all of the constraints are satisfied, then the COMMIT statement can proceed, and the transaction can complete normally. At this point, any changes made to the database during the transaction become permanent. If, however, one or more of the constraints would be violated by the proposed transaction, then the COMMIT statement fails, and the transaction is rolled back—that is, all of the proposed changes to the database are reversed, and the database goes back to its state before the transaction began.

Deferred constraint checking can be very important when several updates to a database must all be made at once to keep the database in a consistent state. For example, suppose the demo database contained this assertion:

Ensure that an office’s quota target is exactly equal to the sum of the quotas for its salespeople.

 CREATE ASSERTION quota_totals

CHECK ((OFFICES.QUOTA = SUM(SALESREPS.QUOTA)) AND

   (SALESREPS.REP_OFFICE = OFFICES.OFFICE))

Without the deferred constraint checking, this constraint would effectively prevent you from ever adding a salesperson to the database. Why? Because to keep the office quota and the salespersons’ quotas in the right relationship, you must both add a new salesperson row with the appropriate quota (using an INSERT statement) and increase the quota for the appropriate office by the same amount (using an UPDATE statement). If you try to perform the INSERT statement on the SALESREPS table first, the OFFICES table will not yet have been updated, the assertion will not be TRUE, and the statement will fail.

Similarly, if you try to perform the UPDATE statement on the OFFICES table first, the SALESREPS table will not yet have been updated, the assertion will not be TRUE, and the statement will fail. The only solution to this dilemma is to defer constraint checking until both statements have completed, and then check to make sure that both operations, taken together, have left the database in a valid state.

The SQL2 deferred constraint mechanism provides for this capability, and much more. Each individual constraint (of all types) within the database can be identified as either DEFERRABLE or NOT DEFERRABLE when it is first created or defined:

  • DEFERRABLE constraint. A DEFERRABLE constraint is one whose checking can be deferred to the end of a transaction. The assertion in the previous example is one that should be deferrable. When updating quotas or adding new salespeople to the database, you certainly want to be able to defer constraint checking, as the example showed.
  • NOT DEFERRABLE constraint. A NOT DEFERRABLE constraint is one whose checking cannot be deferred. A primary key constraint, a uniqueness constraint, and many column check constraints would usually fall into this category. These data integrity checks typically don’t depend on other database interactions. They can and should be checked after every SQL statement that tries to modify the database.

Because it provides the most stringent integrity checking, NOT DEFERRABLE is the default. You must explicitly declare a constraint to be DEFERRABLE if you want to defer its operation. Note also that these constraint attributes define only the deferability of a constraint—that is, whether or not its operation can be deferred. The constraint definition may also specify the initial state of the constraint:

  • INITIALLY IMMEDIATE constraint. An INITIALLY IMMEDIATE constraint is one that starts out as an immediate constraint; that is, it will be checked immediately for each SQL statement.
  • INITIALLY DEFERRED constraint. An INITIALLY DEFERRED constraint is one that starts out as a deferred constraint; that is, its checking will be deferred until the end of a transaction. Of course, this option cannot be specified if the constraint is defined as NOT DEFERRABLE.

The constraint is put into the specified initial state when it is first created. It is also reset into this initial state at the beginning of each transaction. Because it provides the most stringent integrity checking, INITIALLY IMMEDIATE is the default. You must explicitly declare a constraint to be INITIALLY DEFERRED if you want it to automatically start out each transaction in a deferred state.

SQL2 adds one more mechanism to control the immediate or deferred processing of constraints. You can dynamically change the processing of a constraint during database operation using the SET CONSTRAINTS statement. For example, suppose the sample database contains this assertion:

CREATE ASSERTION quota_totals

CHECK ((OFFICES.QUOTA = SUM(SALESREPS.QUOTA)) AND

  (SALESREPS.REP_OFFICE = OFFICES.OFFICE))

  DEFERRABLE INITIALLY IMMEDIATE

The initially immediate checking causes the constraint to be processed, statement by statement, for all normal database processing. For the special transaction that adds a new salesperson to the database, however, you will need to temporarily defer constraint processing. This sequence of statements accomplishes the goal:

SET CONSTRAINTS quota_totals DEFERRED

INSERT INTO SALESREPS (EMPL_NUM, NAME, REP_OFFICE, HIRE_DATE, QUOTA, SALES)

VALUES (:num, :name, :office_num, :date, :amount, 0)

UPDATE OFFICES SET TARGET = TARGET + :amount

WHERE (OFFICE = :office_num)

COMMIT

After the COMMIT statement ends the transaction, the quota_totals constraint is reset back into IMMEDIATE mode because of the INITIALLY IMMEDIATE specification. If there were more work to be done after the UPDATE statement before the end of the transaction, you could manually set the constraint back into IMMEDIATE mode using this statement:

SET CONSTRAINTS quota_totals IMMEDIATE

You can set the same mode for several different constraints by including the constraint names in a comma-separated list:

SET CONSTRAINTS quota_totals, rep_totals IMMEDIATE

Finally, you can set the processing mode for all constraints with a single statement:

SET CONSTRAINTS ALL DEFERRED

The SQL2 capabilities for deferred constraint checking form a very comprehensive facility for managing the integrity of a database. As with many SQL2 capabilities, individual pieces of the SQL2 capability were taken from existing SQL implementations, and individual pieces have found their way into other implementations since the publication of the standard. IBM’s DB2, for example, includes deferred constraint checking capability and supports SQL2-style deferability options. Its SET CONSTRAINTS statement, however, differs from the SQL2 standard. It operates on individual tables in the database, turning on and off the deferral of constraint checking associated with the table contents.

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.

Leave a Reply

Your email address will not be published. Required fields are marked *