Codd’s 12 Rules

In his 1985 Computerworld article, Ted Codd presented 12 rules that a database must obey if it is to be considered truly relational. Codd’s 12 rules, shown in the following list, have since become a semiofficial definition of a relational database. The rules come out of Codd’s theoretical work on the relational model and actually represent more of an ideal goal than a definition of a relational database.

  1. Information rule. All information in a relational database is represented explicitly at the logical level and in exactly one way—by values in tables.
  2. Guaranteed access rule. Each and every datum (atomic value) in a relational database is guaranteed to be logically accessible by resorting to a combination of table name, primary key value, and column name.
  3. Systematic treatment of NULL values. NULL values (distinct from an empty character string or a string of blank characters and distinct from zero or any other number) are supported in a fully relational DBMS for representing missing information and inapplicable information in a systematic way, independent of the data type.
  4. Dynamic online catalog based on the relational model. The database description is represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to the regular data.
  5. Comprehensive data sublanguage rule. A relational system may support several languages and various modes of terminal use (for example, the fill-in-the-blanks mode). However, there must be at least one language whose statements are expressible, per some well-defined syntax, as character strings, and that is comprehensive in supporting all of the following items:
    • Data definition
    • View definition
    • Data manipulation (interactive and by program)
    • Integrity constraints
    • Authorization
    • Transaction boundaries (begin, commit, and rollback)
  6. View updating rule. All views that are theoretically updateable are also updateable by the system.
  7. High-level insert, update, and delete. The capability of handling a base relation or a derived relation as a single operand applies not only to the retrieval of data, but also to the insertion, update, and deletion of data.
  8. Physical data independence. Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representations or access methods.
  9. Logical data independence. Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables.
  10. Integrity independence. Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.
  11. Distribution independence. A relational DBMS has distribution independence.
  12. Nonsubversion rule. If a relational system has a low-level (single record at a time) language, that low level cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level relational language (multiple records at a time).

During the early 1990s, it became popular practice to compile “scorecards” for commercial DBMS products, showing how well they satisfied each of the rules. Unfortunately, the rules are subjective, so the scorecards were usually full of footnotes and qualifications, and they didn’t reveal a great deal about the products. Today, database vendors compete mainly on performance, features, the availability of development tools, the quality of vendor support, and the availability of application programs that support the particular database system, rather than on conformance to Codd’s rules. Nonetheless, the rules are an important part of the history of the relational model.

Rule 1 is basically the informal definition of a relational database presented at the beginning of this section.

Rule 2 stresses the importance of primary keys for locating data in the database. The table name locates the correct table, the column name finds the correct column, and the primary key value finds the row containing an individual data item of interest. Rule 3 requires support for missing data through NULL values, which are described in Chapter 5.
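As a rough illustration of Rules 2 and 3, the following sketch uses Python's standard sqlite3 module with an in-memory database. The offices table, its columns, and its rows are hypothetical and were invented for this example. It locates one datum by table name, column name, and primary key value, and then shows that a NULL is not the same thing as zero.

```python
import sqlite3

# Hypothetical table and rows, invented only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE offices (office INTEGER PRIMARY KEY, city TEXT NOT NULL, target INTEGER)"
)
conn.executemany(
    "INSERT INTO offices VALUES (?, ?, ?)",
    [(11, "New York", 575000), (22, "Denver", None)],  # None is stored as NULL
)

# Rule 2: table name (offices) + column name (city) + primary key value (22)
# is enough to reach exactly one datum.
city = conn.execute("SELECT city FROM offices WHERE office = ?", (22,)).fetchone()[0]

# Rule 3: the missing target is NULL, which is distinct from zero.
target = conn.execute("SELECT target FROM offices WHERE office = ?", (22,)).fetchone()[0]
null_count = conn.execute("SELECT COUNT(*) FROM offices WHERE target IS NULL").fetchone()[0]
zero_count = conn.execute("SELECT COUNT(*) FROM offices WHERE target = 0").fetchone()[0]

print(city, target, null_count, zero_count)
```

The comparison target = 0 does not match the row with the missing value; only the IS NULL test finds it.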

Rule 4 requires that a relational database be self-describing. In other words, the database must contain certain system tables whose columns describe the structure of the database itself. These tables are described in Chapter 16.
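One way to see a self-describing database in action is sketched below, again with Python's sqlite3 module. SQLite keeps its catalog in a system table named sqlite_master, which can be queried with an ordinary SELECT; other products use different catalog tables. The customers and orders tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical user tables, invented only for this illustration.
conn.execute("CREATE TABLE customers (cust_num INTEGER PRIMARY KEY, company TEXT)")
conn.execute("CREATE TABLE orders (order_num INTEGER PRIMARY KEY, cust INTEGER)")

# The catalog is itself a table, so the same query language applies to it.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(table_names)
```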

Rule 5 mandates using a relational database language, such as SQL, although SQL is not specifically required. The language must be able to support all the central functions of a DBMS—creating a database, retrieving and entering data, implementing database security, and so on.
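The sketch below illustrates one part of Rule 5: data definition, data manipulation, and transaction boundaries all expressed as character strings in a single language. It uses SQLite through Python's sqlite3 module, with isolation_level=None so that the BEGIN, COMMIT, and ROLLBACK statements are issued explicitly. The accounts table is hypothetical.

```python
import sqlite3

# isolation_level=None leaves transaction control to the SQL statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Data definition (hypothetical table, invented for this illustration).
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")

# Data manipulation inside a committed transaction.
conn.execute("BEGIN")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.execute("COMMIT")

# A second transaction that is rolled back leaves the committed data untouched.
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = 0 WHERE id = 1")
conn.execute("ROLLBACK")

balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(balance)
```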

Rule 6 deals with views, which are virtual tables used to give various users of a database different views of its structure. It is one of the most challenging rules to implement in practice, and no commercial product fully satisfies it today. Views and the problems of updating them are described in Chapter 14.
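The gap between theory and practice can be seen in a small sketch using SQLite through Python's sqlite3 module. The view below is a simple row-and-column subset of a single table, so in principle an update against it could be mapped back to the base table, yet SQLite treats views as read-only and rejects the statement. This is behavior specific to SQLite; many other products do accept updates to simple views like this one. The salesreps table and east_reps view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical base table and view, invented only for this illustration.
conn.execute("CREATE TABLE salesreps (empl_num INTEGER PRIMARY KEY, name TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO salesreps VALUES (?, ?, ?)",
    [(101, "Dan", "East"), (102, "Sue", "West")],
)
conn.execute(
    "CREATE VIEW east_reps AS SELECT empl_num, name FROM salesreps WHERE region = 'East'"
)

# Reading through the view works like reading a table.
view_rows = conn.execute("SELECT empl_num, name FROM east_reps").fetchall()

# Updating through the view is refused by SQLite, even for this simple view.
try:
    conn.execute("UPDATE east_reps SET name = 'Daniel' WHERE empl_num = 101")
    view_update_accepted = True
except sqlite3.OperationalError:
    view_update_accepted = False

print(view_rows, view_update_accepted)
```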

Rule 7 stresses the set-oriented nature of a relational database. It requires that rows be treated as sets in insert, delete, and update operations. The rule is designed to prohibit implementations that support only row-at-a-time, navigational modification of the database.
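A minimal sketch of set-oriented modification follows, using Python's sqlite3 module. Each statement names a set of rows through its WHERE clause and changes all of them at once, with no row-by-row navigation in the application. The salesreps table and its quota figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, invented only for this illustration.
conn.execute("CREATE TABLE salesreps (empl_num INTEGER PRIMARY KEY, region TEXT, quota INTEGER)")
conn.executemany(
    "INSERT INTO salesreps VALUES (?, ?, ?)",
    [(101, "East", 300000), (102, "East", 200000), (103, "West", 350000)],
)

# One UPDATE statement modifies every row in the selected set.
updated = conn.execute(
    "UPDATE salesreps SET quota = quota * 2 WHERE region = 'East'"
).rowcount

# One DELETE statement removes every row in the selected set.
deleted = conn.execute("DELETE FROM salesreps WHERE quota > 500000").rowcount

remaining = conn.execute(
    "SELECT empl_num, quota FROM salesreps ORDER BY empl_num"
).fetchall()
print(updated, deleted, remaining)
```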

Rule 8 and Rule 9 insulate the user or application program from the low-level implementation of the database. They specify that specific access or storage techniques used by the DBMS, and even changes to the structure of the tables in the database, should not affect the user’s ability to work with the data.
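The two independence rules can be sketched with Python's sqlite3 module as follows. The same query is run three times: against the original table, after an index is added (a change of access method, Rule 8), and after a new column is added to the table (an information-preserving change to a base table, Rule 9). The orders table, the index name, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, invented only for this illustration.
conn.execute("CREATE TABLE orders (order_num INTEGER PRIMARY KEY, cust INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(112961, 2117, 31500), (113012, 2111, 3745), (112989, 2101, 1458), (113051, 2111, 1420)],
)

query = "SELECT order_num FROM orders WHERE cust = ? ORDER BY order_num"
before = conn.execute(query, (2111,)).fetchall()

# Rule 8: a new access path may change performance but not the answer.
conn.execute("CREATE INDEX orders_by_cust ON orders (cust)")
after_index = conn.execute(query, (2111,)).fetchall()

# Rule 9: adding a column does not disturb a query that names its columns.
conn.execute("ALTER TABLE orders ADD COLUMN note TEXT")
after_alter = conn.execute(query, (2111,)).fetchall()

print(before, after_index, after_alter)
```

The query names its columns explicitly. A program that relied on SELECT * and on column positions would be more exposed to the table change.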

Rule 10 says that the database language should support integrity constraints that restrict the data that can be entered into the database and the database modifications that can be made. This is another rule that is not supported in most commercial DBMS products.
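As a small sketch of what Rule 10 asks for, the example below declares constraints in the table definition, so that the DBMS rather than the application program rejects bad data. It uses SQLite through Python's sqlite3 module, and it also reads the table definition back from SQLite's sqlite_master catalog table to show that the constraint is stored with the database. The products table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, invented only for this illustration.
conn.execute(
    "CREATE TABLE products ("
    " product_id TEXT PRIMARY KEY,"
    " price INTEGER NOT NULL CHECK (price >= 0))"
)
conn.execute("INSERT INTO products VALUES ('A1', 100)")

# The DBMS itself refuses a row that violates the CHECK constraint.
try:
    conn.execute("INSERT INTO products VALUES ('B2', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The constraint lives in the catalog, not in application code.
stored_ddl = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'products'"
).fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(rejected, row_count)
```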

Rule 11 says that the database language must be able to manipulate distributed data located on other computer systems. Distributed data and the challenges of managing it are described in Chapter 23.

Finally, Rule 12 prevents “other paths” into the database that might subvert its relational structure and integrity.

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.
