SQL and Objects: Abstract (Structured) Data Types

The data types envisioned by the relational data model are simple, indivisible, atomic data values. If a data item such as an address is actually composed of a street address, city, state, and postal code, as a database designer, you have two choices. You can treat the address as four separate data items, each stored in its own column, so that you can search and retrieve the items individually. Or you can treat the address as a single unit, in which case, you cannot process its individual component parts within the database. There is no middle ground that allows you to treat the address as a unit for certain situations and access its component parts for others.

Many programming languages (including even nonobject-oriented languages like C or Pascal) do provide such a middle ground. They support compound data types or named data structures. The data structure is composed of individual data items or lower-level structures, which can be accessed individually. But the entire data structure can also be treated as a single unit when that is most convenient. Structured or composite data types in object-relational databases provide this same capability in a DBMS context.

Informix Universal Server supports abstract data types through its concept of row data types. You can think of a row type as a structured sequence of individual data items, called fields. Here is an Informix CREATE TABLE statement for a simple PERSONNEL table that uses a row data type to store both name and address information:

CREATE TABLE PERSONNEL (
    EMPL_NUM INTEGER,
        NAME ROW(
      F_NAME VARCHAR(15),
      M_INIT CHAR(1),
      L_NAME VARCHAR(20))
     ADDRESS ROW(
      STREET VARCHAR(35),
        CITY VARCHAR(15),
       STATE CHAR(2),
    POSTCODE ROW(
        MAIN INTEGER,
         SFX INTEGER)));

This table has three columns. The first one, EMPL_NUM, has an integer data type. The last two, NAME and ADDR, have a row data type, indicated by the keyword ROW, followed by a parenthesized list of the fields that make up the row. The NAME column’s row data type has three fields within it. The ADDRESS column’s row data type has four fields. The last of these four fields itself has a row data type and consists of two fields. In this simple example, the hierarchy is only two levels deep, but the capability can be (and often is) extended to additional levels.

Individual fields within the columns of the table are accessible in SQL statements through an extension of the SQL dot notation that is already used to qualify column names with table names and user names. Adding a dot after a column name allows you to specify the names of individual fields within a column. This SELECT statement retrieves the employee numbers and first and last names of all personnel with a specified main postal code:

 SELECT EMPL_NUM, NAME.F_NAME, NAME.L_NAME

   FROM PERSONNEL

  WHERE ADDRESS.POSTCODE.MAIN = ‘12345’;

Suppose another table within the database, named MANAGERS, had the same NAME structure as one of its columns. Then this query retrieves the employee numbers of employees who are also managers:

SELECT EMPL_NUM

  FROM PERSONNEL, MANAGERS

 WHERE PERSONNEL.NAME = MANAGERS.NAME;

In the first of these two queries, it makes sense to retrieve the individual fields within the NAME column. The second query shows a situation where it’s more convenient to use the entire name column (all three fields) as the basis for comparison. It’s clearly a lot more convenient to ask the DBMS to compare the two abstract data typed columns than it is to specify separate comparisons for each of the individual fields. Together, these examples show the advantages of the row data type in allowing access to the fields at any level of the hierarchy.

The row data type columns require special handling when inserting data into the database. The PERSONNEL table has three columns, so an INSERT statement for the table must have three items in its VALUES clause. The columns that have a row data type require a special ROW value-constructor to put together the individual data items into a row-type item that matches the data type of the column. Here is a valid INSERT statement for the table that illustrates the use of the ROW constructor:

 INSERT INTO PERSONNEL

VALUES (1234,

ROW(‘John’, ‘J’, ‘Jones’),

ROW(‘197 Rose St.’, ‘Chicago’, ‘IL’,

ROW(12 34 5, 6789)));

1. Defining Abstract Data Types

With the Informix row data type capabilities illustrated so far, each individual structured column is defined in isolation. If two tables need to use the same row data type structure, it is defined within each table. This violates one of the key principles of object-oriented design, which is reusability. Instead of having each object (the two columns in the two different tables) have its own definition, the row data type should be defined once and then reused for the two columns. Informix Universal Server provides this capability through its named row type feature. (The row data types shown in previous examples are unnamed row data types.)
You create an Informix named row type with the CREATE ROW TYPE statement. Here are examples for the PERSONNEL table:

CREATE ROW TYPE NAME_TYPE (

F_NAME VARCHAR(15),

M_INIT CHAR(1),

L_NAME VARCHAR(20));

CREATE ROW TYPE POST_TYPE (

MAIN INTEGER,

SFX INTEGER);

CREATE ROW TYPE ADDR_TYPE (

STREET VARCHAR(35),

CITY VARCHAR(15),

STATE CHAR(2),

POSTCODE POST_TYPE);

Note that the definition of a named row type can depend on other, previously created named row types, as shown by the ADDR_TYPE and POST_TYPE definitions. With these row data types defined, the name and address columns in the PERSONNEL table (and any other columns holding name or address data in other tables of the database) can be defined using it. The aggressive use of abstract data types can thus help to enforce uniformity in naming and data typing within an object-relational database. Here is the new Informix definition of the PERSONNEL table, using the just-defined abstract data types:

CREATE TABLE PERSONNEL (

EMPL_NUM INTEGER,
NAME NAME_TYPE,
ADDRESS ADDR_TYPE);

Figure 24-1 shows some sample data for this table and the hierarchical column/ field structure created by the abstract data types.

Oracle supports abstract data types through a very similar structure, with slightly different SQL syntax. Here is the Oracle CREATE TYPE statement to create the same abstract data structure for names and addresses:

CREATE TYPE NAME_TYPE AS OBJECT (

F_NAME VARCHAR(15),
M_INIT CHAR(1),
L_NAME VARCHAR(20));

CREATE TYPE POST_TYPE AS OBJECT (

MAIN INTEGER,
SFX INTEGER);

CREATE TYPE ADDR_TYPE AS OBJECT (

STREET VARCHAR(35),
CITY VARCHAR(15),
STATE CHAR(2),
POSTCODE POST_TYPE);

Oracle calls the abstract data type an object instead of a row type. In fact, the type is functioning as an object class in the usual object-oriented terminology. Extending the object-oriented terminology further, the individual components of an Oracle abstract data type are referred to as attributes (corresponding to the Informix fields described earlier). The ADDR_TYPE type has four attributes in this example. The fourth attribute, POSTCODE, is itself an abstract data type.

Both Oracle and Informix use the extended dot notation to refer to individual data elements within abstract data types. With nested abstract types, it takes several levels of dot-delimited names to identify an individual data item. The main postal code within the PERSONNEL table is identified as:

PERSONNEL.ADDRESS.POSTCODE.MAIN

If the table were owned by another user, Sam, the qualified name becomes even longer:

SAM.PERSONNEL.ADDRESS.POSTCODE.MAIN

Informix allows the use of row types to go one step beyond their role as data type templates for individual columns. You can use a row type to define the structure of an entire table. For example, with this row type definition:

 CREATE ROW TYPE PERS_TYPE (

EMPL_NUM INTEGER,

NAME NAME_TYPE,

ADDRESS ADDR_TYPE)

you can define the PERSONNEL table using the row type as a model:

 CREATE TABLE PERSONNEL

OF TYPE PERS_TYPE;

The columns of this PERSONNEL table will be exactly as they were in the previous CREATE TABLE examples, but now PERSONNEL is a typed table. The most basic use of the typed table capability is to formalize the object structure in the database. Each object class has its own row type, and the typed table that holds objects (rows) of that class is defined in terms of the row type. Beyond this usage, typed tables are also a key component of the Informix notion of table inheritance, described later in the “Inheritance” section.

2. Manipulating Abstract Data Types

Unfortunately, structured data types create new complexity for database update statements that must insert or modify their structured data values. Informix Universal Server is fairly liberal in its data type conversion requirements for unnamed row types. The data you assign into a row-type column must simply have the same number of fields, of the same data types. The ROW constructor is used, as shown in previous examples, to assemble individual data items into a row-type value for inserting or updating data.

For named row types, the requirement is more stringent; the data you assign into a named row-type column must actually have the same named row type. You can achieve this in the INSERT statement by explicitly casting the constructed row value to have the NAME_TYPE data type:

INSERT INTO PERSONNEL

VALUES (1234,

ROW(‘John’, ‘J’, ‘Jones’)::NAME_TYPE,

ROW(‘197 Rose St.’, ‘Chicago’, ‘IL’,

ROW(12 34 5, 6789)));

The double-colon operator casts the constructed three-field row as a NAME_TYPE row and makes the VALUES clause compatible with the data types of the columns in the table.

Oracle uses a slightly different approach to constructing structured data items and inserting them into columns that have abstract data types. When you create an Oracle abstract data type (using the CREATE TYPE statement), Oracle automatically defines a constructor method for the type. You can think of the constructor method as a function that takes as its arguments the individual components of the abstract data type and returns an abstract data type value, with the individual components all packaged together. The constructor is used in the VALUES clause of the INSERT statement to glue the individual data item values together into a structured data value that matches the column definition. Here is an INSERT statement for the PERSONNEL table:

INSERT INTO PERSONNEL

VALUES (1234,

NAME_TYPE(‘John’, ‘J’, ‘Jones’),

ADDR_TYPE(‘197 Rose St.’, ‘Chicago’, ‘IL’,

POST_TYPE(12345, 6789)));

The constructors (NAME_TYPE, ADDR_TYPE, POST_TYPE) perform the same functions as the ROW constructor does for Informix, and also provide the casting required to ensure strict data type correspondence.

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.

Leave a Reply

Your email address will not be published. Required fields are marked *