One of the most important developments in the market acceptance of SQL is the emergence of SQL standards. References to “the SQL standard” usually mean the official standard adopted by the American National Standards Institute (ANSI) and the International Standards Organization (ISO). However, there are other important SQL standards, including the de facto standard for some parts of the SQL language that have been defined by IBM’s DB2 product family, and Oracle’s SQL dialect, which has a dominant installed- base market share.
1. The ANSI/ISO Standards
Work on the official SQL standard began in 1982, when ANSI charged its X3H2 committee with defining a standard relational database language. At first, the committee debated the merits of various proposed database languages. However, as IBM’s commitment to SQL increased and SQL emerged as a de facto standard in the market, the committee selected SQL as their relational database language and turned their attention to standardizing it.
The resulting ANSI standard for SQL was largely based on DB2 SQL, although it contains some major differences from DB2. After several revisions, the standard was officially adopted as ANSI standard X3.135 in 1986, and as an ISO standard in 1987.
The ANSI/ISO standard has since been adopted as a Federal Information Processing Standard (FIPS) by the U.S. government. This standard, slightly revised and expanded in 1989, is usually called the SQL-89 or SQL1 standard.
Many of the ANSI and ISO standards committee members were representatives from database vendors who had existing SQL products, each implementing a slightly different SQL dialect. Like dialects of human languages, the SQL dialects were generally
very similar to one another but were incompatible in their details. In many areas, the committee simply sidestepped these differences by omitting some parts of the language from the standard and specifying others as “implementor-defined.” These decisions allowed existing SQL implementations to claim broad adherence to the resulting ANSI/ ISO standard but made the standard relatively weak.
To address the holes in the original standard, the ANSI committee continued its work, and drafts for a new, more rigorous SQL2 standard were circulated. Unlike the 1989 standard, the SQL2 drafts specified features considerably beyond those found in current commercial SQL products. Even more far-reaching changes were proposed for a follow-on SQL3 standard. In addition, the draft standards attempted to officially standardize parts of the SQL language where different proprietary standards had long since been set by the various major DBMS brands. As a result, the proposed SQL2 and SQL3 standards were a good deal more controversial than the initial SQL standard.
The SQL2 standard weaved its way through the ANSI approval process and was finally approved in October 1992. While the original 1986 standard took less than 100 pages, the SQL2 standard (officially called SQL-92) takes nearly 600 pages.
The SQL2 standards committee acknowledged the large step from SQL1 to SQL2 by explicitly creating three levels of SQL2 standards compliance. The lowest compliance level (Entry-Level) requires only minimal additional capability beyond the SQL-89 standard. The middle compliance level (Intermediate-Level) was created as an achievable major step beyond SQL-89, but one that avoids the most complex and most system- dependent and DBMS brand-dependent issues. The third compliance level (Full) requires a full implementation of all SQL2 capabilities. Throughout the 600 pages of the standard, each description of each feature includes a definition of the specific aspects of that feature that must be supported to achieve Entry, Intermediate, or Full compliance.
Despite the existence of a SQL2 standard for more than ten years, popular commercial SQL products do not, in practice, fully implement the SQL2 specification, and no two commercial SQL products support exactly the same SQL dialect. Moreover, as database vendors introduce new capabilities, they continually expand their SQL dialects and move them slightly further apart. The central core of the SQL language has become fairly standardized, however. Where it could be done without hurting existing customers or features, vendors have brought their products into conformance with the SQL-89 standard, and with the most useful capabilities of the SQL2 standard.
In the meantime, work continues on standards beyond SQL2. The SQL3 effort effectively fragmented into separate standardization efforts and focused on different extensions to SQL. Some of these, such as stored procedure capabilities, are already found in many commercial SQL products and pose the same standardization challenges faced by SQL2. Others, such as proposed object extensions to SQL, are not yet widely available or fully implemented, but have generated a great deal of controversy. With most vendors only recently implementing major SQL2 capabilities, and with the diversity of SQL extensions now available in commercial products, work on SQL3 has taken on less commercial importance.
The “real” SQL standard, of course, is the SQL implemented in products that are broadly accepted by the marketplace. For the most part, programmers and users tend to stick with those parts of the language that are fairly similar across a broad range of products. The innovation of the database vendors continues to drive the invention of new SQL capabilities; some products remain years later only for backward compatibility, and some find commercial success and move into the mainstream.
2. Other SQL Standards
Although it is the most widely recognized, the ANSI/ISO standard is not the only standard for SQL. X/OPEN, a European vendor group, also adopted SQL as part of its suite of standards for a portable application environment based on UNIX. The X/OPEN standards have played a major role in the European computer market, where portability among computer systems from different vendors is a key concern. Unfortunately, the X/OPEN standard differs from the ANSI/ISO standard in several areas.
IBM also included SQL in the specification of its bold 1990s Systems Application Architecture (SAA) blueprint, promising that all of its SQL products would eventually move to this SAA SQL dialect. Although SAA failed to achieve its promise of unifying the IBM product line, the momentum toward a unified IBM SQL continued. With its mainframe DB2 database as the flagship, IBM introduced DB2 implementations for OS/2, its personal computer operating system, and for its RS/6000 line of UNIX-based workstations and servers.
By 1997, IBM had moved DB2 beyond its own product line and shipped versions of DB2-Universal Database for systems made by rival manufacturers Sun Microsystems, Hewlett-Packard, and Silicon Graphics, and for Windows NT. IBM further shored up its database software position on non-IBM hardware platforms with its 2001 acquisition of the Informix database. With IBM’s historical leadership in relational database technology, the SQL dialect supported by DB2 is a very powerful de facto standard.
3. ODBC and the SQL Access Group
An important area of database technology not addressed by official standards is database interoperability—the methods by which data can be exchanged among different databases, usually over a network. In 1989, a group of vendors formed the SQL Access Group to address this problem. The resulting SQL Access Group specification for Remote Database Access (RDA) was published in 1991. Unfortunately, the RDA specification was closely tied to the OSI protocols, which were never widely implemented, so it had little impact. Transparent interoperability among different vendors’ databases remains an elusive goal.
A second standard from the SQL Access Group had far more market impact. At Microsoft’s urging and insistence, the SQL Access Group expanded its focus to include a call-level interface for SQL. Based on a draft from Microsoft, the resulting Call-Level Interface (CLI) specification was published in 1992. Microsoft’s own Open Database Connectivity (ODBC) specification, based on the CLI standard, was published the same year. With the market power of Microsoft behind it, and the “open standards” blessing of the SQL Access Group, ODBC has emerged as the de facto standard interface for PC access to SQL databases. Apple and Microsoft announced an agreement to support ODBC on Macintosh and Windows in the spring of 1993, giving ODBC industry standard status in both popular graphical user interface environments. ODBC implementations for UNIX-based systems soon followed. In 1995, the ODBC interface effectively became an ANSI/ISO standard, with the publication of the SQL/Call-Level Interface (CLI) standard.
Today, ODBC is in its fourth major revision as a cross-platform database access standard. ODBC support is available for all major DBMS brands. Most packaged application programs that have database access as an important part of their capabilities support ODBC, and they range from multimillion-dollar enterprise-class applications like Enterprise Resource Planning (ERP) and Supply Chain Management (SCM) to PC applications such as spreadsheets, query tools, and reporting programs. Microsoft’s focus has moved beyond ODBC to higher-level interfaces (such as OLE/DB) and more recently to Active/X Data Objects (ADO), but these new interfaces are layered on top of ODBC for relational database access, and it remains a key cross-platform database access technology.
4. SQL and Portability
The existence of published SQL standards has spawned quite a few exaggerated claims about SQL and applications portability. Diagrams such as the one in Figure 3-1 are frequently drawn to show how an application using SQL can work interchangeably with any SQL-based database management system. In fact, the holes in the SQL-89 standard and the current differences between SQL dialects are significant enough that an application must always be modified when moved from one SQL database to another. These differences, many of which were eliminated by the SQL2 standard but have not yet been implemented in commercial products, include:
- Error codes. The SQL -89 standard does not specify the error codes to be returned when SQL detects an error, and all of the commercial implementations use their own set of error codes. The SQL2 standard specifies standard error codes.
- Data types. The SQL-89 standard defines a minimal set of data types, but it omits some of the most popular and useful types, such as variable-length character strings, dates and times, and money data. The SQL2 standard addresses these, but not “new” data types such as graphics and multimedia objects.
- System tables. The SQL-89 standard is silent about the system tables that provide information regarding the structure of the database itself. Each vendor has its own structure for these tables, and even IBM’s four SQL implementations differ from one another. The tables are standardized in SQL2, but only at the higher levels of compliance, which are not yet provided by most vendors.
- Interactive SQL. The standard specifies only the programmatic SQL used by an application program, not interactive SQL. For example, the SELECT statement used to query the database in interactive SQL is absent from the SQL-89 standard. Again, the SQL2 standard addressed this issue, but long after all of the major DBMS vendors had well-established interactive SQL capabilities.
- Programmatic interface. The original standard specifies an abstract technique for using SQL from within an applications program written in COBOL, C, FORTRAN, and other programming languages. No commercial SQL product uses this technique, and there is considerable variation in the actual programmatic interfaces used. The SQL2 standard specifies an embedded SQL interface for popular programming languages but not a call-level interface. The 1995 SQL/CLI standard finally addressed programmatic SQL access, but not before commercial DBMS products had popularized proprietary interfaces and deeply embedded them in hundreds of thousands of user applications and application packages.
- Dynamic SQL. The SQL-89 standard does not include the features required to develop general-purpose database front-ends, such as query tools and report writers. These features, known as dynamic SQL, are found in virtually all SQL database systems, but they vary significantly from product to product. SQL2 includes a standard for dynamic SQL. But with hundreds of thousands of existing applications dependent on backward compatibility, DBMS vendors have not implemented it.
- Semantic differences. Because the standards specify certain details as implementor-defined, it’s possible to run the same query against two different conforming SQL implementations and produce two different sets of query results. These differences occur in the handling of NULL values, column functions, and duplicate row elimination.
- Collating sequences. The SQL-89 standard does not address the collating (sorting) sequence of characters stored in the database. The results of a sorted query will be different if the query is run on a personal computer (with ASCII characters) and a mainframe (with EBCDIC characters). The SQL2 standard includes an elaborate specification for how a program or a user can request a specific collating sequence, but it is an advanced-level feature that is not typically supported in commercial products.
- Database structure. The SQL-89 standard specifies the SQL language to be used once a particular database has been opened and is ready for processing. The details of database naming and how the initial connection to the database is established vary widely and are not portable. The SQL2 standard creates more uniformity but cannot completely mask these details.
Despite these differences, commercial database tools boasting portability across several different brands of SQL databases began to emerge in the early 1990s. In every case, however, the tools require a special adapter for each supported DBMS, which generates the appropriate SQL dialect, handles data type conversion, translates error codes, and so on. Transparent portability across different DBMS brands based on standard SQL is the major goal of SQL2 and ODBC, and significant progress has been made. Today, virtually all programs that support multiple databases include specific drivers for communicating with each of the major DBMS brands, and usually include an ODBC driver for accessing the others.
Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.