The Proliferation of SQL

As the standard for relational database access, SQL has had a major impact on all parts of the computer market. IBM has adopted SQL as a unifying database technology for its product line. SQL-based databases dominate the market for UNIX-based computer systems. In the PC market, SQL databases on server-oriented Windows operating systems are mounting a serious challenge to the dominance of UNIX as a database processing platform, especially for departmental applications. SQL is accepted as a technology for online transaction processing (OLTP), fully refuting the conventional wisdom of the 1980s that relational databases would never offer performance good enough for transaction processing applications. SQL-based data warehousing and data mining applications are the standard for helping companies to discover customer purchase patterns and offer better products and services. On the Internet, SQL-based databases are the foundation of more personalized products, services, and information services that are a key benefit of electronic commerce.

1. SQL and IBM’s Unified Database Strategy

SQL has played a key role as a common database access language across all of IBM’s computer families. Originally, this role was part of IBM’s SAA strategy, announced in March 1987. Although IBM’s grand goals for SAA were not achieved, the unifying role of SQL has grown even more important over time. The DB2 database system, IBM’s flagship SQL-based DBMS, now runs on a broad range of IBM and non-IBM computer systems, including:

  • Mainframes. DB2 started as the SQL standard-bearer for IBM mainframes running MVS and has now replaced SQL/DS as the relational system for the VM and VSE mainframe operating systems.
  • AS/400. This SQL implementation runs on IBM’s family of midrange business systems, targeted at small- and medium-sized businesses and server applications.
  • Power-architecture servers. DB2 runs under the UNIX operating system on IBM’s family of RISC-based workstations and servers, for engineering and scientific applications, and as IBM’s own UNIX database server platform.
  • Other UNIX platforms. IBM supports DB2 on UNIX-based server platforms from Sun Microsystems and Hewlett-Packard, the two largest UNIX system vendors, and on UNIX-based workstations from Silicon Graphics.
  • Windows. A PC-LAN server version of DB2 competes with Microsoft SQL Server, Oracle, and others on Windows-based database servers.

2. SQL on Minicomputers

Minicomputers were one of the most fertile early markets for SQL-based database systems. Oracle and Ingres were both originally marketed on Digital’s VAX/VMS minicomputer systems. Both products have since been ported to many other platforms. Sybase, a later database system specialized for online transaction processing, also targeted the VAX as one of its primary platforms.

Through the 1980s, the minicomputer vendors also developed their own proprietary relational databases featuring SQL. Digital considered relational databases so important that it bundled a runtime version of its Rdb/VMS database with every VAX/VMS system. Hewlett-Packard offered Allbase, a database that supported both its HPSQL dialect and a nonrelational interface. Data General’s DG/SQL database replaced its older nonrelational databases as DG’s strategic data management tool. In addition, many of the minicomputer vendors resold relational databases from the independent database software vendors. These efforts helped to establish SQL as an important technology for midrange computer systems.

By the mid-1990s, the minicomputer vendors’ SQL products had largely disappeared, beaten in the marketplace by multiplatform software from Oracle, Informix, Sybase, and others. Accompanying this trend, the importance of proprietary minicomputer operating systems has faded as well, replaced by widespread use of UNIX on midrange systems. Yesterday’s minicomputer SQL market has effectively become today’s market for UNIX-based database servers based on SQL.

3. SQL on UNIX-Based Systems

SQL has firmly established itself as the data management solution of choice for UNIX-based computer systems. Originally developed at Bell Laboratories, UNIX became very popular in the 1980s as a vendor-independent, standard operating system. It runs on a wide range of computer systems, from workstations to mainframes, and has become the standard operating system for high-end server systems, including database servers.

In the early 1980s, four major databases were already available for UNIX systems. Two of them, Ingres and Oracle, were UNIX versions of the products that ran on DEC’s proprietary minicomputers. The other two, Informix and Unify, were written specifically for UNIX. Neither of them originally offered SQL support, but by 1985, Unify offered a SQL query language, and Informix had been rewritten as Informix-SQL, with full SQL support.

Today, the Oracle, DB2, Informix, and Sybase DBMS products dominate the UNIX- based database market and are available on all of the leading UNIX server platforms. UNIX-based database servers are a mainstream building block for both client/server and three-tier Internet architectures. The constant search for higher SQL database performance has driven some of the most important trends in UNIX system hardware. These include the emergence of symmetric multiprocessing (SMP) as a mainstream server architecture, and the use of RAID (Redundant Array of Independent Disks) technology to boost I/O performance.

4. SQL on Personal Computers

Databases have been popular on personal computers since the early days of the IBM PC. Ashton-Tate’s dBASE product reached an installed base of over one million

MS-DOS-based PCs. Although these early PC databases often presented data in tabular form, they lacked the full power of a relational DBMS and a relational database language such as SQL. The first SQL-based PC databases were versions of popular minicomputer products that barely fit on personal computers. For example, Professional Oracle for the IBM PC, introduced in 1984, required two megabytes of memory—well above the typical 640KB PC configuration of the day.

The real impact of SQL on personal computers began with the announcement of OS/2 by IBM and Microsoft in April 1987. In addition to the standard OS/2 product, IBM announced a proprietary OS/2 Extended Edition (OS/2 EE) with a built-in SQL database and communications support. With the introduction, IBM again signaled its strong commitment to SQL, saying in effect that SQL was so important that it belonged in the computer’s operating system.

OS/2 Extended Edition presented Microsoft with a problem. As the developer and distributor of standard OS/2 to other personal computer manufacturers, Microsoft needed an alternative to the Extended Edition. Microsoft responded by licensing the Sybase DBMS, which had been developed for VAX, and began porting it to OS/2. In January 1988, in a surprise move, Microsoft and Ashton-Tate (the PC database leader at the time with its dBASE product) announced that they would jointly sell the resulting OS/2-based product, renamed SQL Server. Microsoft would sell SQL Server with OS/2 to computer manufacturers; Ashton-Tate would sell the product through retail channels to PC users. In September 1989, Lotus Development (the other member of the big three of PC software at the time) added its endorsement of SQL Server by investing in Sybase. Later that year, Ashton-Tate relinquished its exclusive retail distribution rights and sold its investment to Lotus.

SQL Server for OS/2 met with only limited success. But in typical Microsoft fashion, Microsoft continued to invest heavily in SQL Server development and ported it to its Windows NT operating system. For a while, Microsoft and Sybase remained partners, with Sybase focused on the minicomputer and UNIX-based server markets and Microsoft focused on PC LANs and Windows NT. As Windows NT and UNIX systems became more and more competitive as database server operating system platforms, the relationship became less cooperative and more competitive. Eventually, Sybase and Microsoft went their separate ways. The common heritage of Sybase’s and Microsoft’s SQL products can still be seen in product capabilities and some common SQL extensions (for example, stored procedures), but the product lines have already diverged significantly.

Today, SQL Server is a major database system on Windows-based servers. SQL Server 7, which shipped in late 1998, provided a significant step up in the size and scale of database applications that SQL Server can support. SQL Server 2000, which runs on Windows 2000, provided another major step. SQL Server is slated to continue in a major role as Microsoft rolls out more of its .NET server product family. In addition to SQL Server’s impact, the availability of Oracle, and to a lesser extent, Informix, DB2, and other mainstream DBMS products, has helped Windows-based servers to steadily make inroads into UNIX’s dominance as a database server platform. While UNIX continues to dominate the largest database server installations, server configurations
of the Windows operating system and the Intel architecture systems on which it runs have achieved credibility in the midrange market.

5. SQL and Transaction Processing

SQL and relational databases originally had very little impact in online transaction processing (OLTP) applications. With their emphasis on queries, relational databases were confined to decision support and low-volume online applications, where their slower performance was not a disadvantage. For OLTP applications, where hundreds of users needed online access to data and subsecond response times, IBM’s nonrelational Information Management System (IMS) reigned as the dominant DBMS.

In 1986, a new DBMS vendor, Sybase, introduced a new SQL-based database especially designed for OLTP applications. The Sybase DBMS ran on VAX/VMS minicomputers and Sun workstations, and focused on maximum online performance. Oracle Corporation and Relational Technology followed shortly with announcements that they, too, would offer OLTP versions of their popular Oracle and Ingres database systems. In the UNIX market, Informix announced an OLTP version of its DBMS, named Informix-Turbo.

In 1988, IBM jumped on the relational OLTP bandwagon with DB2 Version 2, with benchmarks showing the new version operating at over 250 transactions per second on large mainframes. IBM claimed that DB2 performance was now suitable for all but the most demanding OLTP applications, and encouraged customers to consider it as a serious alternative to IMS. OLTP benchmarks have now become a standard sales tool for relational databases, despite serious questions about how well the benchmarks actually measure performance in real applications.

The suitability of SQL for OLTP improved dramatically through the 1990s, with advances in relational technology and more powerful computer hardware both leading to ever-higher transaction rates. DBMS vendors started to position their products based on their OLTP performance, and for a few years database advertising focused almost entirely on these performance benchmark wars. A vendor-independent organization, the Transaction Processing Council, jumped into the benchmarking fray with a series of vendor-independent benchmarks (TPC-A, TPC-B, and TPC-C), which only served to intensify the performance focus of the vendors.

By the late 1990s, SQL-based relational databases on high-end UNIX-based database servers evolved well past the 1000-transactions-per-second mark. Client/server systems using SQL databases have become the accepted architecture for implementing OLTP applications. From a position as “unsuitable for OLTP,” SQL has grown to be the industry standard foundation for building OLTP applications.

6. SQL and Workgroup Databases

The dramatic growth of PC LANs through the 1980s and 1990s created a new opportunity for departmental or workgroup database management. The original database systems focused on this market segment ran on IBM’s OS/2 operating system. In fact, SQL Server, now a key part of Microsoft’s Windows strategy, originally made its debut as
an OS/2 database product. In the mid-1990s, Novell also made a concentrated effort to make its NetWare operating system an attractive workgroup database server platform. From the earliest days of PC LANs, NetWare had become established as the dominant network operating system for file and print servers. Through deals with Oracle and others, Novell sought to extend this leadership to workgroup database servers as well.

The arrival of Windows NT on the workgroup computing scene was the catalyst that caused the workgroup database market to really take off. While NetWare offered a clear performance advantage over NT as a workgroup file server, NT had a more robust, general-purpose architecture, more like the minicomputer operating systems. Microsoft successfully positioned NT as a more attractive platform for running workgroup applications (as an application server) and workgroup databases. Microsoft’s own SQL Server product was marketed (and often bundled) with NT as a tightly integrated workgroup database platform. Corporate information systems departments were at first very cautious about using relatively new and unproven technology, but the NT / SQL Server combination allowed departments and non-IS executives to undertake smaller- scale, workgroup-level projects on their own, without corporate IS help. This phenomenon, like the grass roots support for personal computers a decade earlier, fueled the early growth of the workgroup database segment.

Today, SQL is well established as a workgroup database standard. Microsoft’s SQL Server has been joined by Oracle, Informix, Sybase, DB2, and many other DBMS brands running on the Windows server platforms. Windows-based SQL databases are the second largest segment of the DBMS market and are the fastest growing. From this solid dominance in the workgroup segment, Windows-based server systems are mounting a continued assault on enterprise-class database applications, slowly but surely eating into low-end UNIX-based database deployments.

7. SQL and Data Warehousing

For several years, the effort to make SQL a viable technology for OLTP applications shifted the focus away from the original relational database strengths of query processing and decision making. Performance benchmarks and competition among the major DBMS brands focused on simple transactions like adding a new order to the database or determining a customer’s account balance. Because of the power of the relational database model, the databases that companies used to handle daily business operations could also be used to analyze the growing amounts of data that were being accumulated. A frequent theme of conferences and trade show speeches for IS managers was that a corporation’s accumulated data (stored in SQL databases, of course) should be treated as a valuable asset and used to help improve the quality of business decision making.

Although relational databases could, in theory, easily perform both OLTP and decision-making applications, there were some very significant practical problems. OLTP workloads consisted of many short database transactions, and the response time for users was very important. In contrast, decision-support queries could involve sequential scans of large database tables to answer questions like “What is the average order size by sales region?” or “How do inventory trends compare with the same time a year ago?” These queries could take minutes or hours. If a business analyst tried to run one of these queries during a time when business transaction volumes reached their peak, it could cause serious degradation in OLTP performance. Another problem was that the data to answer useful questions about business trends was often spread across many different databases, typically involving different DBMS vendors and different computer platforms.

The desire to take advantage of accumulated business data, and the practical performance problems it caused for OLTP applications, led to the concept of a data warehouse, shown in Figure 3-6. Business data is extracted from OLTP systems, reformatted and validated as necessary, and then placed into a separate database that is dedicated to decision-making queries (the “warehouse”). The data extraction and transformation can be scheduled for off-hours batch processing. Ideally, only new or changed data can be extracted, minimizing the amount of data to be processed in the monthly, weekly, or daily warehouse refresh cycle. With this scheme, the time­consuming business analysis queries use the data warehouse, not the OLTP database, as their source of data.

SQL-based relational databases were a clear choice for the warehouse data store because of their flexible query processing. A series of new companies was formed to build the data extraction, transformation, and database query tools needed by the data warehouse model. In addition, DBMS vendors started to focus on the kinds of database queries that customers tended to run in the data warehouse. These queries tended to be large and complex—such as analyzing tens or hundreds of millions of individual cash-register receipts to look for product purchase patterns. They often involved time-series data—for example, analyzing product sales or market share data over time. They also tended to involve statistical summaries of data—total sales, average order volume, percent growth, and so on—rather than the individual data items themselves.

To address the specialized needs of data warehousing applications (often called OnLine Analytical Processing or OLAP), specialized databases began to appear. These databases were optimized for OLAP workloads in several different ways. Their performance was tuned for complex, read-only query access. They supported advanced statistical and other data functions, such as built-in time-series processing. They supported precalculation of database statistical data, so that retrieving averages and totals could be dramatically faster. Some of these specialized databases did not use SQL, but many did (leading to the companion term ROLAP, for Relational OnLine Analytic Processing). As with so many segments of the database market, SQL’s advantages as a standard proved to be a powerful force. Data warehousing has become a one-billion-dollar-plus segment of the database market, and SQL-based databases are firmly entrenched as the mainstream technology for building data warehouses.

8. SQL and Distributed Internet Applications

During the late 1990s, the World Wide Web and the web browsing capability that it enabled were the driving force behind the growth of the Internet. With its focus on delivering content in the form of text and graphics, the early uses of the Web had little to do with data management. By the mid-1990s, however, much of the content delivered from corporate web sites had its origins in SQL-based corporate databases. For example, on a commercial web site for a retailer, web pages that contain information about products available for sale, their prices, product availability, special promotions, and the like are typically created on demand, based on data retrieved from a SQL database. The vast majority of the pages displayed by an online auction site or by an online travel site are similarly based on data retrieved from SQL databases, transformed into the Web’s HTML page format. In the other direction, data entered by a user into browser page forms is almost always captured into SQL databases that form part of the web site architecture.

By the early 2000s, industry attention had turned to the next phase of the Internet, and the role that Internet technologies can play in connecting computer applications to one another. These distributed applications architectures received widespread trade press coverage under the banner of web services. In the long-standing tradition of the computer industry, competing camps emerged, championing different sets of standards and languages for implementing them—a Microsoft-led camp under the .NET framework, and a rival camp focused on Java and J2EE-based application servers. Both architectures embrace a key role for XML, a standard for exchanging structured data like the data that resides in SQL databases.

In response to the industry attention on web services, a flurry of products has been announced that link XML-formatted messages to SQL-based databases. Startup database vendors and some of the object database vendors have announced XML-based database products, arguing that they provide an ideal, native match for the XML-formatted exchange of data over the Internet. The established relational database players have responded with their own XML initiatives, adding XML input/output capabilities, and in some cases, XML data type support, to their products. The interaction between XML and SQL is one of the most active areas in data management today, and the activity in this area will keep SQL in the industry spotlight well into the first decade of the twenty- first century.

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.

Leave a Reply

Your email address will not be published. Required fields are marked *