Database Market Trends

Today’s market for database management products exceeds $12 billion per year in products and services revenues, up from about $5 billion per year a decade ago. On several occasions over the last decade, lower year-over-year growth in the quarterly revenues of the major database vendors has led analysts to talk about a maturing database market. Each time, a wave of new products or new data management applications has returned the market to double-digit growth. Client/server architecture, ERP applications, data warehousing and business intelligence, three-tier web site architectures—each of these spurred a new wave of database technology and a new wave of SQL-based database deployments. If the history of the last two decades is any indication, database technology will continue to find new applications and generate increasing revenues for years to come. The trends shaping the market bode well for its continued health and point to a continuing tension between market maturity and consolidation on the one hand, and exciting new database capabilities and applications on the other.

1. Enterprise Database Market Maturity

Relational database technology has become accepted as a core enterprise data processing technology, and relational databases have been deployed by virtually all large corporations. Because of the importance of corporate databases and years of experience in using relational technology, many, if not most, large corporations have selected a single DBMS brand as an enterprisewide database standard. Once such a standard has been established and widely deployed within a company, there is strong resistance to switching brands. Even though an alternative DBMS product may offer advantages for a particular application or may pioneer a new, useful feature, an announcement by the standard vendor that such features are planned for a future release is often enough to keep the customer from switching.

The trend to corporate database standards has tended to reinforce and strengthen the market positions of the established major DBMS vendors. The existence of large direct sales forces, established customer support relationships, and multiyear volume purchase agreements has become as important as, or more important than, technological advantage. With this market dynamic, the large established players tend to concentrate on growing their business within their existing installed base instead of attempting to take customers away from competitors. In the late 1990s, industry analysts observed this tendency at both Informix and Sybase. Oracle, with a much larger share of the market, was forced to compete aggressively for new accounts in its attempt to maintain its database license revenue growth. Microsoft, as the upstart in the enterprise database market, was cast in the role of challenger, attempting to leverage its position in workgroup databases into enterprise-level prototypes and pilot projects as a way to pry enterprise business away from the established players.

One important impact of the trend to corporate DBMS vendor standardization has been a consolidation in the database industry. New startup database vendors tend to pioneer new database technology and grow by selling it to early adopters. These early adopters help to shape the technology and identify the solution areas where it can deliver real benefits. After a few years, when the advantages of the new technology have been demonstrated, the startup vendors are acquired by large established players. The acquirers can then bring the new technology into their installed base, and bring their marketing and sales muscle to bear in an attempt to win business in their competitors’ accounts. The early 1990s saw this cycle play out with database vendor acquisitions of database tools vendors. In the late 1990s, the same cycle applied to mergers and acquisitions of database vendors themselves. Informix’s purchases of Illustra (a pioneering object-relational vendor), Red Brick (a pioneering data warehousing vendor), and Cloudscape (a pioneering pure Java database vendor) are three examples of the pattern. Just a few years later, Informix itself was acquired by IBM, continuing this particular chain of consolidation.

2. Market Diversity and Segmentation

Despite the maturing of some parts of the database market (especially the market for corporate enterprise-class database systems), it continues to develop new segments and niches that appear and then grow rapidly. For much of the 1990s, the most useful way to segment the database market was by database size and scale—there were PC databases, minicomputer databases, mainframe databases, and later, workgroup databases. Today’s database market is much more diverse and is more accurately segmented by target application and by the specialized database capabilities needed to address unique application requirements. Market segments that have appeared and experienced rapid growth include:

  • Data warehousing databases, focused on managing thousands of gigabytes of data, such as historical retail purchase data.
  • Online analytic processing (OLAP) and relational online analytic processing (ROLAP) databases, focused on carrying out complex analyses of data to discover underlying trends (data mining), allowing organizations to make better business decisions.
  • Mobile databases, in support of mobile workers such as salespeople, support personnel, field service people, consultants, and mobile professionals. Often, these mobile databases are tied back to a centralized database for synchronization.
  • Embedded databases, which are an integral, transparent part of an application sold by an independent software vendor (ISV) or a value-added reseller (VAR). These databases are characterized by small footprints and very simple administration.
  • Microdatabases, designed for appliance-type devices, such as smart cards, network computers, smart phones, and handheld PCs and organizers.
  • In-memory databases, designed for ultra-high-performance OLTP applications, such as those embedded in telecom and data communications networks and used to support customer interaction in very high-volume Internet applications.
  • Clustered databases, designed to take advantage of powerful, low-cost servers used in parallel to perform database management tasks with high scalability and reliability.

3. Packaged Enterprise Applications

A decade or two ago, the vast majority of corporate applications were developed in-house by the company’s information systems department. Decisions about database technology and vendor standardization were part of the company’s IS architecture planning function. Leading-edge companies sometimes took a risk on new, relatively unproven database technologies in the belief that they could gain competitive advantage by using them. Sybase’s rise to prominence in the financial services sector during the late 1980s and early 1990s is an example.

Today, most corporations have shifted from “make” to “buy” strategies for major enterprisewide applications. Examples include enterprise resource planning (ERP), supply chain management (SCM), human resources management (HRM), sales force automation (SFA), and customer relationship management (CRM) applications, among others. All of these areas are now supplied as enterprise-class packaged applications, along with consulting, customization, and installation services, by groups of software vendors. Several of these vendors have reached multihundred-million-dollar annual revenues. All of these packages are built on a foundation of SQL-based relational databases.

The emergence of dominant purchased enterprise applications has had a significant effect on the dynamics of the database market. The major enterprise software package vendors have tended to support DBMS products from only two or three of the major DBMS vendors. For example, if a customer chooses to deploy SAP as its enterprisewide ERP application, the underlying database is restricted to those supported by the SAP packages. This has tended to reinforce the dominant position of the current top-tier enterprise database players and make it more difficult for newer database vendors to gain a foothold. It has also tended to lower average database prices, as the DBMS is viewed more as a component part of an application-driven decision rather than a strategic decision in its own right.

The emergence of packaged enterprise software has also shifted the relative power of corporate IS organizations and the packaged software vendors. The DBMS vendors today have marketing and business development teams focused on the major enterprise application vendors, both to ensure that the latest versions of the applications support their DBMS and to support performance tuning and other activities. The largest independent DBMS vendor, Oracle Corporation, plays both roles, supplying both DBMS software and major enterprise applications (that run on the Oracle DBMS, of course). Oracle’s single-vendor approach has created considerable tension between Oracle and the largest enterprise applications vendors, especially in the ranks of their field sales organizations. Some industry analysts attribute the growing DBMS market share of IBM and Microsoft to a tendency for enterprise application vendors to steer prospective customers away from Oracle’s DBMS products as a result.

4. Hardware Performance Gains

One of the most important contributors to the rise of SQL has been a dramatic increase in the performance of relational databases. Part of this performance increase was due to advances in database technology and query optimization. However, most of the DBMS performance improvement came from gains in the raw processing power of the underlying computer systems, and from changes in the DBMS software designed to capitalize on those gains. While the performance of mainframe systems steadily increased, the most dramatic performance gains have been in the UNIX-based and Windows-based server markets, where processing power has doubled or more year after year.

Some of the most dramatic advances in server performance came from the growth of symmetric multiprocessing (SMP) systems, in which two, four, eight, or even dozens of processors operate in parallel, sharing the processing workload. A multiprocessor architecture is a natural fit for OLTP applications, where the workload consists of many small, parallel database transactions. Traditional OLTP vendors, such as Tandem, have always used a multiprocessor architecture, and the largest mainframe systems have used multiprocessor designs for more than a decade. In the 1990s, multiprocessor systems became a mainstream part of the UNIX-based server market and, somewhat later, an important factor at the high end of the PC server market.

With Intel’s introduction of multiprocessor chipsets, SMP systems featuring two-way and four-way multiprocessing achieved near-commodity status in the LAN server market, and were available for well under $10,000. In the midrange of the UNIX-based server market, database servers from Sun, Hewlett-Packard, and IBM routinely have 8 or 16 processors and sell in the hundred-thousand-dollar price range. High-end UNIX servers today can be configured with more than 100 processors and tens of gigabytes of main memory. These systems, which rival the computing power of traditional mainframes, carry multimillion-dollar price tags.

SMP systems also provided performance benefits in decision support and data analysis applications. As SMP servers became more common, the DBMS vendors invested in parallel versions of their systems that were able to take the work of a single complex SQL query and split it into multiple, parallel paths of execution. When a DBMS with parallel query capabilities is installed on a four-way or eight-way SMP system, a query that might have taken two hours on a single-processor system can be completed in less than an hour. Companies are taking advantage of this hardware-based performance boost in two ways: either by obtaining business analysis results in a fraction of the time previously required, or by keeping the timeframe constant and carrying out much more complex and sophisticated analysis.
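
As a rough illustration of the kind of query that benefits from parallel execution, the sketch below requests parallelism with an Oracle-style optimizer hint; the sales table and its columns are hypothetical, and other DBMS products enable parallelism through configuration settings or different syntax. The scan and aggregation work can be divided among the available processors:

    -- Decision-support aggregation over a (hypothetical) sales-history table;
    -- the PARALLEL hint asks the optimizer to use up to eight parallel processes.
    SELECT /*+ PARALLEL(sales, 8) */
           region, product_id, SUM(amount) AS total_sales
      FROM sales
     WHERE sale_date BETWEEN DATE '1999-01-01' AND DATE '1999-12-31'
     GROUP BY region, product_id;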

Operating system support for new hardware features (such as multiprocessor architectures) has often lagged the availability of the hardware itself—sometimes by several quarters or even years. This has posed a special dilemma for DBMS vendors, who must decide whether to bypass the operating system in an attempt to improve database performance. The Sybase DBMS, for example, when originally introduced, operated as a single process and took responsibility for its own task management, event handling, and input/output—functions that are usually handled by an operating system such as UNIX or VMS. In the short term, this gave Sybase a major performance advantage over rival DBMS products with less built-in parallel processing capability.

But when operating system SMP support arrived, many of its benefits were automatically available to rival systems that had relied on the operating system for task management, while Sybase had the continuing burden of extending and enhancing its low-level performance-oriented software. This cycle has played out for SMP designs, with major database vendors now relying on operating systems for thread support and SMP scaling. But the same trade-offs continue to apply to new hardware features as they appear and require explicit strategic decisions on the part of the DBMS vendors.

Today, the quest for higher and higher database performance certainly shows no signs of stopping. With today’s highest-performance servers featuring hundreds of multigigahertz processors, hardware advances have more than overcome the higher overhead of the relational data model, giving it performance equal to, or better than, the best nonrelational databases of the past. At the same time, of course, the demand for higher and higher transaction rates against larger and larger databases continues to grow. At the top end of the database market, it appears that one can never have too much database performance.

5. Database Server Appliances

Another hardware-based market trend in the 1980s and early 1990s was the emergence of companies that combined high-performance microprocessors, fast disk drives, and multiprocessor architectures to build dedicated systems that were optimized as database servers. These vendors argued that they could deliver much better database performance with a specially designed database engine than with a general-purpose computer system. In some cases, their systems included application-specific integrated circuits (ASICs) that implemented some of the DBMS logic in hardware for maximum speed. Dedicated database systems from companies such as Teradata and Sharebase (formerly Britton-Lee) found some acceptance in applications that involve complex queries against very large databases. However, they never became an important part of the mainstream database market, and these vendors eventually disappeared or were acquired by larger, general-purpose computer companies.

Interestingly, the notion of a packaged, all-in-one database server appliance was briefly rekindled at the end of the 1990s by Oracle Corporation and its CEO, Larry Ellison. Ellison argued that the Internet era had seen the success of other all-in-one products, such as networking equipment and web cache servers. Oracle announced partnerships with several server hardware vendors to build Oracle-based database appliances. Over time, however, these efforts had little market impact, and Oracle’s enthusiasm for database appliances faded from media attention.

Several venture-backed startups have recently embraced the idea of database server appliances once again, this time in the form of database caching servers that reside in a network between the application and an enterprise database. These startups point to the widespread success of web page caching within the Internet architecture, and posit a similar opportunity for data caching. Unlike web pages, however, database contents tend to have an inherent transactional character, which makes the synchronization of cache contents with the main database both much more important (to ensure that requests satisfied by the database cache receive the right response) and much more difficult. Whether the notion of a database caching appliance will catch on remains an open question as of this writing.

6. Benchmark Wars

As SQL-based relational databases have moved into the mainstream of enterprise data processing, database performance has become a critical factor in DBMS selection. User focus on database performance, coupled with the DBMS vendors’ interest in selling high-priced, high-margin, high-end DBMS configurations, has produced a series of benchmark wars among DBMS vendors. Virtually all of the DBMS vendors have joined the fray at some point over the last decade. Some have focused on maximum absolute database performance. Others emphasize price/performance and the cost-effectiveness of their DBMS solution. Still others emphasize performance for specific types of database processing, such as OLTP or OLAP. In every case, the vendors tout benchmarks that show the superior performance of their products while trying to discredit the benchmarks of competitors.

The early benchmark claims focused on vendor-proprietary tests; later, two vendor-independent benchmarks emerged. The Debit/Credit benchmark simulated simple accounting transactions. The TP1 benchmark, first defined by Tandem, measured basic OLTP performance. Even these standardized benchmarks were easy for the vendors to manipulate to produce results that cast them in the most favorable light.
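
To give a sense of what these early OLTP benchmarks measured, here is a minimal sketch of a Debit/Credit-style transaction; the table and column names are illustrative rather than taken from any official specification, and the syntax for starting a transaction varies by product. Each transaction adjusts an account, teller, and branch balance and records a history row:

    -- One Debit/Credit-style banking transaction (illustrative schema)
    START TRANSACTION;
    UPDATE accounts SET balance = balance + 100.00 WHERE account_id = 12345;
    UPDATE tellers  SET balance = balance + 100.00 WHERE teller_id  = 42;
    UPDATE branches SET balance = balance + 100.00 WHERE branch_id  = 7;
    INSERT INTO history (account_id, teller_id, branch_id, amount, posted_at)
    VALUES (12345, 42, 7, 100.00, CURRENT_TIMESTAMP);
    COMMIT;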

In an attempt to bring more stability and meaning to the benchmark data, several vendors and database consultants banded together to produce standardized database benchmarks that would allow meaningful comparisons among various DBMS products. This group, the Transaction Processing Performance Council (TPC), defined a series of official OLTP benchmarks, known as TPC-A, TPC-B, and TPC-C. The Council has also assumed a role as a clearinghouse for validating and publishing the results of benchmarks run on various brands of DBMS and computer systems. The results of TPC benchmarks are usually expressed in transactions per minute (e.g., tpmC), but it’s common to hear the results referred to simply by the benchmark name (e.g., “DBMS Brand X on hardware Y delivered 10,000 TPC-Cs”).

The most recent TPC OLTP benchmark, TPC-C, attempts to measure not just raw database server performance, but the overall performance of a client/server configuration. Modern multiprocessor workgroup-level servers are delivering thousands or tens of thousands of transactions per minute on the TPC-C test. Enterprise-class UNIX-based SMP servers are delivering multiple tens of thousands of tpmC. The maximum results on typical commercially available systems (a multimillion-dollar 64-bit Alpha processor cluster) exceed 100,000 tpmC.

The Transaction Processing Performance Council has branched out beyond OLTP to develop benchmarks for other areas of database performance. The TPC-D benchmark focuses on data warehousing applications. The suite of tests that makes up TPC-D is based on a database schema typical of warehousing environments, and it includes more complex data analysis queries rather than the simple database operations more typical of OLTP environments. Interestingly, the TPC benchmarks specify that the size of the database must increase as the claimed number of transactions per minute goes up. A TPC benchmark result of 5000 tpmC may reflect results on a database of hundreds of megabytes of data, for example, while a result of 20,000 tpmC on the same benchmark may reflect a test on a multigigabyte database. This provision of the TPC benchmarks is designed to add realism to the results, since the size of the database and computer system needed to support an application with demands in the 5000-tpm range is typically much smaller than the scale required to support an application with 20,000-tpm demands.
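
For contrast with the simple Debit/Credit-style transaction sketched earlier, a data warehousing benchmark query looks more like the following; the line_items table and its columns are illustrative, loosely in the spirit of a TPC-D-style pricing summary rather than the official benchmark schema:

    -- Summarize shipped order lines by status over a long date range (illustrative)
    SELECT return_flag,
           line_status,
           SUM(quantity)                        AS total_qty,
           SUM(extended_price)                  AS gross_revenue,
           AVG(extended_price * (1 - discount)) AS avg_net_price,
           COUNT(*)                             AS order_lines
      FROM line_items
     WHERE ship_date <= DATE '1998-09-01'
     GROUP BY return_flag, line_status
     ORDER BY return_flag, line_status;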

In addition to raw performance, the TPC benchmarks also measure database price/performance. The price used in the calculation is specified by the Council as the five-year ownership cost of the database solution, including the purchase price of the computer system, the purchase price of the database software, five years of maintenance and support costs, and so on. The price/performance measure is expressed in dollars per transaction per minute (e.g., “Oracle on a Dell four-way server broke through the $500-per-TPC-C barrier”). For example, a configuration with a five-year cost of $5 million that delivers 10,000 tpmC rates at $500 per tpmC. While higher numbers are better for transactions-per-minute results, lower numbers are better for price/performance measures.

Over the last several years, vendor emphasis on TPC benchmark results has waxed and waned. The existence of the TPC benchmarks, and the requirement that published TPC results be audited, have added a level of integrity and stability to benchmark claims. It appears that benchmarking and performance testing will be part of the database market environment for some time to come. In general, benchmark results can help with matching database and hardware configurations to the rough requirements of an application. On an absolute basis, small advantages in benchmark performance for one DBMS over another will probably be masked by other factors.

7. SQL Standardization

The adoption of an official ANSI/ISO SQL standard was one of the major factors that secured SQL’s place as the standard relational database language in the 1980s. Compliance with the ANSI/ISO standard has become a checkoff item for evaluating DBMS products, so each DBMS vendor claims that its product is compatible with or based on the ANSI/ISO standard. Through the late 1980s and early 1990s, all of the popular DBMS products evolved to conform to the parts of the standard that represented common usage. Other parts, such as the module language, were effectively ignored. This produced slow convergence around a core SQL language in popular DBMS products.

As discussed in Chapter 3, the SQL1 standard was relatively weak, with many omissions and areas that were left as implementation choices. For several years, the standards committee worked on an expanded SQL2 standard that remedied these weaknesses and significantly extended the SQL language. Unlike the first SQL standard, which specified features that were already available in most SQL products, the SQL2 standard, when it was published in 1992, was an attempt to lead rather than follow the market. It specified features and functions that were not yet widely implemented in current DBMS products, such as scroll cursors, standardized system catalogs, much broader use of subqueries, and a new error message scheme. DBMS vendors are still in the process of evolving their products to support the full features of SQL2. In practice, proprietary extensions (such as enhanced support for multimedia data, stored procedures, or object extensions) have often been more important to a DBMS vendor’s success than higher levels of SQL2 compliance.
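
As a concrete illustration of one SQL2 addition, the sketch below declares a scroll cursor, which lets an application move backward and jump to absolute positions in a result set instead of only fetching forward. The table name is illustrative, and the INTO host-variable clauses required in embedded SQL are omitted for brevity:

    -- SQL2-style scroll cursor (illustrative table; INTO clauses omitted)
    DECLARE cust_cursor SCROLL CURSOR FOR
        SELECT cust_id, company, credit_limit
          FROM customers
         ORDER BY company;

    OPEN cust_cursor;
    FETCH NEXT FROM cust_cursor;        -- move forward one row
    FETCH PRIOR FROM cust_cursor;       -- move back one row
    FETCH ABSOLUTE 10 FROM cust_cursor; -- jump directly to the tenth row
    CLOSE cust_cursor;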

The progress of the SQL standards groups continued, with work on a SQL3 standard begun even before the SQL2 standard was published. As delays set in and the number of different areas to be addressed by the next standard grew, the work on SQL3 was divided into separate, parallel efforts focused on the core of the language, a Call-Level Interface (CLI), persistent stored modules (stored procedures), distributed transaction capabilities, time-based data, and so forth. Some of these efforts were published a few years later as enhancements to the 1992 SQL2 standard. A SQL2-compatible CLI standard was released in 1995, as SQL-CLI. A year later, in 1996, a standardized stored procedure capability was released as SQL-PSM. In 1998, object language bindings for SQL were standardized in the SQL-OLB specification. A basic set of OLAP capabilities was published as a SQL-OLAP standard in 2000.
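
To show what the SQL-PSM work standardized, here is a minimal sketch of a stored procedure written in the standard's procedural style; the procedure, table, and column names are illustrative, and most products of that era offered their own proprietary procedural dialects instead:

    -- A simple SQL-PSM style stored procedure (illustrative schema)
    CREATE PROCEDURE raise_credit_limit (IN p_cust_id INTEGER,
                                         IN p_amount  DECIMAL(9,2))
    BEGIN
        UPDATE customers
           SET credit_limit = credit_limit + p_amount
         WHERE cust_id = p_cust_id;
    END;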

While progress continued on these additions to the SQL2 standard, the work on the core language part of SQL3 (called the foundation part of the standard) focused on how to add object capabilities to SQL2. This quickly became a very controversial activity. Relational database theorists and purists took a strong stand against many of the proposed extensions. They claimed that the proposals confused conceptual and architectural issues (e.g., adding substructure beyond the row/column tables) with implementation issues (e.g., performance issues of normalized databases and multitable joins). Proponents of the proposed SQL3 object extensions pointed to the popularity of object-oriented programming and development techniques, and insisted that the rigid row/column structure of relational databases had to be extended to embrace object concepts or it would be bypassed by the object revolution. Their argument was bolstered in the marketplace as the major relational DBMS vendors added object-oriented extensions to their products to blunt the offensive from pure object-oriented databases, and were largely successful with this strategy.

The controversy over the SQL3 work was finally resolved after a seven-year effort, with the publication of the SQL:1999 standard. (The term SQL3, which was used during the development of the standard, has now been replaced by the official term SQL:1999.) The SQL:1999 standard is structured in a series of parts:

  • Part 1: Framework. Describes the overall goals and structure of the standard, and the organization of its other parts.
  • Part 2: Foundation. Is the main body of the standard, focused on the SQL language itself. SQL statements and clauses, transaction semantics, database structure, privileges, and similar capabilities are specified here. This part also contains the object-oriented extensions to SQL.
  • Part 3: Call-Level Interface. Contains the SQL-CLI (1995) extensions to the SQL-92 standard, updated to conform to SQL:1999.
  • Part 4: Persistent Stored Modules. Similarly contains the SQL-PSM (1996) extensions to the SQL-92 standard, updated to conform to SQL:1999.
  • Part 5: Host Language Bindings. Deals with the interactions between procedural host languages (such as C or COBOL) and SQL.
  • Part 9: Management of External Data. Describes how a SQL-based database should manage data external to the database itself.
  • Part 10: Object Language Bindings. Deals with the same issues as Part 5, but for object-oriented languages.

Some parts of the standard are still under development at this writing, as indicated by the missing part numbers. In addition, other SQL-related standardization efforts have broken off into their own separate standards activities. A separate standard is under development for SQL-based handling of multimedia data, such as full-text documents, audio, and video content. This is itself a multipart standard; some parts have already been published. Another separate standard formalizes the embedded-SQL-for-Java work known as SQLJ.

In the progression from SQL1 to SQL2, and then to SQL:1999, the official ANSI/ISO SQL standards have ballooned in scope. The original SQL1 standard was less than 100 pages; the Framework section (Part 1) of the SQL:1999 standard alone is nearly that large. The Foundation section of the SQL:1999 standard runs well over 1000 pages, and the currently published parts, taken together, run over 2000 pages. The broadly expanded scope of the SQL:1999 standard reflects the wide usefulness and applicability of SQL, but the challenge of implementing and conforming to such a voluminous set of standards is very formidable, even for large DBMS vendors with large development staffs.

It’s worth noting that the SQL:1999 standard takes a very different approach to standards conformance claims than the SQL1 and SQL2 standards. The SQL2 standard defined three levels of conformance, Entry, Intermediate, and Full, and laid out the specific features of the standard that must be implemented to claim conformance at each level. In practice, DBMS vendors found some features at each level to be important to their customers, and others relatively unimportant. So virtually all current SQL implementations claim some form of compliance with SQL2, but very few, if any, implement all of the features required for formal Intermediate or Full conformance.

With this experience in mind, the SQL:1999 standards group instead defined only one Core SQL level of conformance, which corresponds roughly to the Entry level of SQL2 plus selected features from the Intermediate and Full levels. Beyond this Core SQL, additional features are grouped together in packages, to which conformance can individually be claimed. There is a package for the SQL-CLI capabilities, one for SQL-PSM, one for enhanced data integrity functions, one for enhanced date and time functions, and so on. This structure allows individual DBMS vendors to pick and choose the areas of the standard that are most important to the particular markets they serve, and makes conformance to parts of the standard more practical.

At this writing, the SQL:1999 standard is too new to fully gauge its impact on the DBMS market. If the experience with SQL2 is any guide, vendors will carefully evaluate individual new pieces of SQL:1999 functionality and seek feedback from their customer base about which ones are useful. Given the substantial new functionality required by SQL:1999 features such as user-defined types and recursive queries, implementing some parts of SQL:1999 will be a multiyear project for even the largest DBMS vendors. In practice, the SQL1 (SQL-89) standard defines the core SQL capabilities supported by virtually all products; the SQL2 (SQL-92) standard represents the current state of the art in large enterprise database products; and the SQL:1999 standard is a roadmap for future development.
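
To illustrate the scale of that new functionality, the sketch below uses the SQL:1999 recursive query facility to walk an employee reporting hierarchy; the employees table and its columns are illustrative:

    -- SQL:1999 recursive query over an illustrative reporting hierarchy
    WITH RECURSIVE org_chart (emp_id, name, manager_id, depth) AS (
        SELECT emp_id, name, manager_id, 0
          FROM employees
         WHERE manager_id IS NULL              -- start at the top of the tree
        UNION ALL
        SELECT e.emp_id, e.name, e.manager_id, o.depth + 1
          FROM employees e
          JOIN org_chart o ON e.manager_id = o.emp_id
    )
    SELECT emp_id, name, depth
      FROM org_chart
     ORDER BY depth, name;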

In addition to the official SQL standard, IBM’s and Oracle’s SQL products will continue to be a powerful influence on the evolution of SQL. As the developer of SQL and a major influencer of corporate IS management, IBM’s SQL decisions have always had a major impact on other vendors of SQL products. Oracle’s dominant market position has given it similar clout when it has added new SQL features to its products. When the IBM, Oracle, and ANSI SQL dialects have differed in the past, most independent DBMS vendors have chosen to follow the IBM or Oracle standards.

The likely future path of SQL standardization thus appears to be a continuation of the history of the last several years. The core of the SQL language will continue to be highly standard. More features will slowly become part of that core, while others will be defined as add-on packages or as new standards in their own right. Database vendors will continue to add new, proprietary features in an ongoing effort to differentiate their products and offer customers a reason to buy.
