SQL Transaction Processing: Versioning *

The locking techniques described in the preceding sections are the most widely used techniques for supporting concurrent multiuser transaction processing in relational DBMS products. Locking is sometimes called a pessimistic approach to concurrency, because by locking parts of the database, the DBMS is implicitly assuming that concurrent transactions will probably interfere with one another. In recent years, a different approach to concurrency, called versioning, has been implemented in some DBMS products and has been increasing in popularity. Versioning is sometimes called an optimistic approach to concurrency because in this approach, the DBMS implicitly assumes that concurrent transactions will not interfere with one another.

With a locking (pessimistic) architecture, the DBMS internally maintains one and only one copy of the data for each row in the database. As multiple users access the database, the locking scheme arbitrates the access to this row of data among the users (more precisely, among their concurrent transactions). In contrast, with a versioning (optimistic) architecture, the DBMS will create two or more copies of the data for a row in the database when a user attempts to update the row. One copy of the row will contain the old data for the row, before the update; the other copy of the row will contain the new data for the row, after the update. The DBMS internally keeps track of which transactions should see which version of the row, depending on their isolation levels.

1. Versioning in Operation *

Figure 12-17 shows a simple versioning architecture in action. Transaction A starts the action, reading a row of the PRODUCTS table and finding 139 units of ACI-41004 size 4 widgets available. Transaction B comes along next and updates the same row, reducing the quantity available to 39 units. In response, the DBMS internally creates a new copy of the row. From this point on, if Transaction B rereads the contents of the row, the contents will come from this new copy, since it reflects Transaction B’s updated quantity on hand (39 units). Next, Transaction C comes along and tries to read the same row. Because Transaction B’s update has not yet been committed, the DBMS gives Transaction C the data from the old copy of the row, showing 139 units available. The same thing happens a few seconds later for Transaction D; it will also see 139 units available. Now Transaction B performs a COMMIT operation, making its update of the row permanent. A short time later, Transaction E attempts to read the row. With Transaction B’s update now committed, the DBMS will give Transaction E the data from the new copy, showing 39 units. Finally, Transactions C, D, and E end their database activity with a COMMIT operation.
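The sequence of events in Figure 12-17 can be replayed with a small, purely illustrative version store. The `MultiVersionStore` class below is a teaching sketch, not any real DBMS interface; the quantity Transaction E sees is taken to be the 39 units committed by Transaction B:

```python
class MultiVersionStore:
    """Toy sketch of versioned storage for a single row."""

    def __init__(self, initial_value):
        # List of (txn_id, value, committed) version records, oldest first.
        self.versions = [(None, initial_value, True)]

    def read(self, txn_id):
        # A transaction sees its own uncommitted write, if any;
        # otherwise it sees the newest committed version.
        for tid, value, committed in reversed(self.versions):
            if committed or tid == txn_id:
                return value

    def update(self, txn_id, value):
        # Create a new, uncommitted copy instead of overwriting in place.
        self.versions.append((txn_id, value, False))

    def commit(self, txn_id):
        # Mark this transaction's versions as permanently committed.
        self.versions = [(tid, v, committed or tid == txn_id)
                         for tid, v, committed in self.versions]


# Replay the timeline of Figure 12-17.
row = MultiVersionStore(139)
assert row.read("A") == 139   # A reads 139 units
row.update("B", 39)           # B updates; the DBMS makes a new copy
assert row.read("B") == 39    # B rereads its own new copy
assert row.read("C") == 139   # C still sees the old committed copy
assert row.read("D") == 139   # so does D
row.commit("B")               # B commits its update
assert row.read("E") == 39    # E now sees the new committed copy
```

No transaction in the replay ever has to wait; each read simply resolves to the appropriate version.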

The activity shown in Figure 12-17 meets the serializability requirement for proper DBMS operation. The sequential transaction series A-C-D-B-E would produce the same results as those shown in the figure. (In fact, the series A-D-C-B-E would also produce these results.) Furthermore, the versioning implementation delivers the correct operation without causing any of the transactions to wait. This is not true of the typical locking implementation, as shown in Figure 12-18.

In Figure 12-18, Transaction A again starts the action, finding 139 units of ACI-41004 widgets available. Internally, the DBMS places a shared lock on the row. Transaction B next tries to update the row, reducing quantity on hand to 39 units. If Transaction A is operating at a strict isolation level (such as REPEATABLE READ), Transaction B will be held at this point, because it cannot acquire the required exclusive lock. If Transaction A is operating at a less strict isolation level, the DBMS can allow Transaction B to proceed, giving it an exclusive lock on the row and actually updating the data. The internal row in the database (recall that there is only a single copy of the row in this locking architecture) now shows 39 units available. When Transaction C comes along, it must wait for Transaction B to release its lock unless Transaction C is operating at a very low (READ UNCOMMITTED) isolation level. The same is true of Transaction D. Only after Transaction B has committed its changes can Transactions C and D proceed.
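The contrast with the single-copy, lock-based scheme can be sketched the same way. The `LockedRow` class below is again purely illustrative; it models the strict (REPEATABLE READ) case in which conflicting transactions must wait, and for simplicity "waiting" is represented by raising an exception:

```python
class LockedRow:
    """Toy sketch of a single-copy row guarded by shared/exclusive locks."""

    def __init__(self, value):
        self.value = value      # the one and only copy of the row
        self.shared = set()     # txn ids holding shared (read) locks
        self.exclusive = None   # txn id holding the exclusive (write) lock

    def read(self, txn_id):
        # Must wait (here: raise) if another txn holds the exclusive lock.
        if self.exclusive not in (None, txn_id):
            raise RuntimeError(f"{txn_id} must wait for {self.exclusive}")
        self.shared.add(txn_id)
        return self.value

    def update(self, txn_id, value):
        # Needs the exclusive lock; no other readers or writers allowed.
        if self.shared - {txn_id} or self.exclusive not in (None, txn_id):
            raise RuntimeError(f"{txn_id} must wait")
        self.exclusive = txn_id
        self.value = value      # updated in place; the old data is gone

    def commit(self, txn_id):
        # Release this transaction's locks.
        self.shared.discard(txn_id)
        if self.exclusive == txn_id:
            self.exclusive = None


# Replay the timeline of Figure 12-18 under strict isolation.
row = LockedRow(139)
assert row.read("A") == 139      # A takes a shared lock

try:                             # B cannot get the exclusive lock
    row.update("B", 39)          # while A holds its shared lock
    b_blocked = False
except RuntimeError:
    b_blocked = True
assert b_blocked

row.commit("A")                  # A releases its lock
row.update("B", 39)              # B now takes the exclusive lock

try:                             # C must wait for B's exclusive lock
    row.read("C")
    c_blocked = False
except RuntimeError:
    c_blocked = True
assert c_blocked

row.commit("B")                  # only now can C (and D) proceed
assert row.read("C") == 39
```

Unlike the versioning replay, two of the transactions here are forced to wait before they can touch the row at all.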

Comparing the operations in Figures 12-17 and 12-18, two differences are worth noting. First, and more fundamentally, the versioning approach in Figure 12-17 allows more concurrent transactions to proceed in parallel. The locking approach in Figure 12-18 will, under most circumstances, cause two or more transactions to wait for others to complete and free their locks. The second, and more subtle, difference is that the effective order of serial transaction execution differs between the two figures. As noted, in Figure 12-17 the results match the transaction sequence A-C-D-B-E; in Figure 12-18 they match the sequence A-B-C-D-E. Note that neither order is more correct than the other; the serializability principle requires only that the results produced by the DBMS match some sequence of serial transactions.

The example in Figure 12-17 includes only one updating transaction, so only two copies of the updated row are required (before and after the update). The versioning architecture is easily extended to support more concurrent updates. For each attempt to update the row, the DBMS can create another new row, reflecting the update. With this multiversioned approach, the task of keeping track of which transactions should see which version of the row obviously becomes more complex. In practice, the decision about which version of the row should be visible to each concurrent transaction depends not only on the sequence of database operations, but also on the isolation levels requested by each of the transactions.
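One common way a multiversioned scheme can decide visibility (used here as an illustrative assumption, since the text does not prescribe a particular mechanism) is to tag each committed version with a commit timestamp and give every reader the newest version no newer than its snapshot:

```python
def visible_version(versions, snapshot_time):
    """versions: (commit_time, value) pairs in ascending commit order.

    Returns the value a transaction whose snapshot was taken at
    snapshot_time should see, or None if no version is visible yet.
    """
    result = None
    for commit_time, value in versions:
        if commit_time <= snapshot_time:
            result = value  # newest version not newer than the snapshot
    return result


# Three committed updates to the same row, at times 1, 5, and 9.
history = [(1, 139), (5, 39), (9, 14)]

assert visible_version(history, 3) == 139   # snapshot between commits 1 and 5
assert visible_version(history, 5) == 39    # snapshot exactly at commit 5
assert visible_version(history, 12) == 14   # snapshot after all commits
assert visible_version(history, 0) is None  # snapshot before the row existed
```

Isolation levels then amount to rules about when a transaction's snapshot time is taken: once at transaction start, or afresh at each statement.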

Versioning does not completely eliminate the possibility of deadlocks within the database. The two transactions in Figure 12-13, with their interleaved attempts to update two different tables, each in a different order, will still produce problems, even for a versioning scheme. However, for workloads with a mix of database READ operations and database UPDATE operations, versioning can significantly reduce the lock waits, lock timeouts, and deadlocks associated with shared locks.

2. Versioning Advantages and Disadvantages *

The advantage of a versioning architecture is that, under the right circumstances, it can significantly increase the number of concurrent transactions that can execute in parallel. Concurrent execution is becoming more and more important in large DBMS installations, especially those that support web sites that may have thousands or tens of thousands of concurrent users. Versioning is also becoming more useful as the number of processors on typical DBMS server computer systems increases. Servers containing 16 or more processors are becoming increasingly common, and large DBMS servers may support 64 or more processors in a symmetric multiprocessing (SMP) configuration. These servers can actually execute many database-access applications in parallel by spreading the workload out over many processors.

The disadvantage of a versioning architecture is the internal DBMS overhead that it creates. One obvious overhead is the added memory requirement of storing two or more copies of rows that are being updated. In practice, a more serious overhead is the memory management required to allocate memory for each temporary copy of a row as it is needed (potentially thousands of times per second), and then releasing the memory to be reused when the older copies of the row are no longer needed. An additional overhead is keeping track of which transactions should see which copies of which rows.

Implicitly, a versioning architecture is based on the underlying assumption that most concurrent transactions will tend not to interfere with one another. If this assumption proves accurate (i.e., if concurrently executing transactions mostly access and update different rows, or if the transaction workload is dominated by READ operations rather than UPDATEs), then the added overhead of the versioning scheme will be more than offset by a significant boost in the amount of parallel work that can be performed. If the assumption proves inaccurate (i.e., if concurrently executing transactions tend to access and update the same rows), then the overhead of the versioning technique will tend to become very high, swamping the concurrency gains.

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.
