Concurrency Control in SQL Server: Transactions

A transaction specifies a sequence of Transact-SQL statements that is used by database programmers to package together read and write operations, so that the database system can guarantee the consistency of data. There are two forms of transactions:

  • Implicit Specifies any single INSERT, UPDATE, or DELETE statement as a transaction unit
  • Explicit Generally, a group of Transact-SQL statements, where the beginning and the end of the group are marked using statements such as BEGIN TRANSACTION, COMMIT, and ROLLBACK

The notion of a transaction is best explained through an example. Suppose that in the sample database, the employee Ann Jones should be assigned a new employee number. The employee number must be modified in two different tables at the same time. The row in the employee table and all corresponding rows in the works_on table must be modified at the same time. (If only one of these tables were modified, data in the sample database would be inconsistent, because the values of the primary key in the employee table and the corresponding values of the foreign key in the works_on table for Ann Jones would not match.) Example 13.1 shows the implementation of this transaction using Transact-SQL statements.

Example 13.1

USE sample;

BEGIN TRANSACTION /* The beginning of the transaction */

UPDATE employee

SET emp_no = 39831

WHERE emp_no = 10102

IF (@@error <> 0)

ROLLBACK /* Rollback of the transaction */

UPDATE works_on

SET emp_no = 39831

WHERE emp_no = 10102

IF (@@error <> 0)

ROLLBACK

COMMIT /*The end of the transaction */

The consistent state of data used in Example 13.1 can be obtained only if both UPDATE statements are executed or neither of them is executed. The global variable @@error is used to test the execution of each Transact-SQL statement. If an error occurs, @@error is set to a negative value and the execution of all statements is rolled back. (The Transact-SQL statements BEGIN TRANSACTION, COMMIT, and ROLLBACK are defined in the upcoming section “Transact-SQL Statements and Transactions”)

NOTE The Transact-SQL language supports exception handling. Instead of using the global variable @@error, used in Example 13.1, you can use TRY and CATCH statements to implement exception handling in a transaction. The use of these statements is discussed in Chapter 8.

The next section explains the ACID properties of transactions. These properties guarantee that the data used by application programs will be consistent.

1. Properties of Transactions

Transactions have the following properties, which are known collectively by the acronym ACID:

  • Atomicity
  • Consistency
  • Isolation
  • Durability

The atomicity property guarantees the indivisibility of a set of statements that modifies data in a database and is part of a transaction. This means that either all data modifications in a transaction are executed or, in the case of any failure, all already executed changes are undone.

Consistency guarantees that a transaction will not allow the database to contain inconsistent data. In other words, the transactional transformations on data bring the database from one consistent state to another.

The isolation property separates concurrent transactions from each other. In other words, an active transaction can’t see data modifications in a concurrent and incomplete transaction. This means that some transactions might be rolled back to guarantee isolation.

Durability guarantees one of the most important database concepts: persistence of data. This property ensures that the effects of the particular transaction persist even if a system error occurs. For this reason, if a system error occurs while a transaction is active, all statements of that transaction will be undone.

2. Transact-SQL Statements and Transactions

There are six Transact-SQL statements related to transactions:

  • BEGIN TRANSACTION
  • BEGIN DISTRIBUTED TRANSACTION
  • COMMIT [WORK]
  • ROLLBACK [WORK]
  • SAVE TRANSACTION
  • SET IMPLICIT_TRANSACTIONS

The BEGIN TRANSACTION statement starts the transaction. It has the following syntax:

BEGIN TRANSACTION [ {transaction_name | @trans_var }

[WITH MARK [‘description’]]]

transaction_name is the name assigned to the transaction, which can be used only on the outermost pair of nested BEGIN TRANSACTION/COMMIT or BEGIN TRANSACTION / ROLLBACK statements. @trans_var is the name of a user-defined variable containing a valid transaction name. The WITH MARK option specifies that the transaction is to be marked in the log. description is a string that describes the mark. If WITH MARK is used, a transaction name must be specified. (The transaction log is discussed in detail later in this chapter, as well as in Chapter 16.)

The BEGIN DISTRIBUTED TRANSACTION statement specifies the start of a distributed transaction managed by the Microsoft Distributed Transaction Coordinator (DTC). A distributed transaction is one that involves databases on more than one server. For this reason, there is a need for a coordinator that will coordinate execution of statements on all involved servers. The server executing the BEGIN DISTRIBUTED TRANSACTION statement is the transaction coordinator and therefore controls the completion of the distributed transaction. (See Chapter 18 for a discussion of distributed transactions.)

The COMMIT WORK statement successfully ends the transaction started with the BEGIN TRANSACTION statement. This means that all modifications made by the transaction are stored on the disk. The COMMIT WORK statement is a standardized SQL statement. (The WORK clause is optional.)

NOTE The Transact-SQL language also supports the COMMIT TRANSACTION statement, which is functionally equivalent to COMMIT WORK, with the exception that COMMIT TRANSACTION accepts a user-defined transaction name. COMMIT TRANSACTION is an extension of Transact-SQL in relation to the SQL standard.

In contrast to the COMMIT WORK statement, the ROLLBACK WORK statement reports an unsuccessful end of the transaction. Programmers use this statement if they assume that the database might be in an inconsistent state. In this case, all executed modification operations within the transaction are rolled back. The ROLLBACK WORK statement is a standardized SQL statement. (The WORK clause is optional.)

NOTE Transact-SQL also supports the ROLLBACK TRANSACTION statement, which is functionally equivalent to ROLLBACK WORK, with the exception that ROLLBACK TRANSACTION accepts a user-defined transaction name.

The SAVE TRANSACTION statement sets a savepoint within a transaction. A savepoint marks a specified point within the transaction so that all updates that follow can be canceled without canceling the entire transaction. (To cancel an entire transaction, use the ROLLBACK statement.)

NOTE The SAVE TRANSACTION statement actually does not commit any modification operation; it only creates a target for the subsequent ROLLBACK statement with the label with the same name as the SAVE TRANSACTION statement.

Example 13.2 shows the use of the SAVE TRANSACTION statement.

Example 13.2

BEGIN TRANSACTION;

INSERT INTO department (dept_no, dept_name)

VALUES (‘d4’, ‘Sales’);

SAVE TRANSACTION a;

INSERT INTO department (dept_no, dept_name)

VALUES (‘d5’, ‘Research’);

SAVE TRANSACTION b;

INSERT INTO department (dept_no, dept_name)

VALUES (‘d6’, ‘Management’);

ROLLBACK TRANSACTION b;

INSERT INTO department (dept_no, dept_name)

VALUES (‘d7’, ‘Support’);

ROLLBACK TRANSACTION a;

COMMIT TRANSACTION;

The only statement in Example 13.2 that is executed is the first INSERT statement. The third INSERT statement is rolled back by the ROLLBACK TRANSACTION b statement, while the other two INSERT statements are rolled back by the ROLLBACK TRANSACTION a statement.

NOTE The SAVE TRANSACTION statement, in combination with the IF or WHILE statement, is a useful transaction feature for the execution of parts of an entire transaction. On the other hand, the use of this statement is contrary to the principle of operational databases that a transaction should be as short as possible, because long transactions generally reduce data availability.

As you already know, each Transact-SQL statement always belongs either implicitly or explicitly to a transaction. The Database Engine provides implicit transactions for compliance with the SQL standard. When a session operates in the implicit transaction mode, selected statements implicitly issue the BEGIN TRANSACTION statement. This means that you do nothing to start an implicit transaction. However, the end of each implicit transaction must be explicitly committed or rolled back using the COMMIT or ROLLBACK statement. (If you do not explicitly commit the transaction, the transaction and all the data changes that it contains are rolled back when the user disconnects.)

To enable an implicit transaction, you have to enable the IMPLICIT_TRANSACTIONS clause of the SET statement. This statement sets the implicit transaction mode for the current session. When a connection is in the implicit transaction mode and the connection is not currently in a transaction, executing any of the following statements starts a transaction:

In other words, if you have a sequence of statements from the preceding list, each statement will represent a single transaction.

The beginning of an explicit transaction is marked with the BEGIN TRANSACTION statement. The end of an explicit transaction is marked with the COMMIT or ROLLBACK statement. Explicit transactions can be nested. In this case, each pair of statements BEGIN TRANSACTION/COMMIT or BEGIN TRANSACTION/ROLLBACK is used inside one or more such pairs. (The nested transactions are usually used in stored procedures, which themselves contain transactions and are invoked inside another transaction.) The global variable @@trancount contains the number of active transactions for the current user.

BEGIN TRANSACTION, COMMIT, and ROLLBACK can be specified using a name assigned to the transaction. (The named ROLLBACK statement corresponds either to a named transaction or to the SAVE TRANSACTION statement with the same name.) You can use a named transaction only in the outermost statement pair of nested BEGIN TRANSACTION/ COMMIT or BEGIN TRANSACTION/ROLLBACK statements.

3. Transaction Log

Relational database systems keep a record of each change they make to the database during a transaction. This is necessary in case an error occurs during the execution of the transaction. In this situation, all previously executed statements within the transaction have to be rolled back. As soon as the system detects the error, it uses the stored records to return the database to the consistent state that existed before the transaction was started.

The Database Engine keeps all stored records, in particular the before and after values, in one or more files called the transaction log. Each database has its own transaction log. Thus, if it is necessary to roll back one or more modification operations executed on the tables of the current database, the Database Engine uses the entries in the transaction log to restore the values of columns that the database had before the transaction was started.

3.1. Before Images, After Images, and Write-Ahead Log

The transaction log is used to roll back or restore a transaction. If an error occurs and the transaction does not completely execute, the system uses all existing before values from the transaction log (called before images) to roll back all modifications since the start of the transaction. The process in which before images from the transaction log are used to roll back all modifications is called the undo activity.

Transaction logs also store after images. After images are modified values that are used to roll forward all modifications since the start of the transaction. This process is called the redo activity and is applied during recovery of a database. (For further details concerning transaction logs and recovery, see Chapter 16.)

Every entry written into the log is uniquely identified using the log sequence number (LSN). All log entries that are part of the particular transaction are linked together, so that all parts of a transaction can be located for undo and redo activities.

In case of a system failure, the Database Engine must be able to restore all data. If the data would be committed first, the data written to disk but not logged before failure could not be restored. For this reason, the Database Engine (and all other relational database systems) writes data changes to a log before the transaction is committed. This process is called write-ahead logging. Hence, the task of write-ahead logging is to provide high availability and consistency in case of failure.

On the other hand, heavy transaction log writes may become the bottleneck, and performance of the whole system can significantly suffer. In that case, the Database Engine supports an option to trade the (moderate) data loss for performance. This feature is called delayed durability and is discussed next.

3.2. Delayed Durability

Delayed durability enables transactions to continue running as if the data, prepared for logging, had been flushed to disk. Actually, all write operations to disk are deferred and are sent to disk together with write operations of other transactions. The system uses a 60KB chunk of log buffer and attempts to flush the log to disk when this 60KB block is full. Delayed durability can be set either at the database level or at the transaction level.

The ALTER DATABASE statement is used together with the DELAYED_DURABILITY option of the SET clause to set delayed durability at the database level. This option has three values: DISABLED, ALLOWED, and FORCED. ALLOWED means that any individual transaction can use delayed durability. FORCED means that all transactions that can use delayed durability will use it. (This can be useful in the case of an existing application where you want to use this mode throughout and also minimize the amount of code that has to be modified.)

If you want to apply delayed durability at the transaction level, use the COMMIT TRANSACTION statement in the following way:

COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);

The main advantage of delayed durability can be achieved if your applications contain mainly short transactions and when the log disk is slow. Delayed durability should not be used if your transactions are usually long-running or if you have high throughput and high concurrency.

4. Editing Information Concerning Transactions and Logs

There are several dynamic management views that can be used to display different information concerning transactions and logs. This section describes two DMVs concerning transactions and two DMVs concerning transaction logs:

  • dm_tran_active_transactions
  • dm_tran_database_transactions
  • dm_db_log_space_usage
  • dm_db_log_stats (introduced in SQL Server 2017)

The sys.dm_tran_active_transactions DMV returns information about transactions of the particular Database Engine instance. The transaction_id column provides the unique ID number of each transaction, the transaction_begin_time column displays the starting time of each transaction, and the numerical values of the transaction_type column provide the type of the particular transaction. The most important values of the transaction_type column are 1 (for read/write transaction), 2 (for read/only transaction), and 3 (for system transaction). Finally, the name column specifies the transaction name.

The following example displays the unique IDs of all active transactions, their types, and the starting times.

Example 13.3

SELECT transaction_id ID, name,transaction_begin_time start,

transaction_type type

FROM sys.dm_tran_active_transactions;

The result is

I will only explain the values of the name column. You can observe that apart from user transactions (the last two rows of the result set), sys.dm_tran_active_transactions also lists worktable in the name column. Worktables are used when you need the tempdb system database for storing temporary result sets. (Note that the output will vary based on the state of your database.)

The sys.dm_tran_database_transactions DMV displays detailed information about the transactions occurring on your Database Engine instance. It provides snapshot data, so the results may vary each time the view is queried. The information from this DMV is similar to information you get using the sys.dm_tran_active_transactions view (see the previous example), but the sys.dm_tran_database_transactions view can provide a more granular level of detail about each transaction.

NOTE The name of this dynamic management view is a misnomer, because the DMV is server-scoped.

The most important columns of the sys.dm_tran_database_transactions DMV are transaction_id, database_id, database_transaction_type, and database_transaction_state.

The first two columns display the unique ID of the transaction and the unique database ID of the database, to which the particular transaction belongs, respectively. The database_ transaction_type column displays the type of a transaction, and the database_transaction_ state column displays the state of a transaction.

Several properties of transaction logs have an impact on the maintenance of databases. The total size of the log and the amount of space used are the most important parameters related to the performance of your system.

There are two ways to display the information concerning logs:

  • DBCC SQLPERF
  • dm_db_log_space_usage

The DBCC SQLPERF command displays statistics concerning transaction log space usage for all databases.

The sys.dm_db_log_space_usage DMV returns space usage information for the transaction log per database, as shown in Example 13.4.

Example 13.4

USE AdventureWorks;

SELECT DB_NAME(database_id) AS DBName,

ROUND(CONVERT(FLOAT,total_log_size_in_bytes/1024)/1024,2)

AS LogSize_in_MB,

ROUND(CONVERT(FLOAT,used_log_space_in_bytes/1024)/1024,2)

AS LogUsedSize_in_MB,

ROUND(used_log_space_in_percent,2) AS LogUsed_Percent

FROM sys.dm_db_log_space_usage;

The result is

DBName                LogSize_in_MB        LogUsedSize_in_MB          LogUsed_Percent

AdventureWorks        529.99               516.95                     97.54

The query in Example 13.4 displays the content of the total_log_size_in_bytes and used_ log_space_in_bytes columns, in megabytes. For this reason, both values are converted in the FLOAT data type, and then calculated in megabytes and rounded.

The sys.dm_db_log_stats DMV returns summary-level attributes and information on transaction log files of databases. You can use this information for monitoring and diagnostics of transaction log health. The query in Example 13.5 displays the last log backup times for the databases in my instance of the Database Engine.

Example 13.5

SELECT name AS ‘DBName’, log_backup_time AS ‘Last log backup time’

FROM sys.databases AS s

CROSS APPLY sys.dm_db_log_stats(s.database_id);

The result is

Example 13.5 “joins” the sys.dm_db_log_stats DMV with the sys.databases catalog view to get database names of the corresponding transaction logs. That way, the last log backup time can be provided for all existing databases.

As you can see from the result of Example 13.5, the backup of the transaction logs of the graph_db database and the sample database happened on March 30, 2019, while the backup of the transaction log of the master database has never been created on my instance.

Source: Petkovic Dusan (2020), Microsoft SQL Server 2019: A Beginner’s Guide, Seventh Edition-McGraw-Hill Education.

Leave a Reply

Your email address will not be published. Required fields are marked *