Concurrency Control Techniques in Database

To avoid concurrency related problems and to maintain consistency, some rules (protocols) need to be made so that system can be protected from such situations and increase the efficiency.

1. Lock-Based Protocols

To maintain consistency, restrict modification of the data item by any other transaction which is presently accessed by some transaction. These restrictions can be applied by applying locks on data items. To  access any data  item, transaction have to  obtain lock on  it.

1.1. Granularity of Locks

The level and type of information that the lock protects is called locking granularity. In other words, the size of data items that the data manager locks is called the locking granularity. The granularity of locks in a database refers to how much of the data is locked at one time. In theory, a database server can lock as much as the entire database or as little as one column of data. The level of locking used depends on the typical access needs of transactions as well as consideration of performance tradeoffs. Locking granularity affects performance since more work is necessary to maintain and coordinate the increased number of locks. To achieve optimum performance, a locking scheme must balance the needs of concurrency and  overhead.  Locking can  occur  on  the following  levels:

  • Coarse Granularity (Table or File or Database Locks): The data manager could lock at a coarse granularity such as tables or If the locks are at coarse granularity, the data manager doesn’t have to set many locks, because each lock covers a great amount of data. Thus, the overhead of setting and releasing locks is low. However, by locking large chunks of data, the data manager is usually locking more data than a transaction needs. For example, even if a transaction T accesses only a few records of a table, a data manager that locks at the granularity of tables will lock the whole table, thereby preventing other transactions from locking any other records of the table, most of which are not needed by transaction T. This reduces the number of transactions that can run concurrently, which both reduces the throughput and increases the response  time  of  transactions.
  • Fine Granularity (Records or Fields Locks): The data manager could lock at a fine granularity, such as records or If the locks are at fine granularity, the data manager only locks the specific data actually accessed by a transaction. These locks do not artificially interfere with other transaction, as coarse grain locks do. However, the data manager must now lock every piece of data accessed by a transaction, which can generate much locking overhead. For example, if a transaction issues an SQL query that accesses tens of thousands of records, a data manager that do record granularity locking would set tens of thousands of locks, which can be quite costly. In addition to the record locks, locks on associated indexes are also needed, which compounds the problem. There is a fundamental tradeoff between amount of concurrency and locking overhead, depending on the granularity of locking. Coarse- grained locking has low overhead but low concurrency. Fine-grained locking has high concurrency but high  overhead.
  • Intermediate Granularity (Pages Locks): The data manager could lock at an intermediate granularity, such as A disk-page or page is the equivalent of a disk-block, which may be described as a section of a disk. A page has a fixed size, such as 4K, 8K, 16K, etc. A table may span several pages, and a page may contain several rows of one or more tables. Page-level locks are the most frequently used of the multi-user DBMS locking. This gives a moderate degree of concurrency with a moderate amount of locking overhead. It works well in systems that do not need to run at high transaction rates, and hence are unaffected by the reduced concurrency, or ones where transactions frequently access many records per page so that page locks are not artificially locking more data than transactions actually access. It also simplifies the recovery algorithms for Commit and Abort. However, for high performance Transaction Processing, record locking is needed, because there are too many cases where concurrent transactions need to lock different records on the same page.

1.2. Concurrency Control Manager (CCM)

The concurrency control manager synchronizes the concurrent access of database transactions to shored objects in the database. It is responsible for maintaining the guarantees regarding the effects of concurrent access to the shored database. It will make sure that the result of transactions running in parallel is identical to the result of some serial execution of the same transactions.

The Concurrency Control Manager grant locks to different transactions. Concurrency Control manager  is  basically  a  programme.

1.3. Types of Lock

There are two  types of locks  that can be  implemated on  a transaction.

  • Shared lock : Shared lock is Read-Only If a transaction Ti has obtained shared lock on data item A then Ti can read A but cannot modify A. It is denoted by S and Ti  is  said  to  be  in  Shared  lock  mode.
  • Exclusive lock : Exclusive lock is Read-Write If a transaction Ti  has obtained exclusive lock on data item A then Ti can both read and modify (write) A. It is denoted by X and Ti is said to be in Exclusive lock mode.

1.4. Compatible Lock Modes

Suppose transaction Ti  has obtained a lock on data item V in mode A. If transaction Tj can also obtain lock on V in mode B then, A is compatible with B or these two lock modes are compatible. Matrix of compatibility of shared and Exclusive lock modes are shown in Figure 8.6.

The table shows that if a transaction Ti    has shared lock on data item V then any other transaction Tj can obtain only shared lock on V at the same time. All other possible combinations are  incompatible.

S –  lock(A)    —     Shared  lock on  data  item A

    X – lock(A) — Exclusive lock on data item A

         read(A) — Read operation  on A

         write(A) — Write operation on A

      unlock(A) — A is free now.

A transaction Ti can hold only those items which are used in it. If any data item V is not free then Ti is made to wait until V is released. It is not desirable to unlock data item immediately after  its  final  access.

Example. Consider the banking system in which you have to transfer 500 from account A to account B. Transaction T1 in Figure 8.7 shows transfer of amount and T2 shows sum of accounts A and B. Initially account A has 1000 and B has 300.

FIGURE 8.7. Transactions T1 and T2 for banking example.

The possible schedule for T1 and T2 is shown in Figure 8.8.

In the schedule shown in Figure 8.8, T2 gives wrong result i.e., 800 instead of  1300  because  unlock A is  performed immediately  after  its final  access.

1.5. Improvement in Basic Structure of Locking

Keep all locks until the end of transaction. The modified transaction T1 and T2 are shown in Figure  8.9.

The transaction T2 cannot obtain lock on A until A is released by T1 but at that time changes are  made  in  both  accounts.

Advantages : The  advantages of locking  are:

  1. In maintains
  2. It solves all the concurrency

Disadvantages : The  disadvantages of locking  are:

  1. It leads to a new problem  that  is Deadlock.

Two possible schedules S1 and S2 for transactions T1 and T2 of Figure 8.9 are shown in Figure 8.10(a),  (b).

In S1, right result is obtained but S2 leads to deadlock because both T1 and T2 are in waiting state and wait for release of B and A respectively. (Shared mode and exclusive mode are incompatible).

1.6. Two-phase Locking Protocol

Two-phase locking protocol guarntees serializability. In this protocol, every transaction issue lock and  unlock  requests  in  two  phases.

  • Growing phase : In this phase, transaction can obtain new locks but cannot release any lock.
  • Shrinking or contracting phase : In this phase, transaction can release locks, but cannot obtain new lock.

Lock Point : In growing phase, the point at which transaction has obtained its final lock is called Lock Point.

In two phase locking protocol, all transactions can be ordered according to their lock points. Transactions  T1 and T2 in  Figure  8.9 follows  two  phase locking  protocol.

  1. Basic Two phase Locking Protocol : The technique described above is known as basic two phase locking Protocol.

Advantages : The advantages of Basic two phase locking protocol are:

    1. It ensures serializability

Disadvantages : The disadvantages of  basic two phase locking  protocol are:

    1. It may cause deadlock.
    2. It may cause cascading rollback.
  1. Strict Two phase Locking Protocol : In strict two phase locking protocol, all exclusive locks are kept  until  the  end  of transaction  or  until  the  transaction

Advantages : The advantages of strict 2PL protocol are:

    1. There is no cascading rollback in strict two phase locking protocol.

Example. Consider partial schedules S1 (in basic two-phase locking) and S2 (in strict two- phase locking)  as  shown in  Figure  8.11.

In S1 there is cascading rollback of T4 due to rollback of T3 but in S2, all exclusive lock are kept till end and avoid cascading rollback as shown  in Figure 8.11(a), (b).

  1. Rigorous Two-phase Locking Protocol : In rigorous two phase locking protocol, all locks are kept until the transaction It means both shared and exclusive locks cannot be released till  the  end  of  transaction.

1.7.  Lock Conversion

To improve efficiency of two-phase locking protocol, modes of lock can be converted.

  • Upgrade : Coversion  of shared  mode  to exclusive mode  is  known as upgrade.
  • Downgrade : Conversion of exclusive mode to shared mode is known as downgrade.

Consider transaction T5 as shown in Figure 8.12.

In T5, an exclusive lock of A is needed after sometime so first obtain shared lock on A and then upgrade it. During this period any other transaction can also obtain shared lock of A and hence  increase efficiency.

2. Graph Based Protocols

In graph based protocols, an additional information is needed for each transaction to know, how they will access data items of database. These protocols are used where two-phase protocols are not applicable but they required additional information that is not required in two-phase protocols.  In graph  based protocols  partial  ordering is  used.

Consider, a set of data items D = d1, d2, …, di, dj, …, dz}. If di dj then, if any transaction T want  to  access  both  di and dj then  T  have  to  access  di first then  dj.

Here, tree protocol is used in which D is viewed as a directed acyclic graph known as database graph.

Rules for tree protocol:

  1. Any transaction T can  lock  any data  item  at first time.
  2. After first lock if T wants to lock any other data item v then, first, T have to lock its parent.
  3. T cannot release all exclusive locks before commit or abort.
  4. Suppose T has obtained lock on data item v and then release it after Then T cannot obtain any lock on v again.

All the schedules under graph based protocols maintain serializability. Cascadelessness and recoverability are maintained by rule (3) but this reduces concurrency and hence reduce efficiency.

Alternate approach : Here rule (3) is modified. Modified version is (3) T can release any lock at any time.

But in alternate approach there may be cascading rollback.

For each data item, if any transaction performed the last write to some data item and still uncommitted  then  record  that  transaction.

Commit dependency : A transaction Ti has commit dependency upon Tj if Ti wants to read a uncommitted data item v and Tj is the most recent transaction which performed write operation on  v.

If any  of  these  transactions  aborts,  Ti  must  also  be  aborted.

Consider the tree structured database graph as shown in Figure 8.13.

Suppose we want to operate

G = G +  10  and H = H – 5

then possible  transaction is  as shown  in Figure  8.14.

Advantages : The advantages of graph based protocols are:

  1. There is no deadlock.
  2. Unlike in two-phase, data items can be unlocked at any time, that reduces waiting time.
  3. More concurrency.

Disadvantages : Those items have to be locked that have no use in transactions. This results in increased locking overhead. (locking of A, B and E to lock H in T6 of Figure 8.14).

3. Timestamp-based Protocols

In two-phase and graph based protocols, conflict transactions are found at run time. To order them in  advance,  use  timestamp  based  protocols.

Note: (Conflict transactions are  those transactions  that need  same data  items during  execution).

For each transaction Ti, assign a unique fixed number or timestamp before it starts execution. It is denoted by TS(Ti). For any two transactions Ti and Tj, TS(Ti) > TS(Tj) or TS(Tj) < TS(Ti) but never be equal. Timestamps can be given by using:

  • System clock : System clock time is equal to timestamp of transaction.
  • Logical counter : Value of counter is equal to timestamp of Counter is incremented every time  after  assigning  new  timestamp.

Two timestamps are associated with each data item v of database. These are:

  • Write–TS(v) : Write timestamp of any data item is equal to the timestamp of most recent transaction that  executes  write(v)  successfully and  write–TS(v).
  • Read–TS(v) : Read timestamp of any data item is equal to the timestamp of most recent transaction that executed  read(v)  successfully and  read–TS(v).

3.1.  The Timestamp-Ordering Protocol

To order conflicting read and write operations use the following rules:

1.    For any  transaction  Ti,  to execute  read(v):
  • If TS(Ti) < write–TS(v) then read(v) is rejected and rollback Ti because Ti tries to read the old  value  of which  is  overwritten  by any  other
  • If TS(Ti) > write–TS(v), then read(v) is executed and read–TS(v) is set to maximum of TS(Ti) and  read–TS(v).
2.    For any  transaction  Ti,  to  execute  write(v):
  • If TS(Ti) < read-TS(v) then Ti tries to write that value of v which was used previously by other transaction. Write(v) is rejected and Ti is rollbacked (to avoid inconsistent analysis problem).
  • If TS(Ti) < write-TS(v) then Ti try to write obsolete value of v. So, write(v) is rejected and Ti is rollbacked  (to avoid  Lost –  Update problem).
  • Otherwise execute write-TS(v).

Advantages : The advantages of time  stamp based protocols are:

  1. It maintains serializability.
  2. It is free from deadlock.
  3. It is free from cascading rollbacks.

Disadvantages : The disadvantages of time  stamp based protocols are:

  1. Starvation is possible for long transactions which were forcefully restarted due to conflicting short transactions.

To avoid starvation, block conflicting transactions for sometime to allow completion of long transactions.

  1. Schedules are not recoverable.

3.2. Thomas Write Rule

To allow more concurrency in timestamp based protocol timestamp ordering protocol is slightly modified  and is  known as  Thomas Write  Rule.

In Thomas Write Rule, rules for read operation remains unchanged, there is slight variation in write operation, that is as follows:

For any  transaction  Ti, to  execute  write(v).

  1. Same as in timestamp ordering
  2. If TS(Ti) < write-TS(v) then Ti try to write obsolete value of v. So, no need to rollback Ti, just ignored  the  write

Consider Figure 8.15 which shows the schedule for Transactions T7 and T8 and TS(T7)<TS(T8).

So, T7 can successfully complete read(v) operation. Then, T8 can successfully complete write(v) operation. But when T7 attempts write(v) operation, it is rejected because TS(T7) < Write-TS(v) [write-TS(v) = TS(T8)].

But, according to timestamp ordering protocol, T7 is rolled back which is not required. Since the value of v which T7 wants to write will never be used.

  • For any transaction Ti such that TS(Ti )<TS(T8) that want to read the value of v [read (v)] will be rejected and rolled back because TS(Ti)<write-TS(v).
  • For any transaction Ti such that TS(Ti)>TS(T8) that want to execute read(v) must read the value written by T8  because read(v) in  T7 is  not possible  in any

So, ignore or delete read(v) in T7.

4. Validation Based Protocols

Validation Based Protocols are used where most of the transactions are read-only and conflicts are low. In these protocols, every transaction Ti have to go through the following phases dependending upon its type :

  • Read Phase : In read phase Ti reads all the data items and store them in temporary variables (local variables of Ti). After reading, all the write operations are made on the temporary variables instead of actual Here changes are local and leaved in actual database.
  • Validation Phase : In validation phase, a validation test is performed to determine whether changes in actual database can be
  • Write Phase : If Ti cleared the validation test then actual database is changed according to temporary Here actual changes are made in database. Validation test ensures violation free execution of transactions. To determine the time to perform validation test, use timestamps. Each transaction Ti is associated with three timestamps. These are:
    • Start(Ti) : It gives time when Ti starts execution.
    • Validation(Ti) : It gives time when Ti finished its read phase and starts its validation phase.
    • Finish(Ti) : It gives time when Ti finished execution or write If any transaction Ti failed in validation test then it is aborted and rollback.

To clear the validation test by transaction Tj, for all transactions such that TS(Ti) < Tj, Tj must satisfy one of the following conditions:

  • If Finish(Tj) < Start(Tj) then there is  no conflict because  they are in  serial order.
  • If Start(Tj) < Finish(Ti) < Validation(Tj). This condition ensures that actual writes to database for Ti and Tj does not overlap.

Example. Consider  the schedule  as  shown in  Figure  8.16.

Suppose, initially A = 10 and B = 5 then T9 display A+B = 15 because T10 first write new values of A and B to local variables and validate after T9 finished its execution.

Advantages : The advantages of validation based protocols are:

  1. It maintains serializability
  2. It is free from cascading rollback.
  3. Less overhead then other protocols.

Disadvantages : The disadvantages of validation based protocols are:

  1. Starvation of long transactions due  to  conflicting short  transactions.

5. Multiversion Concurrency Control Protocols

Concurrency techniques discussed so far are suffering from delays for read operations, even sometimes they are rejected. To overcome this disadvantage, multiversion protocols are used that maintain  different  versions  of  data  items.

In multiversion  protocols  each write  operation  write(v) creates  a  new version  of  v.

When any transaction wants to read v then control manager selects the appropriate version of  for  read(v).

It increases the performance because now there is no  need to delay read operations.

5.1. Multiversion Timestamp Ordering

In multiversion techniques, timestamps can be used for ordering transactions. Each transaction Ti  is  associated  with  a unique  and  fixed  timestamp  TS(Ti).

Each data  item  Vn (is  version of  V)  has Three  fields:

  • Content : This is the value of data item for a particular
  • Write-TS(Vn) : This is equal to the timestamp of the transaction that created Vn.
  • Read-TS(Vn) : This is equal to the timestamp of most recent transaction that successfully executes read(Vn).

Multiversion Timestamp ordering ensures serializability. Consider a transaction Ti and Vn be any data item of version n, whose write timestamp is less than or equal to TS(Ti).

  • Read(Vn) by Ti : Contents of Vn are returned to read request.
  • Write(Vn) by Ti : Ti rolls back if TS(Ti) < Read–TS(Vn). If TS(Ti) = Write–TS(Vn) then a new version  of  V  is  created  with  new contents.

Database never maintains all versions of V, old versions of V those are no longer needed are removed  from  database.

Suppose two versions of V exists, i.e., Vm and Vn both having write stamp less than timestamp of oldest transaction of system then delete the older of two versions Vm and Vn.

Advantages : Any read request never waits  and never fails.

Disadvantages The disadvantages  of  multiversion time  stamp or  during  are:

  1. Overhead due to updation of read timestamp for every read request.
  2. Conflicting transactions result rollbacks.

5.2.  Multiversion Two-phase Locking

In multiversion two-phase locking, advantages of both multiversion concurrency control techniques and two-phase locking technique are combined. It is also helpful to overcome the disadvantages of  multiversion  timestamp  ordering.

Single timestamp is given to every version of each data item. Here timestamp counter (TS-counter) is used instead of system clock and logical counter. Whenever a transaction commits, TS-counter  is  incremented  by  1.

Read only transactions are based upon multiversion timestamp ordering. These transactions are associated with  timestamp equal  to the  value of  TS-counter.

Updations are based upon rigrous two-phase locking protocol.

Source: Gupta Satinder Bal, Mittal Aditya (2017), Introduction to Basic Database Management System, 2nd Edition-University Science Press (2017)

Leave a Reply

Your email address will not be published. Required fields are marked *