Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Transaction isolation levels relation with locks on table

People also ask

What is the isolation level to be used to lock only the current row?

Cursor stability (CS) - The cursor stability isolation level locks only the current row which the program is accessing. As soon as the program shifts to the next row, the lock in the previous row gets released. The cursor stability fetches only committed rows for the program to access.

Why does locking prevent dirty reads?

The transaction releases read locks when it moves off the current row. It holds write locks until it is committed or rolled back. The transaction waits until rows write-locked by other transactions are unlocked; this prevents it from reading any "dirty" data.

What is locking explain the types and the levels of locking?

At the table level, there are 5 different types of locks. i.e, Exclusive (X), Shared (S), Intent exclusive (IX), Intent shared (IS), and Shared with intent exclusive (SIX) and these locks have already been discussed above. This type of locking level is not suitable for multi-user database management systems.


I want to understand the lock each transaction isolation takes on the table

For example, you have 3 concurrent processes A, B and C. A starts a transaction, writes data and commit/rollback (depending on results). B just executes a SELECT statement to read data. C reads and updates data. All these process work on the same table T.

  • READ UNCOMMITTED - no lock on the table. You can read data in the table while writing on it. This means A writes data (uncommitted) and B can read this uncommitted data and use it (for any purpose). If A executes a rollback, B still has read the data and used it. This is the fastest but most insecure way to work with data since can lead to data holes in not physically related tables (yes, two tables can be logically but not physically related in real-world apps =\).
  • READ COMMITTED - lock on committed data. You can read the data that was only committed. This means A writes data and B can't read the data saved by A until A executes a commit. The problem here is that C can update data that was read and used on B and B client won't have the updated data.
  • REPEATABLE READ - lock on a block of SQL(which is selected by using select query). This means B reads the data under some condition i.e. WHERE aField > 10 AND aField < 20, A inserts data where aField value is between 10 and 20, then B reads the data again and get a different result.
  • SERIALIZABLE - lock on a full table(on which Select query is fired). This means, B reads the data and no other transaction can modify the data on the table. This is the most secure but slowest way to work with data. Also, since a simple read operation locks the table, this can lead to heavy problems on production: imagine that T table is an Invoice table, user X wants to know the invoices of the day and user Y wants to create a new invoice, so while X executes the read of the invoices, Y can't add a new invoice (and when it's about money, people get really mad, especially the bosses).

I want to understand where we define these isolation levels: only at JDBC/hibernate level or in DB also

Using JDBC, you define it using Connection#setTransactionIsolation.

Using Hibernate:

<property name="hibernate.connection.isolation">2</property>

Where

  • 1: READ UNCOMMITTED
  • 2: READ COMMITTED
  • 4: REPEATABLE READ
  • 8: SERIALIZABLE

Hibernate configuration is taken from here (sorry, it's in Spanish).

By the way, you can set the isolation level on RDBMS as well:

  • MySQL isolation level,
  • SQL Server isolation level
  • Informix isolation level (Personal Note: I will never forget about SET ISOLATION TO DIRTY READ sentence.)

and on and on...


As brb tea says, depends on the database implementation and the algorithm they use: MVCC or Two Phase Locking.

CUBRID (open source RDBMS) explains the idea of this two algorithms:

  • Two-phase locking (2PL)

The first one is when the T2 transaction tries to change the A record, it knows that the T1 transaction has already changed the A record and waits until the T1 transaction is completed because the T2 transaction cannot know whether the T1 transaction will be committed or rolled back. This method is called Two-phase locking (2PL).

  • Multi-version concurrency control (MVCC)

The other one is to allow each of them, T1 and T2 transactions, to have their own changed versions. Even when the T1 transaction has changed the A record from 1 to 2, the T1 transaction leaves the original value 1 as it is and writes that the T1 transaction version of the A record is 2. Then, the following T2 transaction changes the A record from 1 to 3, not from 2 to 4, and writes that the T2 transaction version of the A record is 3.

When the T1 transaction is rolled back, it does not matter if the 2, the T1 transaction version, is not applied to the A record. After that, if the T2 transaction is committed, the 3, the T2 transaction version, will be applied to the A record. If the T1 transaction is committed prior to the T2 transaction, the A record is changed to 2, and then to 3 at the time of committing the T2 transaction. The final database status is identical to the status of executing each transaction independently, without any impact on other transactions. Therefore, it satisfies the ACID property. This method is called Multi-version concurrency control (MVCC).

The MVCC allows concurrent modifications at the cost of increased overhead in memory (because it has to maintain different versions of the same data) and computation (in REPETEABLE_READ level you can't loose updates so it must check the versions of the data, like Hiberate does with Optimistick Locking).

In 2PL Transaction isolation levels control the following:

  • Whether locks are taken when data is read, and what type of locks are requested.

  • How long the read locks are held.

  • Whether a read operation referencing rows modified by another transaction:

    • Block until the exclusive lock on the row is freed.

    • Retrieve the committed version of the row that existed at the time the statement or transaction started.

    • Read the uncommitted data modification.

Choosing a transaction isolation level does not affect the locks that are acquired to protect data modifications. A transaction always gets an exclusive lock on any data it modifies and holds that lock until the transaction completes, regardless of the isolation level set for that transaction. For read operations, transaction isolation levels primarily define the level of protection from the effects of modifications made by other transactions.

A lower isolation level increases the ability of many users to access data at the same time, but increases the number of concurrency effects, such as dirty reads or lost updates, that users might encounter.

Concrete examples of the relation between locks and isolation levels in SQL Server (use 2PL except on READ_COMMITED with READ_COMMITTED_SNAPSHOT=ON)

  • READ_UNCOMMITED: do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions. [...]

  • READ_COMMITED:

    • If READ_COMMITTED_SNAPSHOT is set to OFF (the default): uses shared locks to prevent other transactions from modifying rows while the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other transactions until the other transaction is completed. [...] Row locks are released before the next row is processed. [...]
    • If READ_COMMITTED_SNAPSHOT is set to ON, the Database Engine uses row versioning to present each statement with a transactionally consistent snapshot of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions.
  • REPETEABLE_READ: Shared locks are placed on all data read by each statement in the transaction and are held until the transaction completes.

  • SERIALIZABLE: Range locks are placed in the range of key values that match the search conditions of each statement executed in a transaction. [...] The range locks are held until the transaction completes.


The locks are always taken at DB level:-

Oracle official Document:- To avoid conflicts during a transaction, a DBMS uses locks, mechanisms for blocking access by others to the data that is being accessed by the transaction. (Note that in auto-commit mode, where each statement is a transaction, locks are held for only one statement.) After a lock is set, it remains in force until the transaction is committed or rolled back. For example, a DBMS could lock a row of a table until updates to it have been committed. The effect of this lock would be to prevent a user from getting a dirty read, that is, reading a value before it is made permanent. (Accessing an updated value that has not been committed is considered a dirty read because it is possible for that value to be rolled back to its previous value. If you read a value that is later rolled back, you will have read an invalid value.)

How locks are set is determined by what is called a transaction isolation level, which can range from not supporting transactions at all to supporting transactions that enforce very strict access rules.

One example of a transaction isolation level is TRANSACTION_READ_COMMITTED, which will not allow a value to be accessed until after it has been committed. In other words, if the transaction isolation level is set to TRANSACTION_READ_COMMITTED, the DBMS does not allow dirty reads to occur. The interface Connection includes five values that represent the transaction isolation levels you can use in JDBC.