문제

Given the following SQL pseudo-code:

... open transaction ...
SELECT * FROM users WHERE users.id = 5
... some calculation here ...
INSERT INTO users (id) VALUES (5)
... commit transaction ...

Is REPEATABLE_READ isolation guaranteed to prevent concurrent connections from inserting Users[id=5] between the query and insert, or do I need to use SERIALIZABLE isolation for that?

NOTE: I'm looking for a database agnostic solution if possible. If not, please provide empirical evidence from multiple databases so we can work against some sort of industry consensus.

UPDATE: The original question contained an incorrect query that made use of range-locks. Since the purpose of this question is to focus on queries that do not require range-locks, I have updated the question.

도움이 되었습니까?

해결책 2

No. The REPEATABLE_READ isolation level only guarantees that you will see the same data if you repeat the SELECT within the same transaction, meaning that (SQL) implementations could hide from your transaction insertions performed by other transactions.

The only isolation level able to preclude insertions is SERIALIZABLE because it guarantees that if you have multiple transactions operating concurrently, the results will be as if they had run sequentially in certain order. The specific order in each case is not specified nor reported, but it's guaranteed that there exists one that would have produced the same final results.

One of the conditions of the question is a database-agnostic (and I assume high-level answer), so I will not mention how the different isolation levels work in different DB implementations or how they translate into locks.

다른 팁

REPEATABLE_READ doesn't block the table. It guarantees that the transaction see the same rows at any point. Let me explain it by an example:

Time  Transaction 1                    Transaction2
 1    Begin Tx 1                        
 2                                     Begin Tx 2
 4    Select count(*) from my_tab;        
 5                                     Select count(*) from my_tab;
 6    Insert into ... my_tab;
 7    Commit;
 8                                     Select count(*) from my_tab;
 9                                     Insert into ... my_tab;
 10                                     Select count(*) from my_tab;
 11                                    Commit;
 12   Begin Tx3
 13   Select count(*) from my_tab;

If my_tab has 10 rows then the result of the count will be:

  • Time 4 : 10 Rows
  • Time 5 : 10 Rows
  • Time 8 : 10 Rows because table is in repeateble_read mode if the transaction mode is read_commited also will be 10 rows. But if the Tx is set en read_uncommited the number of rows will be 11.
  • Time 10: Since it is in reapeateble read mode ther the count will be 11 rows (ten originals plus one of the insert in the current transaction). If the tx mode is read_commited the number of rows will be 12 (ten originals plus one insert of the tx1 and one insert of the current tx).
  • Time 13: Here the number of rows will be 12 for all transaction modes.

In serializable mode every transaction is executed in single mode, so while a transaction doesn't ends with commit or rollback no one transaction can start. If this mode guarantees the consistency of database have serious performance issues.

EDIT

From SQL-92 Standard (pages 67 and 68) hosted in http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt (obtained from wikipedia):

An SQL-transaction has an isolation level that is READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, or SERIALIZABLE. The isolation level of an SQL-transaction defines the degree to which the opera- tions on SQL-data or schemas in that SQL-transaction are affected by the effects of and can affect operations on SQL-data or schemas in concurrent SQL-transactions. The isolation level of a SQL- transaction is SERIALIZABLE by default. The level can be explicitly set by the .

The execution of concurrent SQL-transactions at isolation level SERIALIZABLE is guaranteed to be serializable. A serializable exe- cution is defined to be an execution of the operations of concur- rently executing SQL-transactions that produces the same effect as some serial execution of those same SQL-transactions. A serial exe- cution is one in which each SQL-transaction executes to completion before the next SQL-transaction begins.

The isolation level specifies the kind of phenomena that can occur during the execution of concurrent SQL-transactions. The following phenomena are possible:

1) P1 ("Dirty read"): SQL-transaction T1 modifies a row. SQL- transaction T2 then reads that row before T1 performs a COMMIT. If T1 then performs a ROLLBACK, T2 will have read a row that was never committed and that may thus be considered to have never existed.

2) P2 ("Non-repeatable read"): SQL-transaction T1 reads a row. SQL- transaction T2 then modifies or deletes that row and performs a COMMIT. If T1 then attempts to reread the row, it may receive the modified value or discover that the row has been deleted.

3) P3 ("Phantom"): SQL-transaction T1 reads the set of rows N that satisfy some . SQL-transaction T2 then executes SQL-statements that generate one or more rows that satisfy the used by SQL-transaction T1. If SQL-transaction T1 then repeats the initial read with the same , it obtains a different collection of rows.

The four isolation levels guarantee that each SQL-transaction will be executed completely or not at all, and that no updates will be lost. The isolation levels are different with respect to phenomena P1, P2, and P3. Table 9, "SQL-transaction isolation levels and the three phenomena" specifies the phenomena that are possible and not possible for a given isolation level.

           Level                P1         P2         P3
    ---------------------------------------------------------
    | READ UNCOMMITTED     | Possible | Possible | Possible |
    ---------------------------------------------------------
    | READ COMMITTED       | Not      |          |          |
    |                      | Possible | Possible | Possible |
    ---------------------------------------------------------
    | REPEATABLE READ      | Not      | Not      |          |
    |                      | Possible | Possible | Possible |
    ---------------------------------------------------------
    | SERIALIZABLE         | Not      | Not      | Not      |
    |                      | Possible | Possible | Possible |
    ---------------------------------------------------------------
    |                                                             |
    | Note: The exclusion of these phenomena or SQL-transactions  |
    |       executing at isolation level SERIALIZABLE is a        |
    |       consequence of the requirement that such transactions |
    |       consequence of the be serializable.                   |
    ---------------------------------------------------------------

EDIT 2

Ok you're interested in locks and blocks, at this point how the RDMS providers implement this can vary (SQL Server by example is too weird) but for a big picture this can be useful, the following explanation is apply when you are trying to modify/read the same data (rows). There is two types of locks when the RDMS is executing a transaction:

  1. Shared Locks: This kind of lock let other transactions to access to the data. By example many transactions can read the same data simultaneously.
  2. Exclusive Locks: This kind of lock block the resource avoiding other transaction to access it, by example two transactions can't modify at the same time the same data, the transaction that tries to modify the resource must check if there is no exclusive locks over the resource. If there is no exclusive lock over the resource then the transaction gets the exclusive lock and until it release the lock no one transaction can obtain the lock and must wait for release of the exclusive lock.

The second point here is: What is locked ?. Usually there is two types of locks:

  1. Table Level Locks: When a transaction is trying to modify data the transaction get the exclusive lock of all table, other transactions must wait for release of locks by the transaction.
  2. Row Level Locks: When a transaction is trying to modify data the transaction get the exclusive locks of the rows involved in the transaction, other transactions that want to modify the same subset of rows must wait until the first transaction ends.

A note respect to Dead Lock, imagine the following scenario:

Time  Transaction 1                        Transaction 2
 1    Begin Tx 1                        
 2                                         Begin Tx 2
 4    Update table X set... where id = 5        
 5                                         Update table X set ... where id = 5;
 6    Update table X set... where id = 6
 7    Commit;
 8                                         Commit;

If the database is configured at row level lock then Transaction 2 will wait in Time 5 until Time 7 this because Transaction 1 get the exclusive lock first. Now imagine the following example:

Time  Transaction 1                        Transaction 2
 1    Begin Tx 1                        
 2                                         Begin Tx 2
 4    Update table X set... where id = 5        
 5                                         Update table X set ... where id = 6;
 6    Update table X set... where id = 6
 7                                         Update table X set ... where id = 5;
 8    Commit;
 9                                         Commit;

This scenario is know as 'Dead Lock' because in Time 6 the Transaction 1 will wait for the release of lock gained by the Transaction 2 on Time 5 but in the Time 7 the Transaction 2 mus wait for the lock gained by Transaction 1 in Time 4. The different RDBMS manages the dead lock in different ways by example MySQL with InnoDB will raise dead lock exception for Transaction 2 letting the Transaction 1 finish fine.

There is some interesting articles:

Best regards

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top