Question

Let's imagine my application needs to run a series of consecutive SELECTs (against different tables) to collect various bits of information from the database, and we don't want any of those tables to change while we're collecting the data. With PostgreSQL one can use the "repeatable read" or "serializable" isolation levels; however, if another transaction commits changes to a table we haven't referenced yet, we'll see them even though our transaction had already started, as the following sequence of actions shows:

T1: BEGIN ISOLATION LEVEL SERIALIZABLE;    -- imagine table t has 10 rows here
T2: INSERT INTO t VALUES(1, 2, 3);
T1: SELECT COUNT(*) FROM t;                -- we'll see 11 rows for the rest of the transaction

However, if T1 accessed t before T2 did the insert, it would see 10 rows for the duration of the whole transaction:

T1: BEGIN ISOLATION LEVEL SERIALIZABLE;    -- imagine table t has 10 rows here
T1: SELECT COUNT(*) FROM t;                -- we'll see 10 rows for the rest of the transaction
T2: INSERT INTO t VALUES(1, 2, 3);
T1: SELECT COUNT(*) FROM t;                -- still sees 10 rows

With the above behavior, if we need to access multiple tables during the transaction, many of them may change in the interval between the beginning of the transaction and the moment we access them, and we'll see changes we don't want to see. I understand that's how isolation levels are supposed to work, so no explanation of the behavior itself is needed here.

But then, is there a way to have some kind of "snapshot" starting at a given point in time? Are explicit locks needed in this case?

Solution

That is not a problem.

The serializable transaction (you should use repeatable read in this case; it performs better and gives the same guarantee for this purpose) effectively starts when you execute the first statement, not at BEGIN. The snapshot taken at that moment covers the entire database, and all statements in the same transaction will see that same snapshot, no matter which tables they touch or when.
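
As a minimal sketch of that behavior (the table u is hypothetical, standing in for any table the transaction has not referenced yet), you can pin the snapshot immediately after BEGIN by issuing any cheap statement:

T1: BEGIN ISOLATION LEVEL REPEATABLE READ;
T1: SELECT 1;                              -- first statement: the database-wide snapshot is taken here
T2: INSERT INTO u VALUES(4, 5, 6);         -- committed change to a table T1 has never touched
T1: SELECT COUNT(*) FROM u;                -- does not include T2's row; u appears as it was at SELECT 1
T1: COMMIT;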

So all is consistent, and no explicit locks are needed.
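
And if other sessions ever need to read from exactly the same point in time, PostgreSQL (9.2 and later) can export the snapshot so another transaction can adopt it; the snapshot id below is illustrative, and T3 is a hypothetical third session:

T1: BEGIN ISOLATION LEVEL REPEATABLE READ;
T1: SELECT pg_export_snapshot();                      -- returns an id such as 00000003-0000001B-1
T3: BEGIN ISOLATION LEVEL REPEATABLE READ;
T3: SET TRANSACTION SNAPSHOT '00000003-0000001B-1';   -- must be T3's first statement
T3: SELECT COUNT(*) FROM t;                           -- T3 now sees exactly what T1 sees

The exported snapshot can only be imported while T1's transaction is still open.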
