Inconsistent data when combining (LEFT) JOIN with SELECT FOR UPDATE
-
I have recently stumbled upon a seemingly odd behavior of
SELECT ... FOR UPDATE
when combined with(LEFT) JOIN
. Here is the table structure as well as a scenario to reproduce the result:Table Structure
create table counter ( counter_id serial primary key, current_counter int not null default 0 );
create table diff (
diff_id serial primary key,
diff_increase int not null default 0,
counter_id serial references counter(counter_id) not null
);
Scenario
There are two concurrent transactions A & B, both performing the same queries.
- Transaction A starts with that query and is able to acquire the lock and proceed.
select * from counter left join diff on counter.counter_id = diff.counter_id where counter.counter_id = 1 order by diff.diff_id desc limit 1 for update of counter ;
Transaction B tries to perform the same query but cannot acquire a lock and therefore waits.
Transaction A will do the following queries:
update counter set current_counter = current_counter + 100 where counter_id = 1 ;
insert into diff (diff_increase, counter_id) values (100, 1)
;commit;
- Transaction A has completed and the state of the database should now be the following:
-- counter table counter_id | current_counter ------------------------------ 1 | 200
-- diff table
diff_id | diff_increase | counter_id1 | 50 | 1
2 | 50 | 1
3 | 100 | 1Expected Behavior
Transaction B sees the updated counter (
current_counter = 200
) and the last diff (diff_id = 3
).Actual Behavior
Transaction B continues with the new state of the
counter
table (meaningcurrent_counter = 200
) while thediff_id
is still 2 instead of 3.Is this behavior expected? If so, why does one and the same query see different states of the database? Does this not violate the guarantees of the
READ COMMITTED
isolation level?Tested with PostgreSQL 13 on Linux.
-
In essence, concurrently updated rows are taken into account, but concurrently inserted rows are not.
In default
READ COMMITTED
isolation level, each command can only see rows that have been committed before it began.UPDATE
adds new row versions, not new rows.The inserted row (
diff_id = 3
) in your example is not visible to the concurrent transaction that started before the row had been committed. Taking a lock withFOR UPDATE
has no bearing on the visibility of rows. But the new row version added by the concurrentUPDATE
will be taken into account.Essential quotes from https://www.postgresql.org/docs/current/transaction-iso.html :
1.
[...] a
SELECT
query (without aFOR UPDATE
/SHARE
clause) sees only data committed before the query began;UPDATE
,DELETE
,SELECT FOR UPDATE
, andSELECT FOR SHARE
commands behave the same asSELECT
in terms of searching for target rows: they will only find target rows that were committed as of the command start time.The search condition of the command (the
WHERE
clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row.The whole chapter is a recommended read.
The best course of action for this case might be https://www.postgresql.org/docs/current/transaction-iso.html#XACT-SERIALIZABLE