Inconsistent data when combining (LEFT) JOIN with SELECT FOR UPDATE



  • I have recently stumbled upon a seemingly odd behavior of SELECT ... FOR UPDATE when combined with (LEFT) JOIN. Here is the table structure as well as a scenario to reproduce the result:

    Table Structure

    create table counter (
      counter_id serial primary key,
      current_counter int not null default 0
    );
    

    create table diff (
    diff_id serial primary key,
    diff_increase int not null default 0,
    counter_id serial references counter(counter_id) not null
    );

    Scenario

    There are two concurrent transactions A & B, both performing the same queries.

    1. Transaction A starts with that query and is able to acquire the lock and proceed.
    select *
      from counter
      left join diff on counter.counter_id = diff.counter_id
     where counter.counter_id = 1
     order by diff.diff_id desc
     limit 1
       for update of counter
    ;
    
    1. Transaction B tries to perform the same query but cannot acquire a lock and therefore waits.

    2. Transaction A will do the following queries:

    update counter
       set current_counter = current_counter + 100
     where counter_id = 1
    ;
    

    insert into diff (diff_increase, counter_id) values (100, 1)
    ;

    commit;

    1. Transaction A has completed and the state of the database should now be the following:
    -- counter table
    counter_id | current_counter
    ------------------------------
    1          | 200
    

    -- diff table
    diff_id | diff_increase | counter_id

    1 | 50 | 1
    2 | 50 | 1
    3 | 100 | 1

    Expected Behavior

    Transaction B sees the updated counter (current_counter = 200) and the last diff (diff_id = 3).

    Actual Behavior

    Transaction B continues with the new state of the counter table (meaning current_counter = 200) while the diff_id is still 2 instead of 3.

    Is this behavior expected? If so, why does one and the same query see different states of the database? Does this not violate the guarantees of the READ COMMITTED isolation level?

    Tested with PostgreSQL 13 on Linux.



  • In essence, concurrently updated rows are taken into account, but concurrently inserted rows are not.

    In default READ COMMITTED isolation level, each command can only see rows that have been committed before it began. UPDATE adds new row versions, not new rows.

    The inserted row (diff_id = 3) in your example is not visible to the concurrent transaction that started before the row had been committed. Taking a lock with FOR UPDATE has no bearing on the visibility of rows. But the new row version added by the concurrent UPDATE will be taken into account.

    Essential quotes from https://www.postgresql.org/docs/current/transaction-iso.html :

    1.

    [...] a SELECT query (without a FOR UPDATE/SHARE clause) sees only data committed before the query began;

    UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time.

    The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row.

    The whole chapter is a recommended read.

    The best course of action for this case might be https://www.postgresql.org/docs/current/transaction-iso.html#XACT-SERIALIZABLE




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2