PostgreSQL replication lag without obvious bottleneck



  • We have PostgreSQL 14.2 running on Ubuntu 18.04 with physical (asynchronous) replication set up. The standby is an AWS EC2 instance with an EBS volume. When I test inserting data as fast as possible, replication lag keeps growing and I'm unsure what the bottleneck is. In the latest case, I ran a transaction that inserted a lot of data and finished over 2 days ago, with no other activity since, and the standby is still trying to catch up: replication lag peaked at 75 GB and is now down to 29 GB.

    When I run repmgr node status multiple times, the last received LSN does not change, only the last replayed LSN:

        repmgr node status
        ...
        Last received LSN: 2DE/86000000
        Last replayed LSN: 2D7/A7B75278

        repmgr node status
        ...
        Last received LSN: 2DE/86000000
        Last replayed LSN: 2D7/B35D0328

    I can also see this directly with a query:

    select application_name, client_addr, state, pg_wal_lsn_diff(pg_current_wal_insert_lsn(), replay_lsn) from pg_stat_replication;
    

    This returns state=streaming and a pg_wal_lsn_diff of roughly 29 GB.
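
    The same receive-versus-replay gap can also be checked directly on the standby; a sketch using the standard recovery functions (available since PostgreSQL 10):

        -- on the standby: how far WAL has been received vs. replayed
        select pg_last_wal_receive_lsn() as received,
               pg_last_wal_replay_lsn() as replayed,
               pg_wal_lsn_diff(pg_last_wal_receive_lsn(),
                               pg_last_wal_replay_lsn()) as replay_backlog_bytes;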

    nethogs also shows virtually no network activity. That suggests there is no problem receiving data from the primary and the bottleneck is in replaying the changes, but I'm not sure what it is. CPU usage is a constant ~20%. Memory usage is 10 GB out of 15.5 GB. iotop shows reads of 10-20 MB/s and writes of 10-15 MB/s (with occasional peaks of up to 70 MB/s for "actual disk write").

    I've already tried changing the EBS volume type from gp2 with 1500 IOPS to gp3 with 3000 IOPS and this didn't seem to make any difference. How do I figure out what the bottleneck is here and fix it?

    The log file on the standby doesn't have any errors that I can see. There are many messages like this:

    2022-03-07 06:16:15.814 UTC [30320] LOG:  restartpoint complete: wrote 345 buffers (2.1%); 0 WAL file(s) added, 27 removed, 0 recycled; write=269.658 s, sync=0.532 s, total=270.456 s; sync files=19, longest=0.167 s, average=0.028 s; distance=443765 kB, estimate=531639 kB
    2022-03-07 06:16:15.814 UTC [30320] LOG:  recovery restart point at 2D6/C8237E70
    2022-03-07 06:16:15.814 UTC [30320] DETAIL:  Last completed transaction was at log time 2022-03-04 23:47:59.635064+00.
    

    The latest is

    2022-03-07 08:27:09.653 UTC [30320] DETAIL: Last completed transaction was at log time 2022-03-05 00:02:47.410576+00.

    so it is still over 2 days behind!
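
    Instead of grepping the log, the replay delay can also be read as a time interval directly on the standby:

        -- on the standby: wall-clock age of the last replayed transaction
        select now() - pg_last_xact_replay_timestamp() as replay_delay;

    which should roughly match the "Last completed transaction" timestamps above.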



  • You are measuring the difference to replay_lsn. A lag there doesn't mean that there is a lag in transferring WAL; to measure transfer lag, you'd have to compare against write_lsn or flush_lsn.

    If there is a lag in replay, but not in WAL transfer, that means that replay is being held back on the standby, for example by a replication conflict with queries running there. You can change that behavior by tuning max_standby_streaming_delay.
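
    A sketch of both checks, assuming the standby runs queries that could conflict with replay:

        -- on the primary: transfer lag vs. replay lag, measured separately
        select application_name,
               pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn) as transfer_lag_bytes,
               pg_wal_lsn_diff(flush_lsn, replay_lsn) as replay_lag_bytes
        from pg_stat_replication;

        -- on the standby: counts of queries cancelled by recovery conflicts
        select datname, confl_lock, confl_snapshot, confl_bufferpin, confl_deadlock
        from pg_stat_database_conflicts;

    If conflicts are the cause, lowering max_standby_streaming_delay (default 30s) cancels conflicting standby queries sooner, so replay is held up less; raising it (or setting -1) does the opposite.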



