PostgreSQL replication lag without obvious bottleneck
We have PostgreSQL 14.2 running on Ubuntu 18.04 with physical (asynchronous) replication set up. The standby is on an AWS EC2 instance with an EBS volume. When I test inserting data as fast as possible replication lag keeps growing and I'm unsure what the bottleneck is. In the latest case, I ran a transaction that inserted a lot of data, finishing over 2 days ago, with no other activity since, and the standby is still trying to catch up - replication lag hit a peak of 75GB and is now down to 29GB.
When I run
repmgr node statusmultiple times the last received LSN is not changing, only the last replayed LSN:
Last received LSN: 2DE/86000000 Last replayed LSN: 2D7/A7B75278
repmgr node status
Last received LSN: 2DE/86000000
Last replayed LSN: 2D7/B35D0328
I can also see this directly with a query:
select application_name, client_addr, state, pg_wal_lsn_diff(pg_current_wal_insert_lsn(), replay_lsn) from pg_stat_replication;
This returns state=streaming and pg_wal_lsn_diff=~29G.
nethogsalso shows virtually no network activity. This implies to me that there is no problem in receiving data from the primary and the bottleneck is in replaying the changes, but I'm not sure what it is. CPU usage is ~20% (constant). Memory usage is 10 GB out of 15.5 GB. iotop shows reads of 10-20 MB/s and writes of 10-15 MB/s (sometimes with peaks of up to 70 MB/s for "actual disk write").
I've already tried changing the EBS volume type from gp2 with 1500 IOPS to gp3 with 3000 IOPS and this didn't seem to make any difference. How do I figure out what the bottleneck is here and fix it?
The log file on the standby doesn't have any errors that I can see. There are many messages like this:
2022-03-07 06:16:15.814 UTC  LOG: restartpoint complete: wrote 345 buffers (2.1%); 0 WAL file(s) added, 27 removed, 0 recycled; write=269.658 s, sync=0.532 s, total=270.456 s; sync files=19, longest=0.167 s, average=0.028 s; distance=443765 kB, estimate=531639 kB 2022-03-07 06:16:15.814 UTC  LOG: recovery restart point at 2D6/C8237E70 2022-03-07 06:16:15.814 UTC  DETAIL: Last completed transaction was at log time 2022-03-04 23:47:59.635064+00.
The latest is
2022-03-07 08:27:09.653 UTC  DETAIL: Last completed transaction was at log time 2022-03-05 00:02:47.410576+00.
so it is still over 2 days behind!
You are measuring the difference to
replay_lsn. If there is a delay, that doesn't mean that there is a lag in transferring WAL; for that, you'd have to use
If there is a lag in replay, but not in WAL transfer, that means that replay is delayed by a https://max_standby_streaming_delay . You can change that behavior by tuning