Effect of reindex on read/replica
jeanid last edited by
I have a posgresql db running with a primary / read (1) configuration. The primary is basically used by a service which updates the db on a daily basis. The read is used for user queries.
When I run some updates on the primary, the performances on the read don't appear to be impacted. However, if I run a REINDEX on a table on the primary then the read queries are literally stopped for the duration of the transaction on the primary, which can be quite long given that it is a 50M+ rows DB with 100GB of data. So our API is basically interrupted for a few 10s of minutes, which is obviously not sustainable.
It's as if the read has a full read/write lock on my table. According to the postgres manual, reindex shouldn't even lock on read the table on the primary node... I might be naive but I thought work done on the primary wouldn't affect the read performances, minus when the wal files are being sent which I assume is done in a pretty compact and fast way. Is there a parameter that needs tuning that could explain that behaviour?
My postgres version
PostgreSQL 11.10 on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
It is working on a Kubernetes cluster (AWS EKS) and I installed it via a helm chart from bitnami. Let me know if there are any parameter you need to see and which ones.
Yep, that is sadly how it works. The index itself is strongly locked while it is rebuilt. The lock is replayed to the standby. The planning of the queries on the standby are now blocked, as they can't figure out if the index might be useful to them without getting at least a weak lock on it, and they block doing that. (You could imagine it doing something else, like just ignoring the unlockable index. But you could also imagine the havoc that that might cause.)
If the clients on the standby use prepared queries which already have a plan in mind and that plan doesn't use that index, they will not generally be blocked. (There are things that might cause them to discard their plans and attempt to re-plan, at which point they too will block)
If you upgrade to at least v12, then you could use the REINDEX...CONCURRENTLY feature instead.
I don't think any of this is special to replication, you get the same dynamics if it was querying directly on the master.