Autovacuum on tables with high-volume updates
I have a database with two tables whose entire contents are updated every 5 minutes (mainly just a single column). The bigger of the two tables has about 170K rows. Autovacuum seems to run quite frequently on those tables, yet their disk size grows very large (2 GB, lots of dead rows), and after a few weeks the whole database performance is noticeably degraded (higher load, higher read/write latency, higher IOPS). A `VACUUM FULL` solves the issue, restoring the tables to an acceptable size, and performance returns to the expected levels.
Now, I understand that this usage pattern is not ideal for autovacuum, and there don't appear to be any long-running queries blocking it. Still, I have a few questions:
- Why does disk usage keep increasing? I would expect autovacuum to mark dead rows as reusable so the space gets recycled, instead of continuous growth.
- What parameters could I tweak to avoid such performance loss?
- How can I better debug what is preventing autovacuum from cleaning things up properly?
I also noticed the tables' statistics are quite off: the estimated number of rows is an order of magnitude lower than the live row count.
Here is the output of `VACUUM VERBOSE ANALYZE` on one of the two tables:

    vacuuming "public.table1"
    index "leads_pkey" now contains 310038 row versions in 22625 pages
    index "leads_uuid_key" now contains 310038 row versions in 40682 pages
    "table1": found 0 removable, 280589 nonremovable row versions in 14781 out of 34929 pages
    vacuuming "pg_toast.pg_toast_2042410"
    index "pg_toast_2042410_index" now contains 0 row versions in 1 pages
    "pg_toast_2042410": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
    analyzing "public.table1"
    "table1": scanned 30000 of 34929 pages, containing 148231 live rows and 121206 dead rows; 30000 rows in sample, 172585 estimated total rows
`VACUUM FULL VERBOSE ANALYZE` has the output:

    vacuuming "public.table1"
    "table1": found 0 removable, 487921 nonremovable row versions in 34929 pages
    analyzing "public.table1"
    "table1": scanned 13614 of 13614 pages, containing 171987 live rows and 315934 dead rows; 30000 rows in sample, 171987 estimated total rows
If you update all rows of one table in a single statement, the table will be bloated to twice its minimal size (it will contain a live tuple and a dead tuple for each row), and that bloat cannot be removed by autovacuum. If autovacuum doesn't finish fast enough, the next update will only increase the problem.
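You can watch this happen with the cumulative statistics views. A minimal sketch, using the table name `table1` from the question:

```sql
-- Compare live vs. dead tuples and see when autovacuum last ran
SELECT n_live_tup, n_dead_tup, last_autovacuum, autovacuum_count
FROM pg_stat_user_tables
WHERE relname = 'table1';

-- Physical size on disk, including indexes and TOAST
SELECT pg_size_pretty(pg_total_relation_size('public.table1'));
```

Right after a full-table update you would expect `n_dead_tup` to be roughly equal to `n_live_tup`, matching the ~2x bloat described above.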
Ideas for a solution:
Run the update in batches, where each batch updates only a fraction of the rows. Between the batches, run an explicit `VACUUM`. That will keep the bloat at bay.
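A sketch of such a batched update, assuming an integer primary key `id` and a staging table `new_values(id, val)` holding the fresh data (both hypothetical names, adapt to your schema):

```sql
-- Update one slice of the key range per batch
UPDATE table1 AS t
SET    val = n.val
FROM   new_values AS n
WHERE  t.id = n.id
AND    t.id BETWEEN 1 AND 20000;   -- next batch: 20001..40000, and so on

-- Reclaim the dead tuples before the next batch runs
VACUUM table1;
```

With batches of, say, 20000 rows, only a fraction of the table is dead at any one time, so freed space can be reused by the next batch instead of growing the table.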
Accept the bloat of 50%+, but configure autovacuum to run as fast as possible by setting `autovacuum_vacuum_cost_delay = 0` and `maintenance_work_mem = '1GB'`. That way, you have a chance that the table won't be bloated even more.
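As a sketch, `autovacuum_vacuum_cost_delay` can be set per table as a storage parameter, while `maintenance_work_mem` is set instance-wide (the table name is again the one from the question):

```sql
-- Per-table storage parameter: let autovacuum run without throttling
ALTER TABLE table1 SET (autovacuum_vacuum_cost_delay = 0);

-- Instance-wide memory limit for vacuum work; takes effect on reload.
-- autovacuum_work_mem, if set, overrides this for autovacuum workers.
ALTER SYSTEM SET maintenance_work_mem = '1GB';
SELECT pg_reload_conf();
```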
Create the table with a fillfactor of 45, so that it is deliberately bloated from the beginning, and make sure the update only modifies columns that are not indexed. Then you will get HOT updates (https://www.cybertec-postgresql.com/en/hot-updates-in-postgresql-for-better-performance/), which will at least reduce the need for `VACUUM`.
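A sketch of this, again using the question's table name; note that changing the fillfactor on an existing table only affects newly written pages, so existing rows are only repacked by a rewrite such as `VACUUM FULL`:

```sql
-- Keep ~55% of each page free so updated row versions fit on the same page
ALTER TABLE table1 SET (fillfactor = 45);
VACUUM FULL table1;  -- rewrite so existing pages honor the new fillfactor

-- Afterwards, verify that updates are actually HOT:
-- n_tup_hot_upd should track n_tup_upd closely
SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'table1';
```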