Ringstorage: DELETE based on table usage
I am very new to LARGE DATABASE related stuff and Postgres, We plan to have a Postgres db with a million of rows that can grow up to Terabytes because rows include binary data (images).
We plan to implement two features of automatic deletion:
- When timestamps of rows are older than (x) days
- When database size on disk reaches a configured limit (like "8 TB")
I struggle with the second feature. If I understand correctly, only
VACUUM FULLreclaims disk space, but it also blocks the table, that's not possible because constantly rows are inserted and there is no time slot for that. Normal
VACUUMjust marks the rows as dead and does not reclaim disk space.
My program monitors the table size. When it reaches say 95% (usage) of the configured limit I delete some rows and do a normal
VACUUM. The table size doesn't shrink and on next check it detects that the table size is over 95% (usage) again.
I am curious about if there are better solutions to this problem?
After deletion and vacuum, you can check with https://www.postgresql.org/docs/current/pgfreespacemap.html to see how much reusable space there is. (You will probably need to check the toast table rather than the main table since most of the size is in videos).
Or you can partition and then drop the oldest partition either when it is too old or when the total space is too high. This will return space to the OS immediately, and will be much less IO and CPU load than deleting individual tuples and vacuuming all the time.