How to gauge performance of switching to PostgreSQL table partitioning?



  • I have a table in a PostgreSQL database containing about 100 million rows. It's mostly read-only, with a monthly batch of writes to load new data.

    It has over 300 columns and is heavily indexed for specific queries, but it's becoming difficult to maintain fast read performance as the data grows and user query patterns change.

    The table is divided roughly evenly by a column with 20 distinct values, and nearly every query filters on that column, so I'm investigating the cost/benefit of migrating to a table partitioned on it.

    However, I've read that migrating an existing table to a partitioned one can be quite involved, especially when the table is referenced by foreign keys, which mine is: https://stackoverflow.com/questions/71074729/how-to-migrate-a-postgresql-table-to-partition-table-referenced-by-foreign-keys

    So before I invest the time, is there a way to ballpark the performance impact of partitioning this table?
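
    For reference, the layout I have in mind is declarative `LIST` partitioning on that column. A minimal sketch (table and column names here are made up; the real table has 300+ columns):

    ```sql
    -- Hypothetical schema standing in for the real 300-column table.
    CREATE TABLE big_table_partitioned (
        id        bigint NOT NULL,
        category  text   NOT NULL,        -- the column with 20 distinct values
        -- ... remaining columns ...
        PRIMARY KEY (id, category)        -- the partition key must be part of the PK
    ) PARTITION BY LIST (category);

    -- One partition per distinct value, e.g.:
    CREATE TABLE big_table_p_a PARTITION OF big_table_partitioned
        FOR VALUES IN ('A');
    ```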



  • Is the dataset larger than RAM, but by less than 20-fold? Can you arrange it so that during one time period you query mostly one value of that column, then during the next period mostly another, and so on? If so, you should be able to get better cache usage.

    If not, then it is hard to see where the increased performance would come from. Maybe seeing an actual example query would help.
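
    To answer the dataset-vs-RAM question concretely, you can compare the table's on-disk footprint against the server's memory (substitute your actual table name):

    ```sql
    -- Heap, index, and total size of the table, to compare against available RAM.
    SELECT pg_size_pretty(pg_table_size('big_table'))          AS heap_size,
           pg_size_pretty(pg_indexes_size('big_table'))        AS index_size,
           pg_size_pretty(pg_total_relation_size('big_table')) AS total_size;
    ```

    If the total is within a small multiple of RAM and access clusters by the filter column, partitioning lets the hot partition's pages and indexes stay cached while cold partitions stay on disk.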



