Postgres too Many References to Single Table is good or bad design



  • I have a Postgres database schema for a web app that needs real-time data and hence a huge volume of select, update, insert queries. Data manipulation is done via Postgres stored procedure mostly

    I have designed each table to have {created, updated, deleted}_by user all points to user table Id. In total, I will have around ~100-150 such references.

    In the users table, I see 100+ lines in reference by

        TABLE "XXX" CONSTRAINT "XXXXX" FOREIGN KEY (updated_user_id) REFERENCES users(id) ON DELETE SET NULL
        TABLE "XXY" CONSTRAINT "XXY" FOREIGN KEY (created_user_id) REFERENCES users(id)
    ...
    ...
    ...~150+
    

    I wanted to know if it's a good design or not and are there any pros and cons of such a database design. Are there any pros and cons of it I would really love to hear from all?

    Note my max data size/year will be < ~50GB/year

    Note I have these indexes on users table

    Indexes:
        "id_x" PRIMARY KEY, btree (id)
        "ABC" btree (id, username, email)
        "XYZ" UNIQUE, btree (lower(email::text))
        "aaa" UNIQUE, btree (lower(username::text))
        "bbb" UNIQUE CONSTRAINT, btree (email)
        "cccc" UNIQUE CONSTRAINT, btree (username)
    


  • It's the right design to enforce referential integrity.
    The only index relevant for the FKs is the one on (id) - the PK index.
    Your multicolumn index on (id, username, email) can still be useful if you join a lot and get https://wiki.postgresql.org/wiki/Index-only_scans out of it. Else, you might drop it.

    With ~ 150 FK references pointing to users(id) make sure the table and PK index are in good shape, i.e. use aggressive autovacuum settings to keep columns statistics and the visibility map up to date and bloat to a minimum. See:

    • https://dba.stackexchange.com/questions/21068/aggressive-autovacuum-on-postgresql/21649#21649

    Then again, since the users table is probably small, the impact is small accordingly.

    Deleting a user needs to update ~ 150 columns in other tables. The big amount of necessary locks may block other operations temporarily or even lead to deadlocks. If you never change users.id anyway and can live with the occasionally broken reference, you might just drop all these FK constraints to save some complexity and performance.

    Either way, be sure to LEFT [OUTER] JOIN users since your FK columns can obviously be NULL.

    Aside: With a (stricter) UNIQUE index on (lower(email)) in place, the additional UNIQUE CONSTRAINT on (email) seems redundant. Maybe you need the index on just (email)? Else, drop the constraint. Same for username.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2