Aggregate data with multi-schema tenant architecture



  • We have an application that uses the tenant schema architecture, one database, has multiple schema per "tenant", each tenant is a replica with same tables etc.

    Sometimes we don't know the tenant or have to make global stats, in both case we need the aggregate of a table from each tenant.

    The solution is a stored procedure that will create a view, the view is generated by looping over the tenant names, querying from it and using an union.

    The view's definition ends up being something like : select * from tenant1.table union all ... select * from tenant2.table etc.

    The stored procedure is ran every night in case a new tenant is added to update the view's declaration.

    First request on the view is always slow (more than 1 minute) but then cache hits and it takes 3 seconds.

    Optimizations have been done on the where clauses of the request by adding indexes, as well as union all instead of using union since duplicates are impossible.

    I was wondering if there were better ways, performance-wise to do multi-tenant data aggregates ?

    Database => tenant1_schema => actors, table2, ...
                tenant2_schema => actors, table2, ...
                ...
                public_schema  => aggregateview
    


  • Create a partitioned table and define the tenant's tables as partitions of that table. Then querying that partitioned table will return data for all tenants.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2