Aggregate data with multi-schema tenant architecture
We have an application that uses the tenant schema architecture, one database, has multiple schema per "tenant", each tenant is a replica with same tables etc.
Sometimes we don't know the tenant or have to make global stats, in both case we need the aggregate of a table from each tenant.
The solution is a stored procedure that will create a view, the view is generated by looping over the tenant names, querying from it and using an union.
The view's definition ends up being something like : select * from tenant1.table union all ... select * from tenant2.table etc.
The stored procedure is ran every night in case a new tenant is added to update the view's declaration.
First request on the view is always slow (more than 1 minute) but then cache hits and it takes 3 seconds.
Optimizations have been done on the where clauses of the request by adding indexes, as well as union all instead of using union since duplicates are impossible.
I was wondering if there were better ways, performance-wise to do multi-tenant data aggregates ?
Database => tenant1_schema => actors, table2, ... tenant2_schema => actors, table2, ... ... public_schema => aggregateview
Create a partitioned table and define the tenant's tables as partitions of that table. Then querying that partitioned table will return data for all tenants.