There are a couple of indexes that might help a query of this kind, like a B-tree index with inverted sort order (possibly multicolumn, or "covering"). See:
https://dba.stackexchange.com/questions/39589/optimizing-queries-on-a-range-of-timestamps-two-columns/39599#39599
Or a GiST or SP-GiST index on daterange. See:
https://stackoverflow.com/a/22111524/939860
However, since your filters are hardly selective, indexes can only do so much. What makes your query expensive is "unnesting" rows with long time ranges into many rows, only to fold duplicate days again with DISTINCT before counting. That can be improved step by step.
Faster EXISTS variant
Your EXISTS variant (https://dbfiddle.uk/?rdbms=postgres_14&fiddle=e81678ab24633937116d87d3a40f2e5a) has an expensive flaw: you still COUNT(DISTINCT dates.day), but there is no need any more, as EXISTS already eliminates duplicates. count(*) is substantially faster. Plus some other minor modifications:
WITH dates AS (
   SELECT day FROM calendar
   WHERE  day BETWEEN '2021-04-14' AND '2022-04-13'
   )
SELECT p.product_id, count(*) AS days
FROM   dates d
CROSS  JOIN products p
WHERE  EXISTS (
   SELECT FROM sales_periods s
   WHERE  s.product_id = p.product_id
   AND    d.day >= s.since
   AND    d.day <= s.till
   )
GROUP  BY 1;
db<>fiddle https://dbfiddle.uk/?rdbms=postgres_14&fiddle=e1dd0550ead55a4450a8bb953219d51a
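To see why count(*) can replace COUNT(DISTINCT ...) here, consider the semantics: EXISTS produces at most one row per (day, product) pair no matter how many sales periods cover that day, while a plain join produces one row per matching period. A plain-Python sketch with hypothetical mini data (not your actual tables):

```python
from datetime import date, timedelta

# One product, two overlapping sales periods covering some of the same days.
days = [date(2021, 4, 14) + timedelta(d) for d in range(10)]   # calendar slice
periods = [(date(2021, 4, 14), date(2021, 4, 18)),             # (since, till)
           (date(2021, 4, 16), date(2021, 4, 20))]             # overlaps the first

# EXISTS semantics: a day counts once if ANY period covers it ...
exists_count = sum(1 for d in days if any(s <= d <= t for s, t in periods))

# ... whereas a plain join yields one row per matching period,
# which is why the original query needed COUNT(DISTINCT day).
join_rows = [(d, p) for d in days for p in periods if p[0] <= d <= p[1]]
distinct_days = len({d for d, _ in join_rows})

print(exists_count, len(join_rows), distinct_days)  # → 7 10 7
```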
Pre-select rows from sales_periods
Another possible optimization, especially since indexes on the main table don't help anyway, or for more selective filter ranges: eliminate irrelevant sales periods early:
WITH s AS (
   SELECT product_id, since, till
   FROM   sales_periods
   WHERE  since <= '2022-04-13'
   AND    till  >= '2021-04-14'
   )
SELECT p.product_id, count(*) AS days
FROM  (
   SELECT day FROM calendar
   WHERE  day BETWEEN '2021-04-14' AND '2022-04-13'
   ) d
CROSS  JOIN products p
WHERE  EXISTS (
   SELECT FROM s
   WHERE  s.product_id = p.product_id
   AND    d.day >= s.since
   AND    d.day <= s.till
   )
GROUP  BY 1;
db<>fiddle https://dbfiddle.uk/?rdbms=postgres_14&fiddle=e1dd0550ead55a4450a8bb953219d51a
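Eliminating irrelevant rows early boils down to the standard interval-overlap test: two closed intervals overlap if and only if each one starts no later than the other ends. A quick sanity check in plain Python with hypothetical dates:

```python
from datetime import date

start, end = date(2021, 4, 14), date(2022, 4, 13)   # the queried window

def overlaps(since, till):
    # Closed intervals [since, till] and [start, end] overlap iff
    # each starts no later than the other ends.
    return since <= end and till >= start

assert overlaps(date(2021, 1, 1), date(2021, 4, 14))      # touches window start
assert not overlaps(date(2020, 1, 1), date(2021, 4, 13))  # ends the day before
assert overlaps(date(2022, 4, 13), date(2023, 1, 1))      # touches window end
```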
Aside:
Your calendar table holds days until the year 2222, which seems wasteful. There is no need to work with a table of 81083 rows when the query only covers one year. But that's a minor issue.
Much faster with range aggregation
What I really want to post is the following query, which uses range aggregation and then counts the days in the resulting range. 200x faster in my tests with your sample data (~ 3 ms vs. ~ 650 ms for your original query). But there's a snag: it needs multiranges, introduced with Postgres 14.
Create this auxiliary function to count days in a datemultirange:
CREATE OR REPLACE FUNCTION f_days_in_multirange(datemultirange)
RETURNS int
LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE
BEGIN ATOMIC
SELECT sum(upper(r) - lower(r))::int
FROM unnest($1) t(r);
END;
COMMENT ON FUNCTION f_days_in_multirange(datemultirange) IS 'Counts days in given datemultirange.
Input with any unbounded range results in NULL value!';
See:
https://dba.stackexchange.com/questions/300507/what-does-begin-atomic-end-mean-in-a-postgresql-sql-function-procedure/300512#300512
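The arithmetic the function relies on: Postgres normalizes discrete ranges like daterange to half-open form [lower, upper), so upper(r) - lower(r) is exactly the number of days in each range, and a datemultirange is a set of disjoint such ranges. A plain-Python sketch of the same logic (not the SQL function itself):

```python
from datetime import date

def days_in_multirange(ranges):
    # ranges: iterable of (lower, upper) half-open date pairs, assumed
    # disjoint -- mirroring a normalized datemultirange.
    return sum((upper - lower).days for lower, upper in ranges)

# '[2021-04-14,2021-04-16]' normalizes to [2021-04-14,2021-04-17): 3 days
print(days_in_multirange([(date(2021, 4, 14), date(2021, 4, 17)),
                          (date(2021, 5, 1), date(2021, 5, 3))]))  # → 5
```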
Then the query can be:
SELECT product_id
, CASE WHEN sales_ct = 1 THEN upper(date_range) - lower(date_range)
ELSE f_days_in_multirange(date_range) END AS days
FROM (
SELECT product_id, count(*) AS sales_ct
, range_agg(daterange(since, till, '[]'))
* multirange(daterange('2021-04-14', '2022-04-13', '[]')) AS date_range
FROM sales_periods
WHERE  since <= '2022-04-13'
AND    till  >= '2021-04-14'
GROUP BY 1
) sub;
db<>fiddle https://dbfiddle.uk/?rdbms=postgres_14&fiddle=e1dd0550ead55a4450a8bb953219d51a
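Conceptually, range_agg() unions overlapping (and adjacent) ranges per product into a normalized multirange, and the * operator clips that multirange to the queried window; counting days then reduces to simple arithmetic per remaining range. A plain-Python analogue with hypothetical periods (not the SQL itself):

```python
from datetime import date, timedelta

DAY = timedelta(days=1)

def merge(ranges):
    """range_agg analogue: union overlapping/adjacent half-open [lo, up) ranges."""
    out = []
    for lo, up in sorted(ranges):
        if out and lo <= out[-1][1]:                  # overlaps or touches previous
            out[-1] = (out[-1][0], max(out[-1][1], up))
        else:
            out.append((lo, up))
    return out

def intersect(ranges, lo, up):
    """Analogue of multirange * multirange: clip each range to the window."""
    return [(max(a, lo), min(b, up)) for a, b in ranges if max(a, lo) < min(b, up)]

# Hypothetical sales periods (closed [since, till]) -> half-open [since, till + 1 day)
periods = [(date(2021, 3, 1), date(2021, 4, 20)),
           (date(2021, 4, 15), date(2021, 5, 10)),
           (date(2021, 6, 1), date(2021, 6, 5))]
halfopen = [(s, t + DAY) for s, t in periods]

window = (date(2021, 4, 14), date(2022, 4, 13) + DAY)
clipped = intersect(merge(halfopen), *window)
days = sum((b - a).days for a, b in clipped)
print(len(clipped), days)  # → 2 32
```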
Most of your resulting date ranges consist of a single range, and most of those come from a single source row. So I added a cheap count sales_ct and used it to take a shortcut to great effect (almost 10x): CASE WHEN sales_ct = 1 THEN .... Depending on your actual data distribution, other shortcuts may be possible.
To understand the query, read up on range types (https://www.postgresql.org/docs/current/rangetypes.html#RANGETYPES-BUILTIN), the aggregate function range_agg() (https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE), and the range operators (https://www.postgresql.org/docs/current/functions-range.html#RANGE-OPERATORS-TABLE), in particular the operator * used to compute the intersection of the ranges.