For each day, get query row count for that day



  • Consider a PostgreSQL table (pseudo-sql):

    CREATE TABLE events (
      time TIMESTAMPTZ,
      success BOOLEAN,
      foreign_id FOREIGN KEY,
      …
    );
    

    https://dbfiddle.uk/?rdbms=postgres_13&fiddle=8f5f3ad39dfd8df808339781ec2544c7

    I can run a query to get the last event for each foreign_id to essentially get its current status. I can therefore see that at NOW() there are N foreign_ids in the success state. AFAIK this involves applying a ROW_NUMBER() over a window grouped by foreign_id then filtering for row_number = 1, or SELECT DISTINCT ON (foreign_id) … ORDER BY time DESC.

    What I would like is the total count of foreign_ids in "success" state for each day in the past x days.

    So for the count for 3 days ago, I would only consider events rows with a time up until 3 days ago, then take the latest/final row for each foreign_id to get its current status 3 days ago, then of all of those rows, count how many were in success state.

    The key thing is that this count would go up and down over time, for example maybe 3 days ago there were 10 foreign_ids in success state but right now there are only 4. I would like to capture this.

    So ideally I get back something that looks like:

    SELECT "day", foreign_ids_in_success_state …
    

    With data that looks something like this:

    +-----+------------------------------------------------------------+
    | Day | Count of Events with 'successful' status up until that day |
    +-----+------------------------------------------------------------+
    | 1   |                             10                             |
    +-----+------------------------------------------------------------+
    | 2   |                              5                             |
    +-----+------------------------------------------------------------+
    | 3   |                              7                             |
    +-----+------------------------------------------------------------+
    | 4   |                              2                             |
    +-----+------------------------------------------------------------+
    

    So what I want is:

    1. for each day irrespective of whether there are events on that day or not
    2. get the latest event for each foreign_id up until that day, e.g. for the day 3 days ago, pretend NOW() is 3 days ago, then get what would be the latest event for each foreign_id 3 days ago, i.e. disregard any rows newer than 3 days ago
    3. of all of those events, count how many were success IS TRUE

    I believe some CROSS JOINs with generate_series() may be required for this kind of thing but I'm having trouble reasoning about it.

    If this kind of thing has a name or relevant terms, I would appreciate learning it/them since I feel like it might not be a unique/rare situation.



  • Would this be what you're looking for?

    SELECT date_trunc('day', ts) AS "day", COUNT(*)
    FROM events
    WHERE success IS TRUE
    GROUP BY date_trunc('day', ts)
    ORDER BY Day DESC
    

    https://dbfiddle.uk/?rdbms=postgres_13&fiddle=e09367af5bb4a1e4fef623df0cc5b35c


Log in to reply
 


Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2