Join variable number of rows in Postgres into json array



  • I have the following table:

    timestamp | type | value | source
    ----------+------+-------+--------
    2020-01-01| 1    | 10    | 2
    2020-01-01| 2    | 20    | 2
    2020-01-02| 1    | 5     | 4
    2020-01-02| 2    | 6     | 3
    2020-01-02| 5    | 4     | 3
    2020-01-02| 4    | 8     | 1
    

    Each timestamp + type pair is unique. For each timestamp, there can be a variable number of rows (but each with a distinct type). value and source are the actual data.

    Now I'd like to aggregate by timestamp into a json array where, for each timestamp, the available (type, value, source) tuples are included. Ideally, something like:

    [
      {
        "2020-01-01": [{"type": 1, "value": 10, "source": 2}, {"type": 2, "value": 20, "source": 2}]
      },
      {
        "2020-01-02": [{"type": 1, "value": 5, "source": 4}, {"type": 2, "value": 6, "source": 3}, {"type": 5, "value": 4, "source": 3}, {"type": 4, "value": 8, "source": 1}]
      }
    ]
    

    I don't really have strong preferences on the output format, as long as the information is there and grouped correctly, so for example this would be fine too:

    [
      {
        "timestamp": "2020-01-01", "data": [{"type": 1, "value": 10, "source": 2}, {"type": 2, "value": 20, "source": 2}]
      },
      {
        "timestamp": "2020-01-02", "data": [{"type": 1, "value": 5, "source": 4}, {"type": 2, "value": 6, "source": 3}, {"type": 5, "value": 4, "source": 3}, {"type": 4, "value": 8, "source": 1}]
      }
    ]
    

    Or this (which I guess would require casting type to string, which is not a problem):

    [
      {
        "timestamp": "2020-01-01", "1": {"value": 10, "source": 2}, "2": {"value": 20, "source": 2}
      },
      {
        "timestamp": "2020-01-02", "1": {"value": 5, "source": 4}, "2": {"value": 6, "source": 3}, "5": {"value": 4, "source": 3}, "4": {"value": 8, "source": 1}
      }
    ]
    

    Even this, as long as the field ordering is known in advance:

    [
      {
        "2020-01-01": [[1, 10, 2], [2, 20, 2]]
      },
      {
        "2020-01-02": [[1, 5, 4], [2, 6, 3], [5, 4, 3], [4, 8, 1]]
      }
    ]
    

    In short, whatever is easiest/most efficient.



  • WITH cte AS (
        SELECT "timestamp", json_agg(json_build_object('type', "type", 
                                                       'value', "value",
                                                       'source', "source")) intermediate_json
        FROM test
        GROUP BY 1
    )
    SELECT json_agg(json_build_object("timestamp", intermediate_json)) final_json
    FROM cte
    

    https://dbfiddle.uk/?rdbms=postgres_12&fiddle=2fc37233058437189b8e0a8eea05a01b




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2