Postgres "unnesting" multiple json arrays simultaneouesly with different depths



  • I am trying to explode/unnest a row of jsonb arrays into individual rows and keep any rows that would result in nulls.

    Fiddle: https://www.db-fiddle.com/f/pgd6E5yKf8NzPtgboeoC6c/7

    Input table: (I am stuck with the input being in this form)

    id name color price description
    1 ["Banana"] [["Green"]] [[[1]]] [[["Good"]]]
    2 ["Banana"] [["Yellow"]] [[[]]] [[[]]]
    3 ["Apple","Carrot"] [["Red"],["Orange"]] [[[10]],[[14]]] [[["Round"]],[["Long"]]]
    4 ["Grape","Pineapple"] [["Green"],["Brown","Pink"]] [[[2]],[[13],[29]]] [[["Small"]],[["Short"],["Tall"]]]
    5 ["Orange","Potato"] [["Orange"],["Purple","Brown"]] [[[2]],[[9],[]]] [[["Bright"]],[["Gross"],[]]]

    Desired Output:

    id name color price description
    1 Banana Green 1 Good
    2 Banana Yellow null null
    3 Apple Red 10 Round
    3 Carrot Orange 14 Long
    4 Grape Green 2 Small
    4 Pineapple Brown 13 Short
    4 Pineapple Pink 29 Tall
    5 Orange Orange 2 Bright
    5 Potato Purple 9 Gross
    5 Potato Brown null null

    I can get the output I want with this query, but I am trying to generate these queries using a templating engine and I'm trying to account for any number of elements being any depth of nested arrays. Trying to template this to handle that will be less than ideal.

    -- Gives expected results
    WITH first_level AS (
      SELECT
          id,
          jsonb_array_elements(name) AS name,
          jsonb_array_elements(color) AS color,
          jsonb_array_elements(price) AS price,
          jsonb_array_elements(description) AS description
      FROM food
    ),
    second_level AS (
    SELECT
          id,
          name,
          jsonb_array_elements(color) AS color,
          jsonb_array_elements(price) AS price,
          jsonb_array_elements(description) AS description
      FROM first_level
      )
    SELECT
          id,
          name,
          color,
          CASE
                WHEN price = '[]'::jsonb THEN null
                ELSE jsonb_array_elements(price)
          END AS price,
          CASE
                WHEN description = '[]'::jsonb THEN null
                ELSE jsonb_array_elements(description)
          END AS description
    FROM second_level
    

    I've tried unnesting them all at once but with pg 9.6 that gives back a bad result and doesnt have the nulls, It works better on pg10+ but it still drops the rows with empty arrays.

    Is there a cleaner way to achieve this result? I've run across examples of using lateral joins but none of them are exact matches and I cant figure them out to even get a result close to what I need.

    Thanks



  • Unnesting multiple set-returning functions in parallel behaved in mysterious ways (borderline broken) in Postgres 9.6. That got fixed with Postgres 10. See:

    • https://stackoverflow.com/a/39864815/939860

    The same fix also made set-returning functions in a CASE construct illegal. Your query only works on account of that superseded behaviour in Postgres 9.6 and breaks in later versions. Same as my simpler rCTE solution:

    db<>fiddle https://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=4d523ba7cedc7a660a1cc7697a2bae32

    You didn't specify what's allowed in your input format. Making some assumptions, this works in modern Postgres (and in Postgres 9.6, too):

    Custom function (accepts non-arrays, too):

    CREATE OR REPLACE FUNCTION f_jsonb_array_elements_if_any(jsonb)
      RETURNS SETOF jsonb
      LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE AS
    $func$
    BEGIN
       IF jsonb_typeof($1) = 'array' THEN 
          IF $1 = '[]' THEN
             RETURN NEXT jsonb 'null';
          ELSE
             RETURN QUERY SELECT jsonb_array_elements($1);
          END IF;
       ELSE
          RETURN NEXT $1;
       END IF;
    END
    $func$;
    

    Main query (with multiple advanced SQL features):

    WITH RECURSIVE cte AS (
       SELECT id, 1 AS lvl, ARRAY[ord] AS path, t.*
       FROM   food
       LEFT   JOIN LATERAL ROWS FROM (
          f_jsonb_array_elements_if_any(name)
        , f_jsonb_array_elements_if_any(color)
        , f_jsonb_array_elements_if_any(price)
        , f_jsonb_array_elements_if_any(description)
          ) WITH ORDINALITY t(n, c, p, d, ord) ON true
    

    UNION ALL
    SELECT id, lvl + 1, path || t.ord, t.*
    FROM cte c
    LEFT JOIN LATERAL ROWS FROM (
    f_jsonb_array_elements_if_any(n)
    , f_jsonb_array_elements_if_any(c)
    , f_jsonb_array_elements_if_any(p)
    , f_jsonb_array_elements_if_any(d)
    ) WITH ORDINALITY t(n, c, p, d, ord) ON true
    WHERE 'array' IN (jsonb_typeof(c.n), jsonb_typeof(c.c), jsonb_typeof(c.p), jsonb_typeof(c.d))
    )
    SELECT id, (first_value(n) OVER (PARTITION BY id, grp ORDER BY path))->>0 AS name
    , c->>0 AS color, p->>0 AS price, d->>0 AS description
    FROM (
    SELECT count(n) OVER (PARTITION BY id ORDER BY path) AS grp, *
    FROM cte c
    WHERE lvl = (SELECT max(lvl) FROM cte)
    ) sub;

    db<>fiddle https://dbfiddle.uk/?rdbms=postgres_14&fiddle=83a3320ef6a552ae829da724b18f0578

    Cycle with a recursive CTE, unnesting arrays until no array is left.

    In the outer SELECT only use rows from the lat iteration (max(lvl))

    Where a single element in name was combined with multiple in the other columns, we get genuine null values (as opposed to a jsonb 'null' from an empty array). Fill in the last actual value for the same group of rows.

    Finally extract text from the jsonb wrapper.

    This is one of the rare occasions where a https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-TABLEFUNCTIONS construct is instrumental. See:

    • https://stackoverflow.com/a/27854382/939860
    • https://stackoverflow.com/a/54798482/939860



Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2