Postgres "unnesting" multiple json arrays simultaneouesly with different depths
-
I am trying to explode/unnest a row of jsonb arrays into individual rows and keep any rows that would result in nulls.
Fiddle: https://www.db-fiddle.com/f/pgd6E5yKf8NzPtgboeoC6c/7
Input table: (I am stuck with the input being in this form)
id name color price description 1 ["Banana"] [["Green"]] [[[1]]] [[["Good"]]] 2 ["Banana"] [["Yellow"]] [[[]]] [[[]]] 3 ["Apple","Carrot"] [["Red"],["Orange"]] [[[10]],[[14]]] [[["Round"]],[["Long"]]] 4 ["Grape","Pineapple"] [["Green"],["Brown","Pink"]] [[[2]],[[13],[29]]] [[["Small"]],[["Short"],["Tall"]]] 5 ["Orange","Potato"] [["Orange"],["Purple","Brown"]] [[[2]],[[9],[]]] [[["Bright"]],[["Gross"],[]]] Desired Output:
id name color price description 1 Banana Green 1 Good 2 Banana Yellow null null 3 Apple Red 10 Round 3 Carrot Orange 14 Long 4 Grape Green 2 Small 4 Pineapple Brown 13 Short 4 Pineapple Pink 29 Tall 5 Orange Orange 2 Bright 5 Potato Purple 9 Gross 5 Potato Brown null null I can get the output I want with this query, but I am trying to generate these queries using a templating engine and I'm trying to account for any number of elements being any depth of nested arrays. Trying to template this to handle that will be less than ideal.
-- Gives expected results WITH first_level AS ( SELECT id, jsonb_array_elements(name) AS name, jsonb_array_elements(color) AS color, jsonb_array_elements(price) AS price, jsonb_array_elements(description) AS description FROM food ), second_level AS ( SELECT id, name, jsonb_array_elements(color) AS color, jsonb_array_elements(price) AS price, jsonb_array_elements(description) AS description FROM first_level ) SELECT id, name, color, CASE WHEN price = '[]'::jsonb THEN null ELSE jsonb_array_elements(price) END AS price, CASE WHEN description = '[]'::jsonb THEN null ELSE jsonb_array_elements(description) END AS description FROM second_level
I've tried unnesting them all at once but with pg 9.6 that gives back a bad result and doesnt have the nulls, It works better on pg10+ but it still drops the rows with empty arrays.
Is there a cleaner way to achieve this result? I've run across examples of using lateral joins but none of them are exact matches and I cant figure them out to even get a result close to what I need.
Thanks
-
Unnesting multiple set-returning functions in parallel behaved in mysterious ways (borderline broken) in Postgres 9.6. That got fixed with Postgres 10. See:
- https://stackoverflow.com/a/39864815/939860
The same fix also made set-returning functions in a
CASE
construct illegal. Your query only works on account of that superseded behaviour in Postgres 9.6 and breaks in later versions. Same as my simpler rCTE solution:db<>fiddle https://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=4d523ba7cedc7a660a1cc7697a2bae32
You didn't specify what's allowed in your input format. Making some assumptions, this works in modern Postgres (and in Postgres 9.6, too):
Custom function (accepts non-arrays, too):
CREATE OR REPLACE FUNCTION f_jsonb_array_elements_if_any(jsonb) RETURNS SETOF jsonb LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE AS $func$ BEGIN IF jsonb_typeof($1) = 'array' THEN IF $1 = '[]' THEN RETURN NEXT jsonb 'null'; ELSE RETURN QUERY SELECT jsonb_array_elements($1); END IF; ELSE RETURN NEXT $1; END IF; END $func$;
Main query (with multiple advanced SQL features):
WITH RECURSIVE cte AS ( SELECT id, 1 AS lvl, ARRAY[ord] AS path, t.* FROM food LEFT JOIN LATERAL ROWS FROM ( f_jsonb_array_elements_if_any(name) , f_jsonb_array_elements_if_any(color) , f_jsonb_array_elements_if_any(price) , f_jsonb_array_elements_if_any(description) ) WITH ORDINALITY t(n, c, p, d, ord) ON true
UNION ALL
SELECT id, lvl + 1, path || t.ord, t.*
FROM cte c
LEFT JOIN LATERAL ROWS FROM (
f_jsonb_array_elements_if_any(n)
, f_jsonb_array_elements_if_any(c)
, f_jsonb_array_elements_if_any(p)
, f_jsonb_array_elements_if_any(d)
) WITH ORDINALITY t(n, c, p, d, ord) ON true
WHERE 'array' IN (jsonb_typeof(c.n), jsonb_typeof(c.c), jsonb_typeof(c.p), jsonb_typeof(c.d))
)
SELECT id, (first_value(n) OVER (PARTITION BY id, grp ORDER BY path))->>0 AS name
, c->>0 AS color, p->>0 AS price, d->>0 AS description
FROM (
SELECT count(n) OVER (PARTITION BY id ORDER BY path) AS grp, *
FROM cte c
WHERE lvl = (SELECT max(lvl) FROM cte)
) sub;
db<>fiddle https://dbfiddle.uk/?rdbms=postgres_14&fiddle=83a3320ef6a552ae829da724b18f0578
Cycle with a recursive CTE, unnesting arrays until no array is left.
In the outer
SELECT
only use rows from the lat iteration (max(lvl)
)Where a single element in
name
was combined with multiple in the other columns, we get genuinenull
values (as opposed to ajsonb 'null'
from an empty array). Fill in the last actual value for the same group of rows.Finally extract
text
from thejsonb
wrapper.This is one of the rare occasions where a https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-TABLEFUNCTIONS construct is instrumental. See:
- https://stackoverflow.com/a/27854382/939860
- https://stackoverflow.com/a/54798482/939860