design Postgres database schema for arbitrary JSON attribute queries



  • In my Postgres database, I have got lots of data (a million rows) and the schema looks like this:

    create table test_tb (
      id int,
      tags jsonb
    );
    

    A few queries to be performed on this dataset are:

    1.

    select id, tags ->> 'tag_1' as tag_1
    from test_tb
    where tags ->> 'tag_1'= 'val_1'
    
    select id, tags ->> 'tag_2' as tag_2
    from test_tb
    
    select id, tags ->> 'tag_1'
    from test_tb
    WHERE (tags ->> 'tag_1'='dv_0'
      AND tags ->> 'tag_3'='dv_15')
      AND (tags ->> 'tag_15'='dv_22' OR tags ->> 'tag_5'='dv_6')
       OR (tags ->> 'tag_12'='dv_9'); 
    

    Now the tags in each tuple are completely arbitrary, one tag might be appearing in only one tuple while another in hundreds of them, the number of tags in each tuple is around 20 - 30.

    I tried storing tags in a jsonb column and put GIN indexing on that column but it didn't optimize my queries. Please suggest some alternate schema.



  • You'll have to rewrite the queries so that they can use the GIN index. FOr the first query, that would be:

    SELECT id, tags ->> 'tag_1' as tag_1
    FROM test_tb
    WHERE tags @> '{ "tag_1": "val_1" }';
    

    For the second query, no index can help, since there is no WHERE or ORDER BY clause.

    The third query is tricky, because it contains OR, but since the result set contains the primary key, you can rewrite it to use the index:

       SELECT id, tags ->> 'tag_1'
       FROM test_tb
       WHERE tags @> '{ "tag_1": "dv_0", "tag_3": "dv_15", "tag_15": "dv_22" }'
    UNION
       SELECT id, tags ->> 'tag_1'
       FROM test_tb
       WHERE tags @> '{ "tag_1": "dv_0", "tag_3": "dv_15", "tag_5": "dv_6" }'
    UNION
       SELECT id, tags ->> 'tag_1'
       FROM test_tb
       WHERE tags @> '{ "tag_12": "dv_9" }';
    

    Here, UNION is used instead of the OR. The query becomes longer, but each of the subqueries can use the index.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2