Optimize select query with OR and ILIKE



  • I have the following query for my PostgreSQL 13 database:

    SELECT * FROM "user" 
    INNER JOIN "comment" on "comment"."user_id" = "user"."id"
    WHERE "user"."biography" ILIKE '%SomeString%' 
       OR "comment"."text" ILIKE '%SomeString%';
    

    I have a GIN index defined on user biography and another one on comment text

    CREATE INDEX user_bio_idx ON "user" USING gin ("biography" gin_trgm_ops);
    CREATE INDEX comment_txt_idx ON "comment" USING gin ("text" gin_trgm_ops);
    

    Unfortunately, running EXPLAIN ANALYZE ... on my select query reveals that the indexes are not being used and the query is taking 4 to 5 seconds.

    How can I improve its performance? (I already tried UNION and even concatenating the two columns)



  • LIKE Operator with leading '%'

    Indexes cannot be used when a leading '%' is used in a LIKE operator. Picture the common lay out of an Index, it's to put a specific column in order to make SEEKS fast. For example, take your username "Kamal Karain". If I were to look for your name in the phone book (I know, what are phone books?), I could flip straight to the Ks and find your name easily. Now, if I had to search the phone book for '%amal Karain', that might take a bit longer, as now I have no idea where to look in the phone book, as the first letter or letters of your name could be anything.

    Consider using LIKE operators with only trailing '%'. This will allow you to use the index, and hopefully get close to where the data is that you're looking for.

    OR Operators

    Now, OR operators introduce an entirely different issue, and often cause indexes to be ignored, as well. In your case, you have two index. However, the optimizer would prefer to use just one. With the two you have, it would have to decide, is it going to SEEK one index to find column A, and then SEEK a different index to find column B. Or instead, decide it's easier to do a single SCAN of the main table, looking for any row where at least one of those columns equates to true.

    The way around this is to use a UNION and split the OR into two queries. Then each can use the INDEX you've created on those columns.

    For example, try a query like this, and see how it performs.

    SELECT *
    FROM "user" 
    INNER JOIN "comment"
        ON "comment"."user_id" = "user"."id"
    WHERE "user"."biography" ILIKE 'SomeString%'
    

    UNION

    SELECT *
    FROM "user"
    INNER JOIN "comment"
    ON "comment"."user_id" = "user"."id"
    WHERE "comment"."text" ILIKE 'SomeString%';




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2