Query result precedence if filter columns have values



  • I have a table "customer_config" with these columns:

    company (varchar)
    warehouse (numeric)
    section (numeric)
    config_keyword (varchar)
    config_value (varchar)
    

    The two config_* columns can apply to an entire company (warehouse and section are null), an entire warehouse within a company (section is null), or a section within a warehouse.

    So we could have a default row for the company, and then one or more rows that override a configuration value for a specific warehouse or warehouse & section.

    I want to return only the most specific row for a given company, warehouse, and section. Something like this pseudocode:

    results = select * from customer_config where (all match)
    if results empty
        results = select * from customer_config where (company_code and warehouse match)
    if results empty
        results = select * from customer_config where (company_code matches)
    

    The most specific row shall take precedence.

    Update

    There can be multiple entries for the same config_keyword on the same level.
    Is it also possible to return multiple rows for a single keyword?



  • "Only the most specific row"

    While looking for a single result row (like your original question indicated):

    This is a bit more verbose, but very clear and probably fastest if supported with an index on (company_code, warehouse, section) - like you should probably have anyway (depends on undisclosed info).

    SELECT * FROM customer_config
    WHERE  (company_code, warehouse, section) = ($1, $2, $3)
    

    UNION ALL
    SELECT * FROM customer_config
    WHERE (company_code, warehouse) = ($1, $2)

    UNION ALL
    SELECT * FROM customer_config
    WHERE company_code = $1
    LIMIT 1;

    Postgres will stop executing as soon as a row has been found. Test with EXPLAIN ANALYZE, you'll see "never executed" for remaining sub-SELECTs.

    Note that LIMIT 1 applies to the whole query, not the last SELECT. (You'd need parentheses to change that.) Similar:

    • https://dba.stackexchange.com/questions/184287/how-to-force-order-of-where-clauses-in-postgresql/184309#184309

    "All rows for the most specific match"

    If multiple rows can exist on each level.

    Can be solved with pure SQL of course. For instance with a chain of CTEs. But this custom PL/pgSQL function should be more efficient:

    CREATE OR REPLACE FUNCTION trade_volume (_company_code varchar, _warehouse numeric, _section numeric)
      RETURNS SETOF customer_config
      LANGUAGE plpgsql STABLE PARALLEL SAFE AS
    $func$
    BEGIN
       RETURN QUERY
       SELECT * FROM customer_config
       WHERE (company_code, warehouse, section) = ($1, $2, $3);
    

    IF FOUND THEN RETURN; END IF;

    RETURN QUERY
    SELECT * FROM customer_config
    WHERE (company_code, warehouse) = ($1, $2);

    IF FOUND THEN RETURN; END IF;

    RETURN QUERY
    SELECT * FROM customer_config
    WHERE company_code = $1;
    END
    $func$;

    Call:

    SELECT * FROM trade_volume ('my_comany_code', 123456, 123);
    

    Be sure to have the index mentioned above.

    If the first query returns any rows, the function is done. The rest is not even planned. Etc.

    Related:

    • https://stackoverflow.com/a/50380486/939860
    • https://dba.stackexchange.com/questions/76744/writing-a-view-that-caches-to-another-table/76750#76750

    I made the function PARALLEL SAFE to allow parallelism in Postgres 14 or later. (Only relevant for big tables.) Quoting the https://www.postgresql.org/docs/current/release-14.html#id-1.11.6.7.5.9 :

    Allow plpgsql's RETURN QUERY to execute its query using parallelism (Tom Lane)




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2