Do datatypes of columns affect query performance?


  • QA Engineer

    Let's suppose I have this table:

    Table "public.orders"
             Column      |     Type      | Collation | Nullable | Default 
        -----------------+---------------+-----------+----------+---------
         o_orderkey      | integer       |           | not null | 
         o_custkey       | integer       |           |          | 
         o_orderstatus   | character(1)  |           |          | 
         o_totalprice    | numeric(12,2) |           |          | 
         o_orderdate     | date          |           |          | 
         o_orderpriority | character(15) |           |          | 
         o_clerk         | character(15) |           |          | 
         o_shippriority  | integer       |           |          | 
         o_comment       | character(79) |           |          |
    

    If I have queries involving o_orderstatus, o_orderpriority, o_clerk or o_comment columns, can I change char(n) datatype to text in order to improve them?



  • Yes. And probably a lot.

    You never gain performance using character(n)(alias char(n)). You don't gain anything at all because that type is outdated, mostly useless, and https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_char.28n.29 . Related:

    • https://dba.stackexchange.com/questions/89429/would-index-lookup-be-noticeably-faster-with-char-vs-varchar-when-all-values-are/89433#89433

    But you saw that bit in the quote from https://www.postgresql.org/docs/current/datatype-character.html where it says:

    apart from increased storage space when using the blank-padded type

    char(n) is the blank-padded type. And column names like comment indicate mostly blank-padded waste. text or varchar remove the bloat and everything around your table becomes substantially faster, as your avg. row size probably shrinks to less than half. More tuples per data page means fewer pages to read and process per query, and that is the most important factor for performance there is.

    While being at it, order columns favorably to make it more efficient, yet:

       Column      |     Type      | Collation | Nullable | Default 
    ---------------+---------------+-----------+----------+---------
     orderkey      | integer       |           | not null | 
     custkey       | integer       |           |          | 
     orderdate     | date          |           |          | 
     shippriority  | integer       |           |          | 
     totalprice    | numeric(12,2) |           |          | 
     orderstatus   | varchar(1)    |           |          | 
     orderpriority | varchar(15)   |           |          | 
     clerk         | varchar(15)   |           |          | 
     comment       | varchar(79)   |           |          |
    

    Why?

    • https://stackoverflow.com/a/7431468/939860

    I kept the length restrictions with varchar(n), but if those are just arbitrary, use text instead. A tiny bit faster, yet. (And less corner-case hassle.) See:

    • https://dba.stackexchange.com/questions/20974/should-i-add-an-arbitrary-length-limit-to-varchar-columns/21496#21496



Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2