What is the performance overhead of IFNULL in the select statement of mysql 5.7+ query



  • I've recently made an upgrade to a mysql query performed by my web server when serving request. There are two main permutations of the query, one will typically be limited to returning 500 rows and another will be limited to a particular value of a column (and is properly indexed) resulting in 5-10k results.

    The original query looks like this...

    SELECT v1, v2, v3 FROM t1;
    

    The new query looks like the following.

    SELECT v1, v2, v3, IFNULL(t2.u1, 0), IFNULL(t2.u2, '')
        FROM t1
        LEFT JOIN t2 ON t1.id = t2.id;
    

    Changing the query lead to maxing out my database connections and subsequently timing out most of my web server requests.

    Also note, t2 is currently empty, so every IFNULL should be resolved as the second parameter.

    The main question(s) a.) is IFNULL expensive b.) does it somehow prevent efficient caching of queries?



  • Let's look at this list of overheads:

    • Overhead for a statement
    • Overhead for a row
    • Overhead for a function

    Those are in order starting with the most overhead. That is, the function overhead is insignificant in the big picture; don't worry about it.

    See also COALESCE().

    Caching

    With Engine=InnoDB, the main cache is the "buffer_pool"; innodb_buffer_pool_size should be about 70% of available RAM. All actions on rows happen in the buffer_pool:

    1. Read block(s) from disk (if not already in cache)
    2. Find the row(s) in the block(s)
    3. Fetch (SELECT) or modify row(s)
    4. Eventually write the block back to disk (if modified)

    Since you are asking about SELECT, only steps 1,2,3 are relevant. Furthermore, if you keep Selecting the same row (or nearby rows), step 1 is not repeated.

    Note: This does not mention any impact of function calls.

    Note: On a cold system (nothing in the cache), the first SELECT will be slower due to step 1. After that the same or 'similar' queries will be faster (sometimes 10x faster).

    Comment: ON t1.id = t2.id Do you have two tables with identical PRIMARY KEYS? Why?

    The main way to have good performance in a JOIN is to have suitable indexes. Let's see the real query and SHOW CREATE TABLE for the tables being Joined.

    The BNL mentioned in a comment is an internal "caching" to speed up certain Joins. You have little control over its use. It refers to fetching all the stuff needed from a secondary table and storing it in an in-memory hash. It only applies when Joining. Function calls are not involved (in your example).

    If a function (eg, IFNULL) were used in ON or WHERE, it is likely to prevent the use of an otherwise suitable Index. (cf not "sargable")




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2