Why is the query planner only using one column of two in the index?



  • I have a table on MySQL 5.7 with the following definition and indexes:

    +-------------------------+---------------+------+-----+---------+----------------+
    | Field                   | Type          | Null | Key | Default | Extra          |
    +-------------------------+---------------+------+-----+---------+----------------+
    | id                      | int(11)       | NO   | PRI | NULL    | auto_increment |
    | col1                    | int(11)       | YES  | MUL | NULL    |                |
    | created_at              | datetime      | YES  |     | NULL    |                |
    | archive                 | int(11)       | YES  | MUL | 0       |                |
    ...
    +-------------------------+---------------+------+-----+---------+----------------+
    

    +-------------+------------+------------------------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
    +-------------+------------+------------------------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | my_table | 0 | PRIMARY | 1 | id | A | 1086069 | NULL | NULL | | BTREE | | |
    | my_table | 1 | indx1 | 1 | archive | A | 1 | NULL | NULL | YES | BTREE | | |
    | my_table | 1 | indx2 | 1 | archive | A | 1 | NULL | NULL | YES | BTREE | | |
    | my_table | 1 | indx2 | 2 | created_at | A | 1086069 | NULL | NULL | YES | BTREE | | |
    +-------------+------------+------------------------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

    When I run EXPLAIN SELECT * FROM my_table r WHERE r.col1 IN(1,4,6,10) AND (archive = 0) ORDER BY r.created_at DESC LIMIT 0,1001 I get:

    +----+-------------+-------+------------+------+----------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------+---------+-------+--------+----------+-------------+
    | id | select_type | table | partitions | type | possible_keys                                                                                                                                      | key                                | key_len | ref   | rows   | filtered | Extra       |
    +----+-------------+-------+------------+------+----------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------+---------+-------+--------+----------+-------------+
    |  1 | SIMPLE      | r     | NULL       | ref  | ...                                                                                                                                                | indx2                              | 5       | const | 158522 |    50.96 | Using where |
    +----+-------------+-------+------------+------+----------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------+---------+-------+--------+----------+-------------+
    

    When I run EXPLAIN SELECT * FROM my_table r USE INDEX(indx1) WHERE r.col1 IN(1,4,6,10) AND (archive = 0) ORDER BY r.created_at DESC LIMIT 0,1001 I get:

    +----+-------------+-------+------------+------+-------------------------+-------------------------+---------+-------+--------+----------+----------------------------------------------------+
    | id | select_type | table | partitions | type | possible_keys           | key                     | key_len | ref   | rows   | filtered | Extra                                              |
    +----+-------------+-------+------------+------+-------------------------+-------------------------+---------+-------+--------+----------+----------------------------------------------------+
    |  1 | SIMPLE      | r     | NULL       | ref  | ...                     | indx1                   | 5       | const | 158522 |    40.00 | Using index condition; Using where; Using filesort |
    +----+-------------+-------+------------+------+-------------------------+-------------------------+---------+-------+--------+----------+----------------------------------------------------+
    

    It seems that in the first case, when not using the index hint, the query planner is not using created_at in the index because of the DESC clause, which MySQL 5.7 doesn't support. But why is it that in the extra column I see Using where when using indx2 and Using index condition; Using where; Using filesort when using indx1? The query also runs twice as fast. Why is that?

    Edit: Adding SHOW CREATE TABLE output.

    CREATE TABLE `my_table` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `col1` int(11) DEFAULT NULL,
      `created_at` datetime DEFAULT NULL,
      ...
      `archive` int(11) DEFAULT '0',
      PRIMARY KEY (`id`),
      KEY `indx1` (`archive`),
      KEY `indx2` (`archive`,`created_at`)
    ) ENGINE=InnoDB AUTO_INCREMENT=2298216 DEFAULT CHARSET=utf8
    


    • Please provide SHOW CREATE TABLE; it is more descriptive than DESCRIBE.
    • Add INDEX(archive, col1), it may help performance.
    • key_len=5 for a 4-byte INT means that it is NULLable; nothing else. (key_len provides no clue of whether any columns are used for ORDER BY.)
    • Which rows really need to be NULLable? (Change to NOT NULL where appropriate.)
    • "when using indx2 the DB uses only the first column" -- Maybe, maybe not. It does use only the first column for the WHERE, but notice that it does not say filesort, so it used the second column to avoid the sort.
    • If you change the IN list, the query may run faster or slower, depending on how fast it finds 1001 relevant rows.
    • The "filesort" is fast because it is sorting only 1001 rows, not 158K rows. And it will probably do it in RAM. ("FILEsort" is a misnomer.)
    • 5.7 does not have DESC indexes; all BTrees are stored ASCENDING. But it can, in some cases, use the INDEX even when you have DESC in the ORDER BY. (Including all the cases in your Question and my Answer.)



Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2