掘金 后端 ( ) • 2024-06-10 23:37

highlight: a11y-dark

傻笑的小埋


前言

使用MySQL时,明明数据没有很多,明明条件列添加了索引,但是为啥查询就越来越慢了呢。通常这就是出现了索引失效,为了让大家更能理解索引失效的场景,本文将基于一个简单的例子,对可能出现索引失效的场景进行分析,并给出相应解决方案,也许这才是面试官期望听到的索引失效

MySQL版本:8.4.0

正文

一. 表数据准备

创建~老朋友~student表。

CREATE TABLE student (
  id INT PRIMARY KEY auto_increment,
  stu_name VARCHAR ( 255 ) NOT NULL,
  stu_age INT NOT NULL,
  stu_phone VARCHAR ( 20 ) DEFAULT NULL ,
  stu_sex VARCHAR( 20 ) NOT NULL,
  stu_des VARCHAR( 255 ) NOT NULL
);

插入若干条数据。

INSERT INTO student ( stu_name, stu_age, stu_phone, stu_sex, stu_des ) VALUES ( "Bill", 20, "18888888888", "male", "Able to endure hardship" );
INSERT INTO student ( stu_name, stu_age, stu_phone, stu_sex, stu_des ) VALUES ( "Bob", 25, "17777777777", "male", "There will be endless bitterness to eat" );
INSERT INTO student ( stu_name, stu_age, stu_phone, stu_sex, stu_des ) VALUES ( "Mary", 21, "15555555555", "female", "Able to endure hardship" );
INSERT INTO student ( stu_name, stu_age, stu_phone, stu_sex, stu_des ) VALUES ( "Bill", 22, "13333333333", "male", "There will be endless bitterness to eat" );

二. 函数计算导致索引失效

在一开始student表除了主键索引外没有其它任何索引,所以先执行下面语句为stu_name列添加一个普通索引,如下所示。

ALTER TABLE student ADD INDEX name_index ( stu_name );

此时我们再执行如下查询语句。

SELECT
  stu_name,
  stu_age,
  stu_phone 
FROM
  student 
WHERE
  SUBSTR( stu_name, 1, 1 ) = "B";

解释一下该条SQL语句,解释结果如下所示。

+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | student | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |   100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+

明明在stu_name列添加了索引,但并没有使用索引,这就是因为在索引列上进行了函数计算,导致MySQL无法识别索引列,最终就不会去使用索引。

要解决这个问题也很简单,我们可以为stu_name列添加如下函数索引。

ALTER TABLE student ADD INDEX name_func_index ( (SUBSTR( stu_name, 1, 1 )) );

重新解释一下查询的SQL语句,结果如下所示。

+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys   | key             | key_len | ref   | rows | filtered | Extra |
+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | student | NULL       | ref  | name_func_index | name_func_index | 6       | const |    3 |   100.00 | NULL  |
+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+

在生成的执行计划中,显示MySQL决定使用name_func_index索引。

三. 隐式转换导致索引失效

有如下SQL语句。

SELECT
  stu_name,
  stu_age,
  stu_phone 
FROM
  student 
WHERE
  stu_name = 555;

注意到stu_name列类型是varchar,但是我们在进行条件查询时,将stu_name与数字进行了比较,此时解释一下上述SQL语句,结果如下。

+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | student | NULL       | ALL  | name_index    | NULL | NULL    | NULL |    4 |    25.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+

明明stu_name列是有索引的,但最终的执行计划中,并没有选择使用索引。这个现象就是因为发生了隐式转换,即索引列与查询条件类型不一致时MySQL自动为我们做了类型转换,这会导致使用索引的效率下降,从而最终在执行计划中可能会不使用索引。

要解决也很简单,就是查询条件类型与索引列保持一致即可。我们现在将查询SQL修改如下。

SELECT
  stu_name,
  stu_age,
  stu_phone 
FROM
  student 
WHERE
  stu_name = "555";

解释上述查询语句,结果如下。

+----+-------------+---------+------------+------+---------------+------------+---------+-------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key        | key_len | ref   | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | student | NULL       | ref  | name_index    | name_index | 767     | const |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+------------+---------+-------+------+----------+-------+

此时因为没有发生隐式转换,所以执行计划使用了索引。

四. 最左匹配原则造成索引失效

在使用组合索引时,如果查询条件不满足从索引的最左边开始且不跳过索引中的列时,索引会失效,这就是我们背得较多的一种索引失效场景。

但需要注意,如果是MySQL8.x版本,因为MySQL引入了索引跳跃扫描这个功能,所以就算不满足最左匹配原则,索引也不一定会失效,具体还是要看MySQL优化器的最终选择。

现在为student表的stu_phonestu_sexstu_des列添加组合索引,语句如下。

ALTER TABLE student ADD INDEX phone_sex_des_index ( stu_phone, stu_sex, stu_des );

现在执行下面一条查询语句,如下所示。

SELECT
  stu_name,
  stu_age,
  stu_phone 
FROM
  student 
WHERE
  stu_des = "Able to endure hardship";

不用关心执行结果,下面解释一下这条SQL语句,结果如下。

+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | student | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |    25.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+

可见没有使用索引,这里有两点需要关注。

  1. 不满足覆盖索引。查询的字段里面有stu_namestu_age,这两个字段不在组合索引phone_sex_des_index中,因此如果使用组合索引,需要回表;
  2. 不满足最左匹配原则。查询条件里面仅有stu_des,对于组合索引phone_sex_des_index来说,违反了最左匹配原则

因为MySQL8.x版本引入了索引跳跃扫描,该机制让在组合索引场景下,就算不满足最左匹配原则,也可以使用组合索引,但是该机制相较于符合最左匹配原则时存在一定的性能开销,所以MySQL优化器对如下两种情况的开销做了判断。

  1. 强行使用组合索引索引跳跃扫描开销 + 回表开销
  2. 全表扫描全表扫描开销

最终判断出来是情况2开销小于情况1,故进行了全表扫描而没有使用组合索引。

现在再看一下下面这条语句。

SELECT
  stu_name,
  stu_age,
  stu_phone 
FROM
  student 
WHERE
  stu_phone = "18888888888"

解释信息如下。

+----+-------------+---------+------------+------+---------------------+---------------------+---------+-------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys       | key                 | key_len | ref   | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------------+---------------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | student | NULL       | ref  | phone_sex_des_index | phone_sex_des_index | 63      | const |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------------+---------------------+---------+-------+------+----------+-------+

结果显示是使用了组合索引。这是因为查询条件满足组合索引phone_sex_des_index的最左匹配原则,所以使用组合索引时没有了索引跳跃扫描的开销,最终优化器判断回表的性能开销小于全表扫描,故选择使用了组合索引。

再看下面这条SQL语句。

SELECT
  stu_phone,
  stu_sex,
  stu_des
FROM
  student 
WHERE
  stu_des = "Able to endure hardship";

注意到是满足覆盖索引的情况)解释上述SQL语句,结果如下。

+----+-------------+---------+------------+-------+---------------------+---------------------+---------+------+------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys       | key                 | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+---------+------------+-------+---------------------+---------------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | student | NULL       | index | phone_sex_des_index | phone_sex_des_index | 892     | NULL |    4 |    25.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------------+---------------------+---------+------+------+----------+--------------------------+

满足覆盖索引的前提下,因为没了回表的开销,所以优化器判断了不满足最左匹配原则时使用索引跳跃扫描的开销与全表扫描的开销,最终判定前者开销小于后者,故使用了组合索引,这也是我们需要关注的点,即不满足最左匹配原则时组合索引也是可能被使用的(限于MySQL8.x版本)。

最后看下面这条SQL语句。

SELECT
  stu_phone,
  stu_sex,
  stu_des
FROM
  student 
WHERE
  stu_phone = "18888888888"
AND
  stu_sex = "male";

这既符合覆盖索引,又符合最左匹配原则,此时肯定是会使用组合索引phone_sex_des_index的,解释信息如下。

+----+-------------+---------+------------+------+---------------------+---------------------+---------+-------------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys       | key                 | key_len | ref         | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------------+---------------------+---------+-------------+------+----------+-------------+
|  1 | SIMPLE      | student | NULL       | ref  | phone_sex_des_index | phone_sex_des_index | 125     | const,const |    1 |   100.00 | Using index |
+----+-------------+---------+------------+------+---------------------+---------------------+---------+-------------+------+----------+-------------+

五. 模糊匹配导致索引失效

现在看一下如下的这条模糊查询SQL语句。

SELECT 
  stu_name,
  stu_age,
  stu_phone
FROM
  student
WHERE
  stu_name LIKE "%Bi";

解释上述语句,信息如下所示。

+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | student | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |    25.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+

因为模糊匹配的条件 %Bi的通配符在最左边,MySQL无法准确使用索引,故此时索引会失效。现在看一下如下这条SQL语句。

SELECT 
  stu_name,
  stu_age,
  stu_phone
FROM
  student
WHERE
  stu_name LIKE "Bi%";

解释上述SQL,结果如下所示。

+----+-------------+---------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+
| id | select_type | table   | partitions | type  | possible_keys | key        | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+---------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | student | NULL       | range | name_index    | name_index | 767     | NULL |    2 |   100.00 | Using index condition |
+----+-------------+---------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+

通配符不在最左边时是会使用索引的。

五. OR条件连接导致索引失效

查询时使用OR进行条件连接,如果OR语句前后没有同时使用索引,那么索引会失效。

首先看一下如下这条SQL

SELECT 
  stu_name,
  stu_age,
  stu_phone
FROM
  student
WHERE
  stu_name = "Bill"
OR 
  stu_age = 22

解释上述SQL语句,结果如下。

+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | student | NULL       | ALL  | name_index    | NULL | NULL    | NULL |    4 |    50.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+

因为只有stu_name有索引,此时不满足OR语句前后都使用索引,最终索引失效。

下面先为stu_age添加索引。

ALTER TABLE student ADD INDEX age_index ( stu_age );

然后重新解释上述SQL语句,结果如下。

+----+-------------+---------+------------+-------------+----------------------+----------------------+---------+------+------+----------+------------------------------------------------+
| id | select_type | table   | partitions | type        | possible_keys        | key                  | key_len | ref  | rows | filtered | Extra                                          |
+----+-------------+---------+------------+-------------+----------------------+----------------------+---------+------+------+----------+------------------------------------------------+
|  1 | SIMPLE      | student | NULL       | index_merge | name_index,age_index | name_index,age_index | 767,4   | NULL |    3 |   100.00 | Using union(name_index,age_index); Using where |
+----+-------------+---------+------------+-------------+----------------------+----------------------+---------+------+------+----------+------------------------------------------------+

此时OR语句前后都有索引,因此索引没有失效。


总结不易,如果本文对你有帮助,烦请点赞,收藏加关注,谢谢帅气漂亮的你。

可爱表情