掘金 后端 ( ) • 2024-05-05 16:58

image.png

同事找我优化SQL,同一条SQL语句LIKE过滤条件不同,执行时间差别很多,废话不说安排一下。

LIKE过滤条件执行快的SQL和执行计划:

EXPLAIN ANALYZE
SELECT case_id,
       cate_id,
       cate_name,
       view_url,
       proc_ins_id,
       create_user_id,
       current_user_id,
       title,
       emergency,
       dept_id,
       handler_id,
       handle_date,
       create_date,
       end_date,
       proc_ins_state,
       hint
FROM AAAAAA H
WHERE handler_id = '8feae683-f741-45de-b430-7db24b4f7660'
  AND title like '%通报%'
ORDER BY handle_date desc nulls last, emergency DESC;


Sort  (cost=83260.22..83263.92 rows=1479 width=463) (actual time=203.872..205.642 rows=1750 loops=1)
  Sort Key: H.handle_date DESC NULLS LAST, P.emer_level DESC
  Sort Method: quicksort  Memory: 968kB
  ->  Gather  (cost=12513.91..83182.35 rows=1479 width=463) (actual time=146.637..204.384 rows=1750 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Hash Join  (cost=11513.91..69536.90 rows=616 width=452) (actual time=140.984..182.745 rows=583 loops=3)
              Hash Cond: (P.cate_id = C.id)
              ->  Parallel Hash Join  (cost=11506.98..69528.32 rows=616 width=408) (actual time=140.808..182.364 rows=583 loops=3)
                    Hash Cond: (H.proc_ins_id = P.id)
                    ->  Parallel Bitmap Heap Scan on ccccccccccc H  (cost=443.78..58432.47 rows=12438 width=156) (actual time=6.527..46.555 rows=10375 loops=3)
                          Recheck Cond: (handler_id = '8feae683-f741-45de-b430-7db24b4f7660'::bpcharbyte)
                          Filter: ((state = 'A'::bpcharbyte) AND (category = 'H'::bpcharbyte))
                          Heap Blocks: exact=2799
                          ->  Bitmap Index Scan on PROC_INS_HANDLERS_HANDLER_ID  (cost=0.00..436.31 rows=29851 width=0) (actual time=7.405..7.405 rows=31125 loops=1)
                                Index Cond: (handler_id = '8feae683-f741-45de-b430-7db24b4f7660'::bpcharbyte)
                    ->  Parallel Hash  (cost=10935.95..10935.95 rows=10180 width=289) (actual time=132.111..132.112 rows=4266 loops=3)
                          Buckets: 32768  Batches: 1  Memory Usage: 4480kB
                          ->  Parallel Seq Scan on ddddddddddddd P  (cost=0.00..10935.95 rows=10180 width=289) (actual time=0.049..126.563 rows=4266 loops=3)
                                Filter: ((title)::text ~~ '%通报%'::text)
                                Rows Removed by Filter: 160050
              ->  Hash  (cost=4.19..4.19 rows=219 width=81) (actual time=0.079..0.080 rows=221 loops=3)
                    Buckets: 1024  Batches: 1  Memory Usage: 33kB
                    ->  Seq Scan on eeeeeeeeeeeee C  (cost=0.00..4.19 rows=219 width=81) (actual time=0.016..0.040 rows=221 loops=3)
        SubPlan 1
          ->  Index Scan using wf_act_instances_PKEY on wf_act_instances  (cost=0.43..8.45 rows=1 width=14) (actual time=0.011..0.012 rows=0 loops=1750)
                Index Cond: (id = H.act_ins_id)
Planning Time: 1.185 ms
Execution Time: 205.909 ms

LIKE过滤条件执行慢的SQL和执行计划:

EXPLAIN ANALYZE
SELECT case_id,
       cate_id,
       cate_name,
       view_url,
       proc_ins_id,
       create_user_id,
       current_user_id,
       title,
       emergency,
       dept_id,
       handler_id,
       handle_date,
       create_date,
       end_date,
       proc_ins_state,
       hint
FROM AAAAAA H
WHERE handler_id = '8feae683-f741-45de-b430-7db24b4f7660'
  AND title like '%表扬%'
ORDER BY handle_date desc nulls last, emergency DESC;

Sort  (cost=21036.84..21036.85 rows=3 width=463) (actual time=2023.884..2023.948 rows=139 loops=1)
  Sort Key: H.handle_date DESC NULLS LAST, P.emer_level DESC
  Sort Method: quicksort  Memory: 97kB
  ->  Gather  (cost=1449.85..21036.82 rows=3 width=463) (actual time=9.150..2023.615 rows=139 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Nested Loop  (cost=449.85..20011.17 rows=1 width=452) (actual time=25.891..2010.559 rows=46 loops=3)
              ->  Nested Loop  (cost=449.71..20010.66 rows=1 width=408) (actual time=25.841..2010.065 rows=46 loops=3)
                    ->  Parallel Seq Scan on ddddddddddddd P  (cost=0.00..10935.95 rows=20 width=289) (actual time=0.616..138.214 rows=185 loops=3)
                          Filter: ((title)::text ~~ '%表扬%'::text)
                          Rows Removed by Filter: 164131
                    ->  Bitmap Heap Scan on ccccccccccc H  (cost=449.71..453.73 rows=1 width=156) (actual time=10.097..10.097 rows=0 loops=556)
                          Recheck Cond: ((proc_ins_id = P.id) AND (handler_id = '8feae683-f741-45de-b430-7db24b4f7660'::bpcharbyte))
                          Filter: ((state = 'A'::bpcharbyte) AND (category = 'H'::bpcharbyte))
                          Heap Blocks: exact=29
                          ->  BitmapAnd  (cost=449.71..449.71 rows=1 width=0) (actual time=10.093..10.093 rows=0 loops=556)
                                ->  Bitmap Index Scan on PUBLIC_ccccccccccc_INDEX_7  (cost=0.00..5.68 rows=166 width=0) (actual time=0.042..0.042 rows=77 loops=556)
                                      Index Cond: (proc_ins_id = P.id)
                                ->  Bitmap Index Scan on PROC_INS_HANDLERS_HANDLER_ID  (cost=0.00..436.31 rows=29851 width=0) (actual time=9.495..9.495 rows=31125 loops=556)
                                      Index Cond: (handler_id = '8feae683-f741-45de-b430-7db24b4f7660'::bpcharbyte)
              ->  Index Scan using eeeeeeeeeeeee_PKEY on eeeeeeeeeeeee C  (cost=0.14..0.50 rows=1 width=81) (actual time=0.007..0.007 rows=1 loops=139)
                    Index Cond: (id = P.cate_id)
        SubPlan 1
          ->  Index Scan using wf_act_instances_PKEY on wf_act_instances  (cost=0.43..8.45 rows=1 width=14) (actual time=0.014..0.014 rows=1 loops=139)
                Index Cond: (id = H.act_ins_id)
Planning Time: 1.753 ms
Execution Time: 2024.430 ms

AAAAAA 表是个视图,定义如下:

-----------------+--------------------------------+----------+--------+------+----------+------
 case_id         | character(36 byte)             |          |        |      | extended | 
 cate_id         | character(36 byte)             |          |        |      | extended | 
 CATE_NAME       | character varying(100 char)    |          |        |      | extended | 
 VIEW_URL        | character varying(1024 char)   |          |        |      | extended | 
 PROC_INS_ID     | character(36 byte)             |          |        |      | extended | 
 create_user_id  | character(36 byte)             |          |        |      | extended | 
 current_user_id | character(36 byte)             |          |        |      | extended | 
 title           | character varying(600 char)    |          |        |      | extended | 
 EMERGENCY       | character(1 byte)              |          |        |      | extended | 
 dept_id         | character(36 byte)             |          |        |      | extended | 
 handler_id      | character(36 byte)             |          |        |      | extended | 
 handle_date     | timestamp(3) without time zone |          |        |      | plain    | 
 create_date     | timestamp(3) without time zone |          |        |      | plain    | 
 end_date        | timestamp(3) without time zone |          |        |      | plain    | 
 PROC_INS_STATE  | character(1 byte)              |          |        |      | extended | 
 HINT            | character varying(60 byte)     |          |        |      | extended |

SELECT P.case_id,
    P.cate_id,
    C.name AS CATE_NAME,
    C.url AS VIEW_URL,
    P.id AS PROC_INS_ID,
    P.create_user_id,
    P.current_user_id,
    P.title,
    P.emer_level AS EMERGENCY,
    H.dept_id,
    H.handler_id,
    H.handle_date,
    P.create_date,
    P.end_date,
    P.state AS PROC_INS_STATE,
    ( SELECT wf_act_instances.hint
           FROM wf_act_instances
          WHERE wf_act_instances.id = H.act_ins_id) AS HINT
   FROM ccccccccccc H,
    ddddddddddddd P,
    eeeeeeeeeeeee C
  WHERE H.proc_ins_id = P.id AND H.state = 'A'::bpchar::bpcharbyte AND H.category = 'H'::bpchar::bpcharbyte AND C.id = P.cate_id;

1、可以看到,两条SQL是除了LIKE模糊查询的过滤条件不一样,其他的写法是完全一致,

2、where title like '%通报%' 执行时间需要205ms,返回数据 1750 行,

3、where title like '%表扬%' 执行时间需要2024ms,返回数据 139 行,

4、正常情况下应该是谓词过滤条件为 '%表扬%' 的SQL语句执行速度更快才是,毕竟返回的数据更少,事实上这个过滤条件比前者慢了10倍。

前者执行计划:

image.png

后者执行计划:

image.png

通过对比发现后者SQL语句的表关联条件 proc_ins_id = P.id 竟然又 Recheck Cond 一次(回表)。

而前者的表关联条件是没有进行回表的 Hash Cond: (H.proc_ins_id = P.id),确定找到慢的地方。

增加联合索引优化:

create index idx_ccccccccccc_1_2 on ccccccccccc(proc_ins_id,handler_id);

优化后执行SQL语句:

优化后执行SQL语句:

EXPLAIN ANALYZE
SELECT case_id,
       cate_id,
       cate_name,
       view_url,
       proc_ins_id,
       create_user_id,
       current_user_id,
       title,
       emergency,
       dept_id,
       handler_id,
       handle_date,
       create_date,
       end_date,
       proc_ins_state,
       hint
FROM AAAAAA H
WHERE handler_id = '8feae683-f741-45de-b430-7db24b4f7660'
  AND title like '%表扬%'
ORDER BY handle_date desc nulls last, emergency DESC
                                                                                   QUERY PLAN                                                                                    
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=12129.97..12129.97 rows=3 width=464) (actual time=159.627..162.448 rows=139 loops=1)
   Sort Key: H.handle_date DESC NULLS LAST, P.emer_level DESC
   Sort Method: quicksort  Memory: 97kB
   ->  Gather  (cost=1000.58..12129.94 rows=3 width=464) (actual time=1.069..162.167 rows=139 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop  (cost=0.58..11104.29 rows=1 width=453) (actual time=1.615..148.219 rows=46 loops=3)
               ->  Nested Loop  (cost=0.43..11103.79 rows=1 width=409) (actual time=1.585..148.059 rows=46 loops=3)
                     ->  Parallel Seq Scan on ddddddddddddd P  (cost=0.00..10934.44 rows=20 width=290) (actual time=1.197..145.137 rows=185 loops=3)
                           Filter: ((title)::text ~~ '%表扬%'::text)
                           Rows Removed by Filter: 164131
                     ->  Index Scan using idx_ccccccccccc_1_2 on ccccccccccc H  (cost=0.43..8.46 rows=1 width=156) (actual time=0.015..0.015 rows=0 loops=556)
                           Index Cond: ((proc_ins_id = P.id) AND (handler_id = '8feae683-f741-45de-b430-7db24b4f7660'::bpcharbyte))
                           Filter: ((state = 'A'::bpcharbyte) AND (category = 'H'::bpcharbyte))
               ->  Index Scan using eeeeeeeeeeeee_PKEY on eeeeeeeeeeeee C  (cost=0.14..0.50 rows=1 width=81) (actual time=0.003..0.003 rows=1 loops=139)
                     Index Cond: (id = P.cate_id)
         SubPlan 1
           ->  Index Scan using wf_act_instances_PKEY on wf_act_instances  (cost=0.43..8.45 rows=1 width=14) (actual time=0.013..0.013 rows=1 loops=139)
                 Index Cond: (id = H.act_ins_id)
 Planning Time: 1.443 ms
 Execution Time: 162.906 ms
(21 行记录)


EXPLAIN ANALYZE
SELECT case_id,
       cate_id,
       cate_name,
       view_url,
       proc_ins_id,
       create_user_id,
       current_user_id,
       title,
       emergency,
       dept_id,
       handler_id,
       handle_date,
       create_date,
       end_date,
       proc_ins_state,
       hint
FROM AAAAAA H
WHERE handler_id = '8feae683-f741-45de-b430-7db24b4f7660'
  AND title like '%通报%'
ORDER BY handle_date desc nulls last, emergency DESC;

                                                                                    QUERY PLAN                                                                                     
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=16733.04..16733.23 rows=73 width=464) (actual time=246.592..252.130 rows=1750 loops=1)
   Sort Key: H.handle_date DESC NULLS LAST, P.emer_level DESC
   Sort Method: quicksort  Memory: 968kB
   ->  Gather  (cost=1000.58..16730.78 rows=73 width=464) (actual time=0.596..250.878 rows=1750 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop  (cost=0.58..15106.63 rows=30 width=453) (actual time=0.435..228.229 rows=583 loops=3)
               ->  Nested Loop  (cost=0.43..15101.36 rows=30 width=409) (actual time=0.403..224.391 rows=583 loops=3)
                     ->  Parallel Seq Scan on ddddddddddddd P  (cost=0.00..10934.44 rows=499 width=290) (actual time=0.062..135.724 rows=4266 loops=3)
                           Filter: ((title)::text ~~ '%通报%'::text)
                           Rows Removed by Filter: 160050
                     ->  Index Scan using idx_ccccccccccc_1_2 on ccccccccccc H  (cost=0.43..8.34 rows=1 width=156) (actual time=0.021..0.021 rows=0 loops=12797)
                           Index Cond: ((proc_ins_id = P.id) AND (handler_id = '8feae683-f741-45de-b430-7db24b4f7660'::bpcharbyte))
                           Filter: ((state = 'A'::bpcharbyte) AND (category = 'H'::bpcharbyte))
               ->  Index Scan using eeeeeeeeeeeee_PKEY on eeeeeeeeeeeee C  (cost=0.14..0.18 rows=1 width=81) (actual time=0.006..0.006 rows=1 loops=1750)
                     Index Cond: (id = P.cate_id)
         SubPlan 1
           ->  Index Scan using wf_act_instances_PKEY on wf_act_instances  (cost=0.43..8.45 rows=1 width=14) (actual time=0.015..0.015 rows=0 loops=1750)
                 Index Cond: (id = H.act_ins_id)
 Planning Time: 1.551 ms
 Execution Time: 252.611 ms
(21 行记录)

可以看到,两个谓词过滤条件不同的SQL语句执行时间已经相差无几,在正式环境上创建索引后SQL运行速度有明显提升,本案例已经优化完毕。

image.png