Juejin, Backend • 2024-03-29 14:59

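This article is shared from the Huawei Cloud community post "The Relationship Between cgroups, Resource Pools, and Users in GaussDB(DWS)" (《GaussDB(DWS)的cgroup、资源池、用户的关系》), by nullptr_.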

1. Introduction

This article shows how cgroups, resource pools, and users relate to one another in DWS, as a first look at how resources are configured there.

2. Scripts for creating the objects

# Create control groups (-S names the Class, -G names the Workload group)
gs_ssh -c "gs_cgroup -cS ClassN1 -G wn1"
gs_ssh -c "gs_cgroup -cS ClassN1 -G wn2"
gs_ssh -c "gs_cgroup -cS ClassN2 -G wn3"
gs_ssh -c "gs_cgroup -cS ClassG1 -G wg1_1"
gs_ssh -c "gs_cgroup -cS ClassG1 -G wg1_2"
gs_ssh -c "gs_cgroup -cS ClassG2 -G wg2_1"
gs_ssh -c "gs_cgroup -cS ClassG2 -G wg2_2"

# Create resource pools
gsql -d postgres -p 6000 -c  "create resource pool respool_1 with (control_group = 'ClassN1:wn1');"
gsql -d postgres -p 6000 -c  "create resource pool respool_2 with (control_group = 'ClassN1:wn2');"
gsql -d postgres -p 6000 -c  "create resource pool respool_3 with (control_group = 'ClassN2:wn3');"
gsql -d postgres -p 6000 -c  "create resource pool respool_4 with (control_group = 'ClassN2:wn3');"
gsql -d postgres -p 6000 -c  "create resource pool respool_grp_1 with (control_group = 'ClassG1');"
gsql -d postgres -p 6000 -c  "create resource pool respool_g1_job_1 with (control_group = 'ClassG1:wg1_1');"
gsql -d postgres -p 6000 -c  "create resource pool respool_g1_job_2 with (control_group = 'ClassG1:wg1_2');"
gsql -d postgres -p 6000 -c  "create resource pool respool_grp_2 with (control_group = 'ClassG2');"
gsql -d postgres -p 6000 -c  "create resource pool respool_g2_job_1 with (control_group = 'ClassG2:wg2_1');"
gsql -d postgres -p 6000 -c  "create resource pool respool_g2_job_2 with (control_group = 'ClassG2:wg2_2');"

# Create tenants and users
gsql -d postgres -p 6000 -c  "CREATE USER user_1 RESOURCE POOL 'respool_1' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_2 RESOURCE POOL 'respool_2' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_3 RESOURCE POOL 'respool_3' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_4 PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_5 PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_grp_1 RESOURCE POOL 'respool_grp_1' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_g1_job_1 RESOURCE POOL 'respool_g1_job_1' USER GROUP 'user_grp_1' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_g1_job_2 RESOURCE POOL 'respool_g1_job_2' USER GROUP 'user_grp_1' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_grp_2 RESOURCE POOL 'respool_grp_2' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_g2_job_1 RESOURCE POOL 'respool_g2_job_1' USER GROUP 'user_grp_2' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_g2_job_2 RESOURCE POOL 'respool_g2_job_2' USER GROUP 'user_grp_2' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c  "CREATE USER user_grp_3 RESOURCE POOL 'respool_grp_1' PASSWORD 'Gauss_ab1' ;"

3. cgroup

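Resource pools are the core of GaussDB(DWS) resource load management, and before a resource pool can be configured, the control groups (Cgroups) must be set up in the environment.
The Class control group is the top-level control group under which database workloads run. At cluster deployment a default child Class control group, "DefaultClass", is generated automatically. The Medium control group of DefaultClass hosts jobs triggered by the system; its resources cannot be modified, and jobs running in it are not subject to resource management. It is therefore recommended to create new child Class control groups, each with its own Workload control groups, when setting resource ratios.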

3.1 cgroup layout after running the scripts

per910mas@xx:~> gs_cgroup -p

Top Group information is listed:
GID:   0 Type: Top    Percent(%): 1000( 50) Name: Root                  Cores: 0-103
GID:   1 Type: Top    Percent(%):  833( 83) Name: Gaussdb:per910mas     Cores: 0-103
GID:   2 Type: Top    Percent(%):  333( 40) Name: Backend               Cores: 0-103
GID:   3 Type: Top    Percent(%):  499( 60) Name: Class                 Cores: 0-103

Backend Group information is listed:
GID:   4 Type: BAKWD  Name: DefaultBackend   TopGID:   2 Percent(%): 266(80) Cores: 0-103
GID:   5 Type: BAKWD  Name: Vacuum           TopGID:   2 Percent(%):  66(20) Cores: 0-103

Class Group information is listed:
GID:  20 Type: CLASS  Name: DefaultClass     TopGID:   3 Percent(%):  99(20) MaxLevel: 1 RemPCT: 100 Cores: 0-103
GID:  21 Type: CLASS  Name: ClassN1          TopGID:   3 Percent(%):  99(20) MaxLevel: 3 RemPCT:  60 Cores: 0-103
GID:  22 Type: CLASS  Name: ClassN2          TopGID:   3 Percent(%):  99(20) MaxLevel: 2 RemPCT:  80 Cores: 0-103
GID:  23 Type: CLASS  Name: ClassG1          TopGID:   3 Percent(%):  99(20) MaxLevel: 3 RemPCT:  60 Cores: 0-103
GID:  24 Type: CLASS  Name: ClassG2          TopGID:   3 Percent(%):  99(20) MaxLevel: 3 RemPCT:  60 Cores: 0-103

Workload Group information is listed:
GID:  86 Type: DEFWD  Name: wn1:2            ClsGID:  21 Percent(%):  19(20) WDLevel:  2 Cores: 0-103
GID:  87 Type: DEFWD  Name: wn2:3            ClsGID:  21 Percent(%):  19(20) WDLevel:  3 Cores: 0-103
GID:  89 Type: DEFWD  Name: wn3:2            ClsGID:  22 Percent(%):  19(20) WDLevel:  2 Cores: 0-103
GID:  91 Type: DEFWD  Name: wg1_1:2          ClsGID:  23 Percent(%):  19(20) WDLevel:  2 Cores: 0-103
GID:  92 Type: DEFWD  Name: wg1_2:3          ClsGID:  23 Percent(%):  19(20) WDLevel:  3 Cores: 0-103
GID:  94 Type: DEFWD  Name: wg2_1:2          ClsGID:  24 Percent(%):  19(20) WDLevel:  2 Cores: 0-103
GID:  95 Type: DEFWD  Name: wg2_2:3          ClsGID:  24 Percent(%):  19(20) WDLevel:  3 Cores: 0-103

CM Group information is listed:

Timeshare Group information is listed:
GID: 724 Type: TSWD   Name: Low              Rate: 1
GID: 725 Type: TSWD   Name: Medium           Rate: 2
GID: 726 Type: TSWD   Name: High             Rate: 4
GID: 727 Type: TSWD   Name: Rush             Rate: 8
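
The two numbers in each Percent(%) column appear to be related: the value in parentheses is the configured percentage, and the value before it looks like the parent's value scaled by that percentage and truncated. RemPCT likewise looks like 100 minus the percentages already handed to the Class's Workload groups. This is only an observation about the output above, not a documented formula, and it does not reproduce the Gaussdb row exactly (833 against a displayed 83). A minimal Python sketch under that assumption, with hypothetical function names:

```python
def dynamic_value(parent_value: int, percent: int) -> int:
    """Assumed rule: the parent's value scaled by the percentage, truncated."""
    return parent_value * percent // 100


def remaining_percent(workload_percents: list[int]) -> int:
    """Assumed rule: RemPCT is what is left after the Workload groups' shares."""
    return 100 - sum(workload_percents)


gaussdb = 833                               # "Gaussdb:per910mas" row, taken as given
backend = dynamic_value(gaussdb, 40)        # Backend row
class_top = dynamic_value(gaussdb, 60)      # Class row
class_n1 = dynamic_value(class_top, 20)     # ClassN1 row
wn1 = dynamic_value(class_n1, 20)           # wn1:2 row
vacuum = dynamic_value(backend, 20)         # Vacuum row

print(backend, class_top, class_n1, wn1, vacuum)
print(remaining_percent([20, 20]))          # ClassN1 holds wn1 and wn2
print(remaining_percent([20]))              # ClassN2 holds only wn3
```

Under this reading, every Class and Workload row in the listing follows from the Gaussdb row and the percentages in parentheses.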

System resource limits come in two forms, quotas and limits. Quotas are the default.

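  • Quota: a quota is a fairly flexible form of control. For example, wn1:2 has a quota of 20%. Under normal conditions the group may use more than 20%, but when resources are saturated (utilization at 100%) usage is held strictly to the quota.
  • Limit: a limit directly restricts the range of CPU cores that can be used.
  • Quota & limit: the quota ratio is enforced within the limited range of CPU cores.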
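
The quota behaviour described above can be illustrated with a small simulation. This is a simplified model of work-conserving proportional sharing, not GaussDB(DWS)'s or the kernel's actual scheduler; the function name `share_cpu`, the group name `rest`, and the demand figures are assumptions made for the example.

```python
def share_cpu(quotas, demands, capacity=100.0):
    """Split `capacity` percent of CPU among groups.

    Each group is entitled to a share proportional to its quota, but
    capacity that a group does not need is handed to the others.
    """
    alloc = {g: 0.0 for g in quotas}
    active = {g for g in quotas if demands.get(g, 0) > 0}
    remaining = capacity
    while active and remaining > 1e-9:
        total_quota = sum(quotas[g] for g in active)
        # Groups whose outstanding demand fits inside their fair share.
        done = {g for g in active
                if demands[g] - alloc[g] <= remaining * quotas[g] / total_quota}
        if not done:
            # Everyone wants more than their share: split strictly by quota.
            for g in active:
                alloc[g] += remaining * quotas[g] / total_quota
            break
        for g in done:
            need = demands[g] - alloc[g]
            alloc[g] += need
            remaining -= need
        active -= done
    return alloc


quotas = {"wn1": 20, "wn2": 20, "rest": 60}

# Lightly loaded system: wn1 may use far more than its 20% quota.
light = share_cpu(quotas, {"wn1": 100, "wn2": 10, "rest": 30})

# Saturated system: every group is held to its quota.
busy = share_cpu(quotas, {"wn1": 100, "wn2": 100, "rest": 100})

print(light)
print(busy)
```

In the lightly loaded case wn1 absorbs the capacity the other groups leave idle; in the saturated case the allocation collapses to the 20/20/60 split.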

4. Resource pools

4.1 Resource pool layout

postgres=# select oid,* from pg_resource_pool;
    oid     |   respool_name   | mem_percent | cpu_affinity |    control_group    | active_statements | max_dop | memory_limit |  parentid  | io_limits | io_priority |  nodegroup   | is_foreign | short_acc | except_rule | weight
------------+------------------+-------------+--------------+---------------------+-------------------+---------+--------------+------------+-----------+-------------+--------------+------------+-----------+-------------+--------
         10 | default_pool     |           0 |           -1 | DefaultClass:Medium |                -1 |      -1 | default      |          0 |         0 | None        | installation | f          | t         | None        |     -1
 2147585814 | respool_1        |           0 |           -1 | ClassN1:wn1         |                10 |      -1 | default      |          0 |         0 | None        | installation | f          | t         | None        |     -1
 2147585815 | respool_2        |           0 |           -1 | ClassN1:wn2         |                10 |      -1 | default      |          0 |         0 | None        | installation | f          | t         | None        |     -1
 2147585816 | respool_3        |           0 |           -1 | ClassN2:wn3         |                10 |      -1 | default      |          0 |         0 | None        | installation | f          | t         | None        |     -1
 2147585817 | respool_grp_1    |          20 |           -1 | ClassG1             |                10 |      -1 | default      |          0 |         0 | None        | installation | f          | t         | None        |     -1
 2147585818 | respool_g1_job_1 |          20 |           -1 | ClassG1:wg1_1       |                10 |      -1 | default      | 2147585817 |         0 | None        | installation | f          | t         | None        |     -1
 2147585819 | respool_g1_job_2 |          20 |           -1 | ClassG1:wg1_2       |                10 |      -1 | default      | 2147585817 |         0 | None        | installation | f          | t         | None        |     -1
 2147585820 | respool_grp_2    |          20 |           -1 | ClassG2             |                10 |      -1 | default      |          0 |         0 | None        | installation | f          | t         | None        |     -1
 2147585821 | respool_g2_job_1 |          20 |           -1 | ClassG2:wg2_1       |                10 |      -1 | default      | 2147585820 |         0 | None        | installation | f          | t         | None        |     -1
 2147585822 | respool_g2_job_2 |          20 |           -1 | ClassG2:wg2_2       |                10 |      -1 | default      | 2147585820 |         0 | None        | installation | f          | t         | None        |     -1
 2147586195 | respool_4        |           0 |           -1 | ClassN2:wn3         |                10 |      -1 | default      |          0 |         0 | None        | installation | f          | t         | None        |     -1
(11 rows)

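4.1.1 Group resource pool restrictions

A Class control group can back only one group resource pool, so creating a second group resource pool on ClassG1 fails: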

per910mas@xx:~> gsql -d postgres -p 6000 -c  "create resource pool respool_grp_3 with (control_group = 'ClassG1');"
ERROR:  resource pool with control_group ClassG1 has been existed in the two-layer resource pool list
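
The rules visible in the pg_resource_pool listing and in the error above can be summarised in a toy model: a pool bound to a bare Class is a group pool and must be unique for that Class, while a pool bound to Class:Workload takes the group pool of the same Class (if one exists) as its parent. The `PoolCatalog` class below is hypothetical and only mirrors what the output shows; the real catalog may apply further checks.

```python
class PoolCatalog:
    """Toy model of resource pool creation (illustrative only)."""

    def __init__(self):
        # pool name -> (control_group, parent pool name or None)
        self.pools = {}

    def _group_pool_for(self, class_name):
        for name, (control_group, _) in self.pools.items():
            if control_group == class_name:
                return name
        return None

    def create(self, name, control_group):
        class_name, _, workload = control_group.partition(":")
        if not workload:
            # Group pool: at most one per Class control group.
            if self._group_pool_for(class_name) is not None:
                raise ValueError(
                    f"control_group {class_name} already has a group pool")
            parent = None
        else:
            # Business pool: parent is the group pool of the same Class, if any.
            parent = self._group_pool_for(class_name)
        self.pools[name] = (control_group, parent)
        return parent


catalog = PoolCatalog()
catalog.create("respool_3", "ClassN2:wn3")
catalog.create("respool_4", "ClassN2:wn3")      # sharing a Workload group is fine
catalog.create("respool_grp_1", "ClassG1")
print(catalog.create("respool_g1_job_1", "ClassG1:wg1_1"))

try:
    catalog.create("respool_grp_3", "ClassG1")
except ValueError as err:
    print("rejected:", err)
```

This matches the listing: respool_3 and respool_4 coexist on ClassN2:wn3 with parentid 0, while respool_g1_job_1 points at respool_grp_1.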

4.1.2 Business resource pools

A resource pool's memory share, mem_percent, must be computed proportionally level by level through the pool hierarchy.
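
As a worked example, assume that a business pool's mem_percent is a fraction of its parent group pool's share (this interpretation is an assumption based on the sentence above, and the function name is hypothetical). With the values from the pg_resource_pool listing, respool_g1_job_1 would then get 20% of respool_grp_1's 20%:

```python
# pool name -> (mem_percent, parent pool name or None), from the listing above
pools = {
    "respool_grp_1": (20, None),
    "respool_g1_job_1": (20, "respool_grp_1"),
    "respool_g1_job_2": (20, "respool_grp_1"),
}


def effective_mem_percent(pool, pools):
    """Multiply mem_percent along the chain from the pool up to the root."""
    share = 100.0
    while pool is not None:
        mem_percent, parent = pools[pool]
        share = share * mem_percent / 100
        pool = parent
    return share


print(effective_mem_percent("respool_grp_1", pools))     # group pool
print(effective_mem_percent("respool_g1_job_1", pools))  # business pool
```

Under this assumption the group pool is entitled to 20% of the available memory and each of its business pools to 4%.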

4.1.3 Default resource pool

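When resource management is enabled, the system creates a default resource pool, default_pool. Any session or user that is not explicitly associated with a resource pool is associated with default_pool. By default, default_pool is bound to the DefaultClass:Medium control group and places no limits on concurrency or memory. Its parameters can be modified, but jobs associated with default_pool are still subject to the global concurrency limit max_active_statements. When an administrator performs maintenance operations that should not be throttled, run SET session_respool='root'; before the SQL to switch the resource pool to the maintenance queue; the jobs then run without resource control.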

5. Users

5.1 User layout

postgres=# select * from pg_user;
    usename    |  usesysid  | usecreatedb | usesuper | usecatupd | userepl |  passwd  | valbegin | valuntil |     respool      |   parent   | spacelimit | useconfig | nodegroup | tempspacelimit | spillspacelimit
---------------+------------+-------------+----------+-----------+---------+----------+----------+----------+------------------+------------+------------+-----------+-----------+----------------+-----------------
 per910mas     |         10 | t           | t        | t         | t       | ******** |          |          | default_pool     |          0 |            |           |           |                |
 u1            | 2147558961 | f           | f        | f         | f       | ******** |          |          | default_pool     |          0 |            |           |           |                |
 user_1        | 2147585823 | f           | f        | f         | f       | ******** |          |          | respool_1        |          0 |            |           |           |                |
 user_2        | 2147585827 | f           | f        | f         | f       | ******** |          |          | respool_2        |          0 |            |           |           |                |
 user_3        | 2147585831 | f           | f        | f         | f       | ******** |          |          | respool_3        |          0 |            |           |           |                |
 user_4        | 2147585835 | f           | f        | f         | f       | ******** |          |          | default_pool     |          0 |            |           |           |                |
 user_5        | 2147585839 | f           | f        | f         | f       | ******** |          |          | default_pool     |          0 |            |           |           |                |
 user_grp_1    | 2147585843 | f           | f        | f         | f       | ******** |          |          | respool_grp_1    |          0 |            |           |           |                |
 user_g1_job_1 | 2147585847 | f           | f        | f         | f       | ******** |          |          | respool_g1_job_1 | 2147585843 |            |           |           |                |
 user_g1_job_2 | 2147585851 | f           | f        | f         | f       | ******** |          |          | respool_g1_job_2 | 2147585843 |            |           |           |                |
 user_grp_2    | 2147585855 | f           | f        | f         | f       | ******** |          |          | respool_grp_2    |          0 |            |           |           |                |
 user_g2_job_1 | 2147585859 | f           | f        | f         | f       | ******** |          |          | respool_g2_job_1 | 2147585855 |            |           |           |                |
 user_g2_job_2 | 2147585863 | f           | f        | f         | f       | ******** |          |          | respool_g2_job_2 | 2147585855 |            |           |           |                |
 user_grp_3    | 2147586254 | f           | f        | f         | f       | ******** |          |          | respool_grp_1    |          0 |            |           |           |                |
(14 rows)

5.1.2 Multi-tenant scenario

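  • Business users share the resources of their group user, and a group user shares the resources of the resource pool it belongs to.
  • A business user must be attached to a group user, and the user hierarchy must correspond one-to-one with the resource pool hierarchy.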
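
The one-to-one correspondence between the two hierarchies can be expressed as a simple check. The sketch below uses a subset of the rows from pg_user and pg_resource_pool shown earlier; the function `hierarchy_matches` is hypothetical and is not how the database itself validates CREATE USER.

```python
# pool name -> parent pool name (None for top-level pools)
pool_parent = {
    "respool_1": None,
    "respool_grp_1": None,
    "respool_g1_job_1": "respool_grp_1",
    "respool_grp_2": None,
    "respool_g2_job_1": "respool_grp_2",
}

# user name -> (resource pool, parent group user or None)
users = {
    "user_1": ("respool_1", None),
    "user_grp_1": ("respool_grp_1", None),
    "user_g1_job_1": ("respool_g1_job_1", "user_grp_1"),
    "user_grp_2": ("respool_grp_2", None),
    "user_g2_job_1": ("respool_g2_job_1", "user_grp_2"),
}


def hierarchy_matches(user, users, pool_parent):
    """True if the user's parent sits on the parent of the user's pool."""
    pool, parent_user = users[user]
    parent_pool = pool_parent[pool]
    if parent_user is None:
        # A user without a group user must be on a top-level pool.
        return parent_pool is None
    return parent_pool is not None and users[parent_user][0] == parent_pool


for name in users:
    print(name, hierarchy_matches(name, users, pool_parent))

# A business user attached to the wrong group user breaks the correspondence.
broken = dict(users, user_g1_job_1=("respool_g1_job_1", "user_grp_2"))
print(hierarchy_matches("user_g1_job_1", broken, pool_parent))
```

Every user created by the scripts in section 2 passes this check, while the deliberately mismatched entry does not.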
