Juejin Backend • 2024-03-28 15:37

Dynamic Scaling in k8s Based on Business QPS: HPA in Practice

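In modern cloud-native applications, the ability to adjust resources dynamically is essential. Kubernetes provides Horizontal Pod Autoscaling (HPA), which automatically increases or decreases the number of Pod replicas according to application load, giving efficient resource usage and high availability.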

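An introduction to Horizontal Pod Autoscaling (HPA) in Kubernetes

Horizontal autoscaling is a core Kubernetes feature. It adjusts the number of Pods automatically as application load changes, so that traffic swings are absorbed without wasting resources or degrading performance. The HPA triggers scaling from predefined metrics such as CPU utilization and memory utilization.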

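How HPA works

The HPA controller periodically reads the configured metric (for example CPU utilization) and compares it with the user-defined target. When the observed value is above the target it adds Pods, and when it is below the target it removes them, sizing the replica count roughly in proportion to the ratio between the observed and target values. The process is fully automatic and needs no manual intervention, which greatly simplifies resource management.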
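
The scaling rule can be sketched in a few lines. This is a simplified illustration of the documented formula desiredReplicas = ceil(currentReplicas × currentValue / targetValue); the function name `desired_replicas` is made up for this sketch, and the real controller also applies a tolerance band, stabilization windows, and special handling for unready Pods, none of which are modeled here.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Core HPA rule: scale the replica count by current/target, rounded up."""
    raw = math.ceil(current_replicas * current_value / target_value)
    # The result is always kept inside [minReplicas, maxReplicas].
    return max(min_replicas, min(max_replicas, raw))


# 2 Pods averaging 90% CPU against an 80% target -> ceil(2 * 90 / 80) = 3
print(desired_replicas(2, 90, 80, min_replicas=2, max_replicas=4))
# 4 Pods averaging 20% CPU -> ceil(4 * 20 / 80) = 1, raised to minReplicas = 2
print(desired_replicas(4, 20, 80, min_replicas=2, max_replicas=4))
```

The clamp is why minReplicas and maxReplicas matter as much as the target: they bound whatever the ratio suggests.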

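Advantages of HPA

Automated management: the HPA adjusts the number of Pods to the actual load, reducing manual intervention and operations cost.
Elastic scaling: the application scales out and in as traffic changes, which keeps it stable and available.
Resource optimization: by adjusting the number of Pods dynamically, the HPA keeps resources well utilized and avoids both waste and performance degradation.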

Scaling on CPU and memory metrics

Out of the box, the Kubernetes HPA supports resource metrics such as CPU and memory, for example:

HPA on CPU

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: y-install-hpa-c
  namespace: public-service
  labels:
    app: y-install-hpa-c
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: y-install
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 80

HPA on memory

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: y-install-hpa-m
  namespace: public-service
  labels:
    app: y-install-hpa-m
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: y-install
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: memory
        targetAverageUtilization: 80

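In some scenarios, however, we need more tailored metrics, for example:

Request QPS of an API
Average API response time
Number of connections in the database connection pool
Other business-specific metrics...

To scale on metrics like these, we have to answer two questions:

How do we collect custom metrics such as QPS from the business Pods?
Once collected, how do we feed them to the HPA?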

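Collecting custom business metrics

Prometheus has become the de facto monitoring standard for cloud-native applications. It is also available for Kubernetes through the Prometheus Operator, and the Prometheus client SDKs for the various languages make integration straightforward.

The Prometheus integration itself is not covered here. For details, see: 微服务监控之Prometheus golang实践篇：从0开始搭建业务监控 - 掘金 (juejin.cn)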

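Connecting the collected metrics to the HPA

Once Prometheus is collecting the custom business metrics, they have to be made available to the Kubernetes HPA. Kubernetes defines the Custom Metrics API for this purpose.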

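What is the Custom Metrics API?

The Custom Metrics API is the Kubernetes mechanism that lets the HPA scale on custom data. Unlike traditional autoscaling on resource utilization (CPU and memory), it lets users define and scale on metrics that reflect what the application actually needs.

What the Custom Metrics API is for

More kinds of metrics: resource-based HPA only covers basic utilization metrics, whereas the Custom Metrics API can carry any custom metric, such as queue length or response time.
Finer-grained scaling policies: users can define more flexible and precise scaling policies that fit complex application scenarios.
Better resource utilization: scaling on real business demand uses resources more efficiently and improves performance.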

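Architecture and components of the Custom Metrics API

Prometheus Adapter: an implementation of the Kubernetes Custom Metrics API backed by Prometheus. It translates metrics API requests into Prometheus queries and returns the results in the Custom Metrics API format, so that Kubernetes can scale horizontally on metrics collected by Prometheus.
Metrics Server: the official Kubernetes component that collects and exposes resource usage (CPU and memory) for Pods and nodes through the resource metrics API (metrics.k8s.io). It does not serve custom metrics, so scaling on business metrics needs an additional component such as Prometheus Adapter.
Custom metrics API server: an extension API server registered with the Kubernetes API aggregation layer that serves the Custom Metrics API (custom.metrics.k8s.io) to the HPA controller and other clients. In this setup, Prometheus Adapter plays this role.

The rest of this article focuses on scaling the HPA on custom metrics through Prometheus Adapter.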

Prometheus Adapter

Prometheus Adapter is a Kubernetes add-on that exposes Prometheus monitoring data through the Kubernetes metrics APIs, so that other components in the cluster can consume it. Its main roles are:

It makes Prometheus data queryable through the Kubernetes API. For example, a HorizontalPodAutoscaler (HPA) can reference a Prometheus-backed metric and adjust the application's replica count from it.

It lets any client of the metrics APIs, such as the HPA controller or kubectl, obtain up-to-date monitoring data; the adapter queries the Prometheus server on their behalf.
It supports custom metric conversion rules, which reshape Prometheus data into metrics better suited to autoscaling.

In short, Prometheus Adapter integrates Prometheus monitoring data with Kubernetes so that cluster components can act on it, which improves the reliability and availability of the system.

GitHub repository: https://github.com/kubernetes-sigs/prometheus-adapter

Installation

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update

# registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.11.2 cannot be pulled from mainland China, so render the chart to a template instead of installing it directly
#$ helm install my-release prometheus-community/prometheus-adapter

$ helm template prometheus-adapter prometheus-community/prometheus-adapter --namespace common > prometheus-adapter.yaml

Editing the configuration

Set the Prometheus address by changing --prometheus-url to the address of your Prometheus service.

$ vim prometheus-adapter.yaml
spec:
  serviceAccountName: prometheus-adapter
  containers:
  - name: prometheus-adapter
    image: "docker.io/wsnbwz/prometheus-adapter:v0.11.1"
    imagePullPolicy: IfNotPresent
    args:
    - /adapter
    - --secure-port=6443
    - --cert-dir=/tmp/cert
    - --prometheus-url=http://172.16.36.192:9090
    - --metrics-relist-interval=1m
    - --v=4
    - --config=/etc/adapter/config.yaml

Alternatively, supply the configuration through a values file:

$ vi value.yaml
 
prometheus:
  url: http://172.16.36.192
  port: 9090
  path: ""

# Pass the settings with --values
helm install prometheus-adapter prometheus-community/prometheus-adapter --values value.yaml --namespace monitoring

After installation, list all custom metrics with the command below.

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"

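Hands-on: scaling the HPA on business QPS

Before starting, make sure that:

Prometheus is deployed and the business metrics are already being scraped.
Prometheus Adapter is deployed, so that the Custom Metrics API can query the data in Prometheus.

Metrics exposed by the service

Below are the metrics the demo service already exposes; Prometheus is configured to scrape them.

request_duration_count is a counter that records the cumulative number of requests per endpoint. QPS is derived from it, so we use it as the basis for the HPA.

# Query the service's metrics endpoint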
curl http://172.16.66.202:9099/metrics

# HELP request_duration_count 
# TYPE request_duration_count counter
request_duration_count{path="/api/v1/init"} 32
request_duration_count{path="/api/v1/ip"} 282
request_duration_count{path="/api/v1/open"} 1
# HELP request_duration_latency_ms 
# TYPE request_duration_latency_ms counter
request_duration_latency_ms{path="/api/v1/init"} 1433
request_duration_latency_ms{path="/api/v1/ip"} 0
request_duration_latency_ms{path="/api/v1/open"} 181
# HELP request_latency_bucket 
# TYPE request_latency_bucket histogram
request_latency_bucket_bucket{le="5"} 282
request_latency_bucket_bucket{le="10"} 282
request_latency_bucket_bucket{le="20"} 282
request_latency_bucket_bucket{le="40"} 308
request_latency_bucket_bucket{le="80"} 312
request_latency_bucket_bucket{le="160"} 312
request_latency_bucket_bucket{le="320"} 315
request_latency_bucket_bucket{le="640"} 315
request_latency_bucket_bucket{le="1280"} 315
request_latency_bucket_bucket{le="2560"} 315
request_latency_bucket_bucket{le="5120"} 315
request_latency_bucket_bucket{le="10240"} 315
request_latency_bucket_bucket{le="+Inf"} 315
request_latency_bucket_sum 1614
request_latency_bucket_count 315

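Viewing the custom metric

Query the request counter above through the k8s metrics API.

Because Prometheus Adapter is installed, the request below is forwarded to the adapter, which queries Prometheus directly. If no data comes back at this step, check that the metrics collection set up earlier is working.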

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/public-service/pods/*/request_duration_count"

{
    "kind": "MetricValueList",
    "apiVersion": "custom.metrics.k8s.io/v1beta1",
    "metadata": {},
    "items": [
        {
            "describedObject": {
                "kind": "Pod",
                "namespace": "public-service",
                "name": "y-install-5cc86b8667-cfkvp",
                "apiVersion": "/v1"
            },
            "metricName": "request_duration_count",
            "timestamp": "2024-03-22T10:01:15Z",
            "value": "3",
            "selector": null
        },
        {
            "describedObject": {
                "kind": "Pod",
                "namespace": "public-service",
                "name": "y-install-5cc86b8667-qldr5",
                "apiVersion": "/v1"
            },
            "metricName": "request_duration_count",
            "timestamp": "2024-03-22T10:01:15Z",
            "value": "315",
            "selector": null
        }
    ]
}

Two Pods are returned. For Pod y-install-5cc86b8667-qldr5 the value is 315, which is exactly the sum of the request counts of the three endpoints above (32 + 282 + 1).
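
The sum is easy to verify by hand. A minimal sketch, assuming the /metrics output shown earlier and samples without timestamps; `sum_counter` is a hypothetical helper written for this check, not part of any Prometheus library:

```python
def sum_counter(exposition: str, metric: str) -> float:
    """Sum every sample of `metric` in Prometheus text exposition output."""
    total = 0.0
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated field (no timestamps assumed).
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        if name == metric:
            total += float(value)
    return total


SAMPLE = """\
# HELP request_duration_count
# TYPE request_duration_count counter
request_duration_count{path="/api/v1/init"} 32
request_duration_count{path="/api/v1/ip"} 282
request_duration_count{path="/api/v1/open"} 1
"""

print(sum_counter(SAMPLE, "request_duration_count"))  # 315.0
```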

The HPA can now be configured:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: y-install-hpa-http-qps
  namespace: public-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    name: y-install
    kind: Deployment
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Pods
      pods:
        metricName: request_duration_count
        targetAverageValue: 1000

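metricName: replace with the name of your metric.

targetAverageValue: the target for the metric averaged across all Pods; when the average exceeds it, the HPA scales out.

In the example above, the Pod values total 3 + 315 = 318, so the average is 318 / 2 = 159.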
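
The averaging can be reproduced from the API response. This is only a sketch: `average_pod_value` is a made-up helper, the response is trimmed to the fields it reads, and it assumes plain integer values rather than suffixed quantities.

```python
import json


def average_pod_value(metric_value_list: str) -> float:
    """Average the "value" fields of a MetricValueList response."""
    items = json.loads(metric_value_list)["items"]
    # Assumes plain integer strings such as "315"; suffixed quantities
    # like "200m" would need extra parsing.
    values = [float(item["value"]) for item in items]
    return sum(values) / len(values)


# Trimmed copy of the response shown earlier.
RESPONSE = """
{"items": [
  {"describedObject": {"name": "y-install-5cc86b8667-cfkvp"}, "value": "3"},
  {"describedObject": {"name": "y-install-5cc86b8667-qldr5"}, "value": "315"}
]}
"""

average = average_pod_value(RESPONSE)
print(average)          # 159.0
print(average >= 1000)  # False: below targetAverageValue, so no scale-out
```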

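The remaining problem:

The QPS data is a counter, that is, a cumulative value. With the HPA configuration above, scale-out is triggered only once, when the cumulative per-Pod average reaches 1000. To trigger the HPA when QPS reaches some value within the last n minutes, the counter has to be converted into a rate with a custom Prometheus query.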
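
To make the difference between a cumulative count and a rate concrete, here is a rough sketch of deriving requests per second from counter samples. It is a simplification of what the Prometheus rate() function does (no extrapolation to the window edges), and the sample values are invented for illustration.

```python
from typing import List, Tuple


def per_second_rate(samples: List[Tuple[float, float]]) -> float:
    """Average per-second increase of a counter over (timestamp, value) samples."""
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the counter was reset (e.g. the Pod restarted),
        # so the new value itself is the increase since the reset.
        increase += (curr - prev) if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


# Hypothetical scrapes: 12 requests over a 60-second window -> 0.2 req/s
print(per_second_rate([(0, 303), (30, 309), (60, 315)]))
```

Unlike the raw counter, this value falls back to zero when traffic stops, which is what lets the HPA scale in again.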

The adapter supports custom queries for aggregating data.

Updating the adapter configuration

apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: common
data:
  config.yaml: |-
    rules:
    - seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
      resources:
        template: <<.Resource>>
      name:
        matches: "^(.*)_duration_count"
        as: ""
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)

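Configuring matches and metricsQuery

The regular expression ^(.*)_duration_count selects metrics whose names carry the _duration_count suffix. The result of metricsQuery is exposed as a new metric named after the capture group (.*).

Taking request_duration_count as an example, the new metric is named request.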
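
The renaming and query expansion can be imitated outside the adapter. The sketch below mimics the rule with Python's re module and plain string substitution; the adapter itself uses Go regular expressions and templates, and the label matchers passed in are an assumption about what it fills in for a single Pod. `exposed_name` and `render_query` are hypothetical helpers.

```python
import re
from typing import Optional

NAME_PATTERN = re.compile(r"^(.*)_duration_count")
QUERY_TEMPLATE = "sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)"


def exposed_name(series: str) -> Optional[str]:
    """Return the metric name the rule would expose, or None if it does not match."""
    match = NAME_PATTERN.match(series)
    return match.group(1) if match else None


def render_query(series: str, label_matchers: str, group_by: str) -> str:
    """Fill the metricsQuery placeholders with concrete values."""
    return (QUERY_TEMPLATE
            .replace("<<.Series>>", series)
            .replace("<<.LabelMatchers>>", label_matchers)
            .replace("<<.GroupBy>>", group_by))


print(exposed_name("request_duration_count"))       # request
print(exposed_name("request_duration_latency_ms"))  # None
print(render_query(
    "request_duration_count",
    'namespace="public-service",pod="y-install-5cc86b8667-qldr5"',
    "pod",
))
```

The rendered string is an ordinary PromQL query, so it can be pasted into the Prometheus UI to debug a rule that returns no data.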

Apply the updated configuration and restart the adapter:

$ kubectl apply -f prometheus-adapter.yaml
# Restart
$ kubectl rollout restart deployment prometheus-adapter -n common

Querying the request metric

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/public-service/pods/*/request"
{
    "kind": "MetricValueList",
    "apiVersion": "custom.metrics.k8s.io/v1beta1",
    "metadata": {},
    "items": [
        {
            "describedObject": {
                "kind": "Pod",
                "namespace": "public-service",
                "name": "y-install-5cc86b8667-qldr5",
                "apiVersion": "/v1"
            },
            "metricName": "request",
            "timestamp": "2024-03-22T19:26:22Z",
            "value": "0",
            "selector": null
        }
    ]
}

The request metric is currently 0, meaning the QPS over the last minute is 0.

After calling the API a few times, check the request metric again:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/public-service/pods/*/request"
{
    "kind": "MetricValueList",
    "apiVersion": "custom.metrics.k8s.io/v1beta1",
    "metadata": {},
    "items": [
        {
            "describedObject": {
                "kind": "Pod",
                "namespace": "public-service",
                "name": "y-install-5cc86b8667-qldr5",
                "apiVersion": "/v1"
            },
            "metricName": "request",
            "timestamp": "2024-03-22T19:27:41Z",
            "value": "200m",
            "selector": null
        }
    ]
}

The metric is now live: value=200m is Kubernetes quantity notation for 200 milli-units, that is, a QPS of 0.2.
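
For scripts that read these values, the milli suffix has to be converted explicitly. A minimal sketch that handles only plain numbers and the "m" suffix seen in this article; Kubernetes quantities support further suffixes that this hypothetical `parse_quantity` helper deliberately ignores:

```python
def parse_quantity(quantity: str) -> float:
    """Convert a metric value such as "200m" or "3" to a float."""
    if quantity.endswith("m"):
        # "m" means milli: 200m == 200 / 1000
        return int(quantity[:-1]) / 1000
    return float(quantity)


print(parse_quantity("200m"))  # 0.2
print(parse_quantity("600m"))  # 0.6
print(parse_quantity("315"))   # 315.0
```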

HPA configuration

With the request QPS metric in place, adjust the HPA configuration:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: y-install-hpa-http-qps
  namespace: public-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    name: y-install
    kind: Deployment
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Pods
      pods:
        metricName: request
        targetAverageValue: 500m

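metricName is changed to request.

targetAverageValue: 500m means the HPA scales out when the average QPS per Pod exceeds 0.5.

Call the API by hand a few more times and check the metric: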

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/public-service/pods/*/request"
{
    "kind": "MetricValueList",
    "apiVersion": "custom.metrics.k8s.io/v1beta1",
    "metadata": {},
    "items": [
        {
            "describedObject": {
                "kind": "Pod",
                "namespace": "public-service",
                "name": "y-install-5cc86b8667-qldr5",
                "apiVersion": "/v1"
            },
            "metricName": "request",
            "timestamp": "2024-03-22T19:47:41Z",
            "value": "600m",
            "selector": null
        }
    ]
}

The value has reached 600m, above the scale-out threshold we configured.
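
Crossing the target is not quite sufficient on its own: the HPA controller skips scaling while the ratio between the observed and target values stays within a tolerance of 1.0, which is 0.1 by default. The check below is a sketch of that rule with this article's numbers; `should_scale` is a made-up name, and everything else the controller does is left out.

```python
import math


def should_scale(current: float, target: float, tolerance: float = 0.1) -> bool:
    """True when current/target is far enough from 1.0 for the HPA to act."""
    ratio = current / target
    return abs(ratio - 1.0) > tolerance


# 600m observed against a 500m target: ratio 1.2, outside the tolerance band
print(should_scale(0.6, 0.5))      # True
# 520m would give a ratio of 1.04, which the controller ignores
print(should_scale(0.52, 0.5))     # False
# With 1 replica, the ratio of 1.2 rounds up to 2 desired replicas
print(math.ceil(1 * (0.6 / 0.5)))  # 2
```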

Viewing the deployment events

# Check the scale-out
kubectl describe deployment y-install -n public-service

Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  101s  deployment-controller  Scaled up replica set y-install-5cc86b8667 to 2

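The scale-out rule fired and a new Pod was started. This completes the QPS-based HPA scale-out flow.

Summary

With Prometheus Adapter we implemented HPA on a custom business metric. This example covers only QPS, but because the adapter supports custom queries, it is easy to extend the approach to other needs, such as average response time or database connection pool size.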

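Miscellaneous

How often does the HPA query custom metrics?

By default, the HorizontalPodAutoscaler (HPA) controller checks metrics and makes a scaling decision every 15 seconds. The interval is set by the --horizontal-pod-autoscaler-sync-period flag of kube-controller-manager, which defaults to 15 seconds.

To change how often the HPA checks metrics, adjust this flag in the startup arguments of kube-controller-manager, where the HPA controller runs.

Note that the actual autoscaling delay depends on several factors, including how often metrics are collected, how quickly the metrics source responds, and the load on the cluster. Even after changing the sync period, the observed delay may therefore vary.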

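HPA configuration on newer versions

This example runs on Kubernetes 1.21 and uses the older autoscaling/v2beta1 API. That API was removed in Kubernetes 1.25, so on 1.25 and later use autoscaling/v2 (available since 1.23):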

```
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2
metadata:
  name: sample-app
spec:
  scaleTargetRef:
    # point the HPA at the sample application
    # you created above
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  # autoscale between 1 and 10 replicas
  minReplicas: 1
  maxReplicas: 10
  metrics:
  # use a "Pods" metric, which takes the average of the
  # given metric across all pods controlled by the autoscaling target
  - type: Pods
    pods:
      # use the metric that you used above: pods/http_requests
      metric:
        name: http_requests
      # target 500 milli-requests per second,
      # which is 1 request every two seconds
      target:
        type: AverageValue
        averageValue: 500m
```
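
When migrating existing manifests, the metric entries are the part that changes shape. The sketch below shows the field mapping for the two metric types used in this article: metricName and targetAverageValue move under metric.name and target.averageValue, and targetAverageUtilization moves under target.averageUtilization. `to_v2_metric` is a hypothetical helper operating on already-parsed YAML, not a Kubernetes tool.

```python
def to_v2_metric(old: dict) -> dict:
    """Convert one autoscaling/v2beta1 metric entry to its autoscaling/v2 shape."""
    if old["type"] == "Pods":
        pods = old["pods"]
        return {
            "type": "Pods",
            "pods": {
                "metric": {"name": pods["metricName"]},
                "target": {"type": "AverageValue",
                           "averageValue": pods["targetAverageValue"]},
            },
        }
    if old["type"] == "Resource":
        resource = old["resource"]
        return {
            "type": "Resource",
            "resource": {
                "name": resource["name"],
                "target": {"type": "Utilization",
                           "averageUtilization": resource["targetAverageUtilization"]},
            },
        }
    raise ValueError(f"unsupported metric type: {old['type']}")


# The QPS metric from this article, as it appears in the v2beta1 manifest.
print(to_v2_metric({
    "type": "Pods",
    "pods": {"metricName": "request", "targetAverageValue": "500m"},
}))
```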