Horizontal Pod Autoscaler (HPA) in Kubernetes
Overview of Kubernetes Horizontal Pod Autoscaler with example
Horizontal Pod Autoscaler:
The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization (or custom metrics). Note that horizontal Pod autoscaling does not apply to objects that can’t be scaled, such as DaemonSets.
The Horizontal Pod Autoscaler is implemented as a control loop, with a period controlled by the controller manager’s --horizontal-pod-autoscaler-sync-period flag (with a default value of 15s).
During each period, the controller manager queries the resource utilization against the metrics specified in each HorizontalPodAutoscaler definition.
The autoscaler accesses corresponding scalable controllers (such as replication controllers, deployments, and replica sets) by using the scale sub-resource. Scale is an interface that allows you to dynamically set the number of replicas and examine each of their current states.
From the most basic perspective, the Horizontal Pod Autoscaler controller operates on the ratio between desired metric value and current metric value:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
For example, if the current metric value is 400m and the desired value is 100m, the number of replicas will be quadrupled, since 400 / 100 == 4 (a single replica becomes four).
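The arithmetic can be checked at a shell prompt. The sketch below is only an illustration of the ratio formula: the function name desired_replicas is hypothetical, metric values are assumed to be plain integers in the same unit (for example millicores), and refinements of the real controller such as its tolerance and the min/max bounds are left out.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_METRIC DESIRED_METRIC
# Hypothetical helper for illustration; not part of kubectl.
desired_replicas() {
  # Integer ceiling of (currentReplicas * currentMetricValue / desiredMetricValue)
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

desired_replicas 1 400 100   # prints 4: one replica at 400m against a 100m target
desired_replicas 3 400 100   # prints 12: the same ratio applied to three replicas
desired_replicas 4 50 100    # prints 2: load at half the target halves the replicas
```

Because the result is rounded up, any fractional replica count is turned into one extra Pod rather than one fewer.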
Hands-on example:
・Installing metrics-server
The Horizontal Pod Autoscaler collects metrics from aggregated APIs (metrics.k8s.io, custom.metrics.k8s.io, and external.metrics.k8s.io). The metrics.k8s.io API, which supplies the CPU metrics used in this example, is served by metrics-server, so we need to deploy metrics-server in the cluster separately.
master $ wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml
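Edit the file above and add two arguments to the metrics-server container: --kubelet-preferred-address-types=InternalIP and --kubelet-insecure-tls (both appear in the components.yaml excerpts under “Getting errors?”).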
Depending on your cluster setup, you may also need to change flags passed to the Metrics Server container. Most useful flags:
--kubelet-preferred-address-types - The priority of node address types used when determining an address for connecting to a particular node (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
--kubelet-insecure-tls - Do not verify the CA of serving certificates presented by Kubelets. For testing purposes only.
--requestheader-client-ca-file - Specify a root certificate bundle for verifying client certificates on incoming requests.
# Run the following command to create metrics-server
master $ kubectl create -f components.yaml
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
Deploy our sample “php-apache” application
master $ kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
deployment.apps/php-apache created
service/php-apache created
Create our first Horizontal Pod Autoscaler
master $ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
horizontalpodautoscaler.autoscaling/php-apache autoscaled
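To make the flags of the autoscale command concrete, the following shell sketch applies the ratio formula to CPU utilization and then clamps the result to --min and --max. It is a simplified illustration, not the controller's actual code: the function name hpa_target_replicas and its argument order are hypothetical, utilization is assumed to be average CPU usage divided by the container's CPU request (200m for php-apache), and details such as tolerance and Pod readiness are ignored.

```shell
# hpa_target_replicas CURRENT AVG_USAGE_M REQUEST_M TARGET_PCT MIN MAX
# Hypothetical helper, not a kubectl feature. CPU values are in millicores.
hpa_target_replicas() {
  num=$(( $1 * $2 * 100 ))            # currentReplicas * average usage, scaled to percent
  den=$(( $3 * $4 ))                  # CPU request * target utilization percent
  want=$(( (num + den - 1) / den ))   # integer ceiling of num / den
  if [ "$want" -lt "$5" ]; then want=$5; fi   # never below --min
  if [ "$want" -gt "$6" ]; then want=$6; fi   # never above --max
  echo "$want"
}

# 200m request with --cpu-percent=50 --min=1 --max=10
hpa_target_replicas 1 500 200 50 1 10   # prints 5: 250% utilization against a 50% target
hpa_target_replicas 5 500 200 50 1 10   # prints 10: 25 wanted, capped by --max
hpa_target_replicas 3 0 200 50 1 10     # prints 1: idle Pods, floored by --min
```

With a 200m request and a 50% target, the autoscaler is in effect aiming for an average of about 100m of CPU per Pod.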
Let’s generate some load (CPU-intensive load)
master $ kubectl run -it --rm load-generator --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # while true; do wget -q -O- http://php-apache; done
Give it a couple of minutes and you can see the HPA in action:
[Figure: replicas created by the HPA, with load per Pod before and after scaling]
Now stop the load generator (Ctrl+C) and wait about 5 minutes (the cooldown period); the HPA then scales the deployment back down.
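The 5-minute wait corresponds to the default scale-down stabilization window of 300 seconds. As a rough sketch of the idea, assuming the controller remembers the replica counts it recommended during the window and scales down only to the highest of them, the hypothetical helper stabilized_replicas below picks that value from a list of recent recommendations.

```shell
# stabilized_replicas REC...
# Takes the replica counts recommended during the stabilization window
# and prints the highest one. Hypothetical helper for illustration only.
stabilized_replicas() {
  highest=$1
  for rec in "$@"; do
    if [ "$rec" -gt "$highest" ]; then highest=$rec; fi
  done
  echo "$highest"
}

stabilized_replicas 7 7 4 2 1   # prints 7: a recent high recommendation still holds
stabilized_replicas 2 1 1 1     # prints 2: the higher values have aged out of the window
```

With the default 15s sync period, a 300-second window covers roughly 20 recommendations, which is why the replica count stays high for a while after the load stops.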
We have successfully completed our HPA example 😃
Scaling speed can be tuned through the behavior field of the HPA spec. The default values, shown here, match the existing behavior in the HPA algorithm.
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max
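One way to read the default scaleUp policies is as an upper bound on the replica count after each 15-second period. The sketch below is a simplified interpretation rather than the controller's implementation, and the function name max_scale_up is hypothetical: the Percent policy allows adding up to 100% of the current replicas, the Pods policy allows adding up to 4 Pods, and selectPolicy: Max takes whichever permits the larger change.

```shell
# max_scale_up CURRENT_REPLICAS
# Upper bound on replicas after one 15-second period under the default
# scaleUp policies. Hypothetical helper for illustration only.
max_scale_up() {
  by_percent=$(( $1 * 2 ))   # Percent policy, value 100: add up to 100% of current
  by_pods=$(( $1 + 4 ))      # Pods policy, value 4: add up to 4 Pods
  # selectPolicy: Max picks whichever policy permits the larger change
  if [ "$by_percent" -gt "$by_pods" ]; then
    echo "$by_percent"
  else
    echo "$by_pods"
  fi
}

max_scale_up 1    # prints 5: the Pods policy wins
max_scale_up 4    # prints 8: both policies agree
max_scale_up 10   # prints 20: the Percent policy wins
```

The Pods policy matters for small deployments, where doubling alone would grow the replica count slowly; the result is still capped by the HPA's maximum replica setting.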
Getting errors?
“the HPA was unable to compute replica count: missing request for cpu”
Make sure the containers in your Pod specification declare a CPU request; the HPA computes utilization as a percentage of the requested CPU:
spec:
  containers:
  - name: php-apache
    image: k8s.gcr.io/hpa-example
    ports:
    - containerPort: 80
    resources:
      limits:
        cpu: 500m
      requests:
        cpu: 200m
unable to fetch pod metrics — x509: cannot validate certificate
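Make sure you have passed the “--kubelet-insecure-tls” argument to metrics-server (for testing purposes only, since it skips verification of the kubelet serving certificates):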
# components.yaml
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
  imagePullPolicy: IfNotPresent
  args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP
  - --kubelet-insecure-tls
unable to fetch metrics from Kubelet node01 (node01): no such host
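Make sure you have passed the “--kubelet-preferred-address-types=InternalIP” argument, so that metrics-server connects to each node by its internal IP rather than by a hostname it cannot resolve: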
# components.yaml
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
  imagePullPolicy: IfNotPresent
  args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP
  - --kubelet-insecure-tls
unable to get metrics for resource cpu: unable to (get pods.metrics.k8s.io)
Make sure you have metrics-server up and running
master $ wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml
# Make the required changes (add the arguments)
master $ kubectl create -f components.yaml
master $ kubectl get po -n kube-system
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-8499f4d68c-2fn9r   1/1     Running   0          4s
Clean-up time
master $ kubectl delete -f components.yaml
master $ kubectl delete -f https://k8s.io/examples/application/php-apache.yaml
master $ kubectl delete hpa php-apache