Horizontal Pod Autoscaler (HPA) in Kubernetes

Bhargav Shah
5 min read · Sep 26, 2020

Overview of Kubernetes Horizontal Pod Autoscaler with example

Horizontal Pod Autoscaler:

The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization (or custom metrics). Note that Horizontal Pod Autoscaling does not apply to objects that can’t be scaled, for example, DaemonSets.

The Horizontal Pod Autoscaler is implemented as a control loop, with a period controlled by the controller manager’s --horizontal-pod-autoscaler-sync-period flag (with a default value of 15s).

During each period, the controller manager queries the resource utilization against the metrics specified in each HorizontalPodAutoscaler definition.

The autoscaler accesses corresponding scalable controllers (such as replication controllers, deployments, and replica sets) by using the scale sub-resource. Scale is an interface that allows you to dynamically set the number of replicas and examine each of their current states.
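As a quick illustration, assuming a deployment named php-apache in the default namespace (the one we deploy later in this post), you can read the scale subresource directly through the API:

master $ kubectl get --raw /apis/apps/v1/namespaces/default/deployments/php-apache/scale
# returns a Scale object: .spec.replicas is the desired count, .status.replicas the current one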

From the most basic perspective, the Horizontal Pod Autoscaler controller operates on the ratio between desired metric value and current metric value:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

For example, if the current metric value is 400m and the desired value is 100m, the number of replicas will be quadrupled, since 400.0 / 100.0 == 4.0 (with 2 current replicas, desiredReplicas = ceil[2 * 4.0] = 8).

Hands-on example:

• Installing metrics-server

As the Horizontal Pod Autoscaler collects metrics from aggregated APIs (metrics.k8s.io, custom.metrics.k8s.io, and external.metrics.k8s.io), and the metrics.k8s.io API is typically served by metrics-server, we need to deploy metrics-server in the cluster separately.

master $ wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml

Edit the above file and add two arguments (--kubelet-insecure-tls and --kubelet-preferred-address-types=InternalIP), as shown in the snippet after this list.

Depending on your cluster setup, you may also need to change flags passed to the Metrics Server container. Most useful flags:

  • --kubelet-preferred-address-types - The priority of node address types used when determining an address for connecting to a particular node (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
  • --kubelet-insecure-tls - Do not verify the CA of serving certificates presented by Kubelets. For testing purposes only.
  • --requestheader-client-ca-file - Specify a root certificate bundle for verifying client certificates on incoming requests.
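After the edit, the metrics-server container section of components.yaml should look roughly like this (image and ports as in the v0.3.7 manifest; the last two arguments are the ones we added):

# components.yaml (metrics-server container, after editing)
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
  imagePullPolicy: IfNotPresent
  args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP
  - --kubelet-insecure-tls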
# run following command - this will create metrics-server
master $ k create -f components.yaml
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
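Before moving on, it is worth confirming that metrics-server is up and that the resource metrics API responds (kubectl top relies on it; it can take a minute or two after deployment before metrics appear):

master $ kubectl get deployment metrics-server -n kube-system
master $ kubectl top nodes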

Deploy our sample “php-apache” application

master $ kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
deployment.apps/php-apache created
service/php-apache created
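Optionally, confirm that the deployment and its service are up before wiring in the autoscaler:

master $ kubectl get deployment,svc php-apache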

Create our first Horizontal Pod Autoscaler

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
horizontalpodautoscaler.autoscaling/php-apache autoscaled
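If you prefer a declarative manifest over the imperative command, a roughly equivalent HorizontalPodAutoscaler (autoscaling/v1) would look like the sketch below; apply it with kubectl apply -f instead of running kubectl autoscale:

# php-apache-hpa.yaml (equivalent to the kubectl autoscale command above)
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50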

Let’s generate some load (CPU-intensive load):

master $ kubectl run -it --rm load-generator --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # while true; do wget -q -O- http://php-apache; done

Now give it a couple of minutes and you can see the HPA in action:

[Figure: HPA-created replicas — load per pod before and after]
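In a separate terminal, you can watch the autoscaler and the deployment react as the load arrives (the TARGETS and REPLICAS columns of the HPA change over time):

master $ kubectl get hpa php-apache --watch
master $ kubectl get deployment php-apache --watch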

Now stop the load generator and wait about 5 minutes (the scale-down cooldown period).
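While the deployment scales back down, the HPA’s events record each scaling decision and the reason for it:

master $ kubectl describe hpa php-apache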

We have successfully completed our HPA example 😃

That 5-minute cooldown is the HPA’s scale-down stabilization window, which is configurable through the behavior field (autoscaling/v2beta2 and later). The default values below match the existing behavior in the HPA algorithm:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max
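As a sketch (assuming the same php-apache deployment and a Kubernetes version that supports autoscaling/v2beta2), a custom behavior section can be attached to the HPA like this, here only lengthening the scale-down window:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600

Apply it with kubectl apply -f and inspect the result with kubectl describe hpa php-apache; everything not overridden keeps the defaults shown above.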

Getting errors?

“the HPA was unable to compute replica count: missing request for cpu”

Make sure you have specified CPU resource requests (and optionally limits) in your pod spec; the CPU utilization percentage is calculated against the requested CPU:

spec:
  containers:
  - name: php-apache
    image: k8s.gcr.io/hpa-example
    ports:
    - containerPort: 80
    resources:
      limits:
        cpu: 500m
      requests:
        cpu: 200m
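If the deployment is already running, one way to add the requests without editing the manifest is kubectl set resources (the values here mirror the spec above):

master $ kubectl set resources deployment php-apache -c=php-apache --requests=cpu=200m --limits=cpu=500m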

unable to fetch pod metrics — x509: cannot validate certificate

Make sure you have passed the “--kubelet-insecure-tls” argument to the metrics-server container:

# components.yaml
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
  imagePullPolicy: IfNotPresent
  args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP
  - --kubelet-insecure-tls

unable to fetch metrics from Kubelet node01 (node01): no such host

Make sure you have passed the “--kubelet-preferred-address-types=InternalIP” argument:

# components.yaml
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
  imagePullPolicy: IfNotPresent
  args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP
  - --kubelet-insecure-tls

unable to get metrics for resource cpu: unable to (get pods.metrics.k8s.io)

Make sure metrics-server is up and running:

master $ wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml
# Do the required changes (add the arguments)
master $ kubectl create -f components.yaml
master $ kubectl get po -n kube-system
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-8499f4d68c-2fn9r   1/1     Running   0          4s
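Once the metrics-server pod is Running, you can confirm that the resource metrics API is answering (pod metrics may take a minute or so to appear):

master $ kubectl get apiservice v1beta1.metrics.k8s.io
master $ kubectl top pods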

Clean-up time

master $ kubectl delete -f components.yaml
master $ kubectl delete -f https://k8s.io/examples/application/php-apache.yaml
master $ kubectl delete hpa php-apache

Thank you for reading 😃
