Health Checks
To verify whether a container in a pod is healthy and ready to serve traffic, Kubernetes provides a range of health-checking mechanisms. Health checks, or probes as they are called in Kubernetes, are carried out by the kubelet to determine when to restart a container (liveness probes), and are used by services and deployments to determine whether a pod should receive traffic (readiness probes).
In what follows, we will focus on HTTP health checks. Note that it is the responsibility of the application developer to expose a URL that the kubelet can use to determine whether the container is healthy (and potentially ready).
Let’s create a pod that exposes an endpoint /health, responding with an HTTP 200 status code:
$ kubectl apply -f https://github.com/openshift-evangelists/kbe/raw/main/specs/healthz/pod.yaml
In the pod specification we’ve defined the following:
livenessProbe:
  initialDelaySeconds: 2
  periodSeconds: 5
  httpGet:
    path: /health
    port: 9876
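For reference, the remainder of the referenced pod.yaml amounts to a standard single-container pod. A minimal equivalent manifest might look like this (a sketch; the pod name, container name, image, and port are taken from the describe output shown below):

apiVersion: v1
kind: Pod
metadata:
  name: hc
spec:
  containers:
  - name: sise
    image: quay.io/openshiftlabs/simpleservice:0.5.0
    ports:
    - containerPort: 9876
    livenessProbe:
      initialDelaySeconds: 2
      periodSeconds: 5
      httpGet:
        path: /health
        port: 9876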
The configuration above tells Kubernetes to start probing the /health endpoint after an initial delay of 2 seconds, and to repeat the check every 5 seconds.
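Two additional fields are implicitly at work here: timeoutSeconds, how long the kubelet waits for a response (default 1 second), and failureThreshold, how many consecutive failures trigger a restart (default 3). With those defaults spelled out (a sketch, equivalent to the snippet above; the same values reappear as timeout=1s and #failure=3 in the describe output below), the probe reads:

livenessProbe:
  initialDelaySeconds: 2
  periodSeconds: 5
  timeoutSeconds: 1      # default: fail the check if no response within 1 second
  failureThreshold: 3    # default: restart after 3 consecutive failures
  httpGet:
    path: /health
    port: 9876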
If we now look at the pod we can see that it is considered healthy:
$ kubectl describe pod hc
The following (truncated) output shows the relevant sections:
Name:         hc
Namespace:    default
Priority:     0
Node:         minikube/192.168.39.51
...
Containers:
  sise:
    Container ID:   docker://2cfe4187808a89ae4731abfe242ac42611e1f658505691f540ac31ca8f6ce86f
    Image:          quay.io/openshiftlabs/simpleservice:0.5.0
    ...
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:9876/health delay=2s timeout=1s period=5s #success=1 #failure=3
    Environment:    <none>
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
...
Now launch a bad pod, whose container will randomly (in the time range of 1 to 4 seconds) fail to return a 200 status code:
$ kubectl apply -f https://github.com/openshift-evangelists/kbe/raw/main/specs/healthz/badpod.yaml
Looking at the events of the bad pod, we can see that the health check failed:
$ kubectl describe pod badpod
In particular, look at the events section at the bottom:
Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  24s               default-scheduler  Successfully assigned default/badpod to minikube
  Normal   Pulled     22s               kubelet            Container image "quay.io/openshiftlabs/simpleservice:0.5.0" already present on machine
  Normal   Created    22s               kubelet            Created container sise
  Normal   Started    22s               kubelet            Started container sise
  Warning  Unhealthy  9s (x3 over 19s)  kubelet            Liveness probe failed: Get "http://172.17.0.4:9876/health": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Normal   Killing    9s                kubelet            Container sise failed liveness probe, will be restarted
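The "context deadline exceeded" message shows the probe timed out: the kubelet waited timeoutSeconds (1 second by default) without getting a response. After three consecutive failures (the x3 over 19s in the Age column, matching the default failureThreshold of 3), the container was killed and restarted. If slow responses were expected rather than a defect, the probe could be relaxed along these lines (a sketch, not part of the referenced badpod.yaml):

livenessProbe:
  initialDelaySeconds: 2
  periodSeconds: 5
  timeoutSeconds: 5    # allow up to 5 seconds per response instead of the 1-second default
  httpGet:
    path: /health
    port: 9876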
This can also be verified with the get subcommand:
$ kubectl get pods
Notice that badpod has been restarted multiple times because of the failing health checks.
NAME     READY   STATUS    RESTARTS   AGE
badpod   1/1     Running   2          109s
hc       1/1     Running   0          11m
In addition to a liveness probe, you can also specify a readiness probe. Readiness probes are configured in the same way, but have different use cases and semantics: rather than triggering a restart on failure, a readiness probe indicates whether the application is running and able to receive traffic.
Let’s create a pod with a readiness probe that reports success after 10 seconds:
$ kubectl apply -f https://github.com/openshift-evangelists/kbe/raw/main/specs/healthz/ready.yaml
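The referenced ready.yaml is not reproduced here, but a readiness section producing the described behavior would look something like this (a sketch, assuming the same /health endpoint; the 10-second initial delay is what makes the pod report ready only after 10 seconds):

readinessProbe:
  initialDelaySeconds: 10
  httpGet:
    path: /health
    port: 9876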
Looking at the events of the pod, we can see that, eventually, the pod is ready to serve traffic:
$ kubectl describe pod ready
Depending on how quickly you ran the describe command, you may have caught the pod while it was still reporting that it was not ready to receive traffic:
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
You can remove all of the created pods with:
$ kubectl delete pod/hc pod/ready pod/badpod
Learn more about configuring probes, including TCP and command probes, in the documentation.
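For orientation, the two other probe types mentioned above look like this (a sketch; the port and command are illustrative, the field names are standard Kubernetes):

# TCP probe: the container is healthy if the kubelet can open a TCP connection to the port
livenessProbe:
  tcpSocket:
    port: 9876
  periodSeconds: 5

# Command (exec) probe: the container is healthy if the command exits with status 0
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  periodSeconds: 5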