Why does CrashLoopBackOff occur?
The CrashLoopBackOff error can occur for various reasons, including:
- Insufficient resources — lack of resources prevents the container from loading
- Locked file/database/port — a resource already locked by another container
- No proper reference/configuration — a reference to scripts or binaries that are not present in the container, or a misconfiguration in the underlying system such as a read-only filesystem
- Config loading/setup error — the server cannot load its configuration file, or an initial setup step such as an init container fails
- Connection issues — DNS or kube-dns cannot reach an external service
- Downstream service — a downstream service the application relies on (database, backend, etc.) can't be reached or the connection fails
- Liveness probes — a liveness probe is misconfigured, or the probe fails for some reason
- Port already in use — two or more containers try to bind the same port, which doesn't work when they belong to the same Pod, since containers in a Pod share a network namespace
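Whatever the underlying cause, the symptom shows up in the pod list first. As a quick sketch (assuming kubectl is configured against your cluster), you can filter for pods currently in this state:

```shell
# List pods in every namespace and keep only those in CrashLoopBackOff.
# The STATUS column shows the current waiting reason for each pod.
kubectl get pods --all-namespaces | grep CrashLoopBackOff
```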
How to Diagnose CrashLoopBackOff
To troubleshoot, the best way to identify the root cause is to work through the list of potential causes and check them one by one, starting with the easiest. It also helps to have a good understanding of the environment: what the configuration is, which ports are used, whether there are any mount points, how the probes are configured, and so on.
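A minimal sketch of how to pull those environment details with kubectl's JSONPath output (the pod name myapp-pod is hypothetical; substitute your failing pod):

```shell
POD=myapp-pod   # hypothetical pod name

# Ports each container listens on
kubectl get pod "$POD" -o jsonpath='{.spec.containers[*].ports}'

# Volume mount points
kubectl get pod "$POD" -o jsonpath='{.spec.containers[*].volumeMounts}'

# Configured liveness probes
kubectl get pod "$POD" -o jsonpath='{.spec.containers[*].livenessProbe}'
```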
Back Off Restarting Failed Container
As a first troubleshooting step, collect details about the issue by running kubectl describe pod [name]. Suppose the pod is failing with events such as Liveness probe failed and Back-off restarting failed container.
If you get the back-off restarting failed container message, you may be dealing with a temporary resource overload caused by an activity spike. One solution is to increase periodSeconds or timeoutSeconds on the probe, giving the application a longer window of time to respond.
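As a sketch, those probe timings can be raised in place with a JSON patch (the Deployment name myapp and the values shown are assumptions; adjust them to your workload):

```shell
# Give the liveness probe a longer window: probe every 20s and allow
# 5s per probe attempt before it counts as a failure.
kubectl patch deployment myapp --type=json -p='[
  {"op": "replace", "path": "/spec/template/spec/containers/0/livenessProbe/periodSeconds",  "value": 20},
  {"op": "replace", "path": "/spec/template/spec/containers/0/livenessProbe/timeoutSeconds", "value": 5}
]'
```

Patching the Deployment rather than the Pod makes the change persist across restarts, since the controller recreates pods from the template.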
Check the logs
If the previous step does not provide enough detail to identify the cause, the next step is to pull a fuller account of what is happening from the failing pod.
For that, run kubectl get pods to identify the pod exhibiting the CrashLoopBackOff error. You can then run the following command to get the pod's log: