06. Exercise: Debugging
As with coding itself, once you have launched your app with Kubernetes, it's likely you will need to do some debugging to get everything working properly. Here, you'll do some debugging with an example app to build your skills with Kubernetes.
Instructions - Pod Issues
Let's say you have deployed a Kubernetes app, but the pod does not seem to be running.
- First, use `kubectl get pods` to list your pods along with their names and statuses. You may notice the pod with an issue is shown with a `Pending` status instead of `Running`.
- Using the `NAME` of the specific pod from the first step, use `kubectl describe pod {POD NAME}` to get more information about that pod.
- From the output of the above command, search until you find the `Events` header. This should give you a `Reason` and `Message` related to the failure, such as `FailedScheduling`. An issue like this could be due to the necessary resources not being available for the pod, such as insufficient CPU.
- From what we have seen before, `kubectl scale` could be used in such a situation to adjust the number of replicas so that the app fits within the resources available and our `Pending` pod can be scheduled. On the next page, you'll get to see an automated way to scale your apps which improves on the manual functionality of `kubectl scale`.
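The first three steps above can be sketched as a pair of small shell filters. This is a minimal sketch, not part of the exercise itself: the helper names `non_running_pods` and `events_section` are hypothetical, and the filters assume the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, RESTARTS, AGE) and that `Events:` starts a line in the `kubectl describe pod` output.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not kubectl features.

# Print the names of pods whose STATUS column is anything other than "Running".
# Expects the output of `kubectl get pods --no-headers` on stdin
# (assumed columns: NAME READY STATUS RESTARTS AGE).
non_running_pods() {
  awk '$3 != "Running" { print $1 }'
}

# Print everything from the "Events:" header to the end of the input.
# Expects the output of `kubectl describe pod {POD NAME}` on stdin.
events_section() {
  awk '/^Events:/ { found = 1 } found { print }'
}

# Usage against a live cluster (not executed here):
#   kubectl get pods --no-headers | non_running_pods
#   kubectl describe pod "$POD_NAME" | events_section
```

Note that the first filter also lists pods in states such as `Completed`, so its output is a starting point for inspection rather than a list of failures.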
Instructions - Node Issues
In this case, consider a Kubernetes app where the pod is working, but behaving strangely. Alternatively, you may have noticed an issue where no pod will schedule onto a particular node. In either situation, there is likely an issue with the specific node that needs to be debugged. While the overall process is fairly similar to debugging pod issues, the syntax of the commands is slightly different, so let's walk through these.
- First, use `kubectl get nodes` to check the names of the available nodes. You may notice the node with an issue is shown with a `NotReady` status instead of `Ready`.
- Using the `NAME` of the specific node from the first step, use `kubectl describe node {NODE NAME}` to get more information about that node.
- The outputs here can vary quite a bit, but the issue could be caused by a disconnection from the network, some other negative `Event`, excessive resource usage, etc.
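The node steps above can be sketched the same way. Again, this is only an illustration under stated assumptions: the helper names `not_ready_nodes` and `conditions_section` are hypothetical, the first filter assumes the default columns of `kubectl get nodes --no-headers` (NAME, STATUS, ROLES, AGE, VERSION), and the second assumes the `kubectl describe node` output contains a `Conditions:` header followed by indented lines.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not kubectl features.

# Print the names of nodes whose STATUS column is not exactly "Ready".
# Expects the output of `kubectl get nodes --no-headers` on stdin
# (assumed columns: NAME STATUS ROLES AGE VERSION).
# A cordoned node reporting "Ready,SchedulingDisabled" is listed too,
# which is useful when no pod will schedule onto it.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Print the "Conditions:" header and the indented lines that follow it,
# stopping at the next line that starts in the first column.
# Expects the output of `kubectl describe node {NODE NAME}` on stdin.
conditions_section() {
  awk '
    /^Conditions:/ { on = 1; print; next }
    on && /^[^ \t]/ { on = 0 }
    on { print }
  '
}

# Usage against a live cluster (not executed here):
#   kubectl get nodes --no-headers | not_ready_nodes
#   kubectl describe node "$NODE_NAME" | conditions_section
```

Looking at the conditions first is a design choice for this sketch: they summarize node health, such as readiness and resource pressure, in one place before you read through the individual events.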