If a Flexera Kubernetes monitor pod is stuck, it shows a "Pending" status and never proceeds to "Running", as shown in the example below.
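One way to spot this state from the command line is to filter the pod list for the Pending status. The sketch below assumes the default kubectl get pods table layout, where STATUS is the third column. The helper name pending_pods is made up for illustration.

```shell
# pending_pods: hypothetical helper, not part of kubectl or the Flexera tooling.
# Reads `kubectl get pods` table output on stdin and prints the names of pods
# whose STATUS column (assumed to be the third column) is "Pending".
pending_pods() {
  awk 'NR > 1 && $3 == "Pending" { print $1 }'
}

# Usage against a live cluster:
#   kubectl get pods -n flexera | pending_pods
```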
If the monitor pod is stalled, begin troubleshooting by running the describe command on the pod and looking for an event that reports an error or a block to the pod's progress. The most common cause of a stalled monitor pod is a problem allocating its PVC (PersistentVolumeClaim). Below is an example of a monitor pod stalled by its PVC.
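The pod-level check described above can be sketched as follows. Because describe output is long, this example trims it to the Events section, which is where scheduling and volume errors are reported. The helper name events_section is hypothetical, and the pod name is left as a placeholder because it depends on your installation.

```shell
# events_section: hypothetical helper for illustration.
# Reads `kubectl describe` output on stdin and prints everything from the
# "Events:" heading to the end of the output.
events_section() {
  awk '/^Events:/ { show = 1 } show'
}

# Usage against a live cluster (replace <monitor-pod-name> with the name
# reported by `kubectl get pods -n flexera`):
#   kubectl describe pod -n flexera <monitor-pod-name> | events_section
```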
kubectl get persistentvolumeclaims -n flexera
kubectl describe persistentvolumeclaims -n flexera flexera-krm-data-krm-instance-monitor-0
An example output
In the example above, the PVC specification is misconfigured with a storage class that doesn't exist ("TYPO_ERROR"). A typo in the storage class name is a common cause of this issue. Another common cause is a storage class that doesn't support dynamic provisioning.
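A quick check for the typo case is to compare the storage class named in the PVC against the classes that exist in the cluster. The sketch below is one way to do that, not a Flexera-provided tool. The helper name storage_class_exists is hypothetical, and it assumes the -o name output of kubectl get storageclass, which prints one storageclass.storage.k8s.io/<name> entry per line.

```shell
# storage_class_exists: hypothetical helper for illustration.
# Reads `kubectl get storageclass -o name` output on stdin and succeeds only
# if the class name given as the first argument is in that list.
storage_class_exists() {
  wanted="$1"
  # Strip the "storageclass.storage.k8s.io/" resource prefix, then require
  # an exact whole-line match so partial names do not count.
  sed 's|.*/||' | grep -Fx -- "$wanted" >/dev/null
}

# Usage against a live cluster:
#   sc=$(kubectl get persistentvolumeclaims -n flexera \
#          flexera-krm-data-krm-instance-monitor-0 \
#          -o jsonpath='{.spec.storageClassName}')
#   kubectl get storageclass -o name | storage_class_exists "$sc" \
#     || echo "storage class '$sc' was not found in this cluster"
```

For the dynamic provisioning case, the PROVISIONER column of kubectl get storageclass is a reasonable place to start. A class whose provisioner is kubernetes.io/no-provisioner does not create volumes on demand.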
The number of possible errors here is vast, and the process for correcting the issue depends on the specific error. Engage with your Kubernetes platform team to help troubleshoot the error or consult your platform provider's documentation for assistance.
kubectl delete persistentvolumeclaims -n flexera flexera-krm-data-krm-instance-monitor-0
NOTE: Typically, changing the PVC specification in the KRM resource will cause the StatefulSet to be rebuilt. This results in a rebuild of the PVC and a restart of all the pods using the new configuration. However, in the case of a stalled PVC, it is common to see the StatefulSet fail to rebuild the PVC using the new specification. This is a Kubernetes issue that may be resolved in the future. As a workaround, you can delete the PVC before applying the updated storage specification.
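The order of the workaround in the note can be sketched as a small dry run that prints the two commands for review before anything is executed. The function name plan_pvc_rebuild and the manifest file name krm-instance.yaml are placeholders. Substitute the file that holds your updated KRM resource.

```shell
# plan_pvc_rebuild: hypothetical helper that only prints commands; it does not
# change the cluster. Arguments: namespace, PVC name, path to the updated
# KRM resource manifest.
plan_pvc_rebuild() {
  ns="$1"; pvc="$2"; manifest="$3"
  # Step 1: remove the stalled PVC so it can be recreated.
  printf 'kubectl delete persistentvolumeclaims -n %s %s\n' "$ns" "$pvc"
  # Step 2: apply the KRM resource that carries the corrected storage specification.
  printf 'kubectl apply -f %s\n' "$manifest"
}

# krm-instance.yaml is an assumed file name, not one defined by Flexera.
plan_pvc_rebuild flexera flexera-krm-data-krm-instance-monitor-0 krm-instance.yaml
```

Printing the commands first keeps the delete step deliberate, since removing a PVC can also remove the data on the underlying volume, depending on the reclaim policy.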
on Jan 19, 2023 03:49 PM - edited on Mar 10, 2023 12:37 PM by HollyM