Flux has an unfortunate habit of failing a helmrelease with the error “timed out waiting for the condition” without giving any details about which condition it was waiting for.
Here are a few things to try:
- Look for unhealthy pods and check their logs. This works fairly often, but not always: flux might be waiting for a resource other than the pods, or it might be timing out on deleting something.
- Check flux logs for errors. There should be information here, but I haven’t found it especially useful for troubleshooting. Don’t get too caught up in an error that could be a red herring.
flux logs --all-namespaces --level=error
- Delete the helmrelease. This is what I had to do today. I had removed several Kustomizations, which should have led to the helmrelease being deleted, but for some reason flux was timing out on that. When I deleted the helmrelease using kubectl delete, though, it stayed gone and seems to have cleaned up the resources it had been using.
- Disable wait on the helmrelease. You can disable the health checks to view the failing resources. This is recommended on https://fluxcd.io/flux/cheatsheets/troubleshooting/ but I would only use it as a last resort. The health check is usually there for a reason, so removing it could lead to more messiness.
# USE WITH CAUTION, may lead to instability
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: podinfo
  namespace: default
spec:
  install:
    disableWait: true
  upgrade:
    disableWait: true
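
The read-only checks from the list above can be strung together in a small shell function. This is only a sketch: diagnose_helmrelease is a made-up helper name, not a flux command, and it assumes the pods live in the same namespace as the helmrelease, which is not always the case.

```shell
# Sketch only: diagnose_helmrelease is a hypothetical helper, not part of flux.
# Usage: diagnose_helmrelease <helmrelease-name> <namespace>
diagnose_helmrelease() {
  name="$1"
  namespace="$2"

  # Status conditions and events recorded on the helmrelease itself
  kubectl describe helmrelease "$name" -n "$namespace"

  # List pods; look for anything not Running/Ready or with a high restart count
  # (assumes the pods are in the same namespace as the helmrelease)
  kubectl get pods -n "$namespace"

  # Namespace events sorted by time; these often name the resource that is failing
  kubectl get events -n "$namespace" --sort-by=.lastTimestamp

  # Errors logged by the flux controllers
  flux logs --all-namespaces --level=error
}
```

Calling diagnose_helmrelease podinfo default runs each check in turn for a helmrelease named podinfo in the default namespace. Nothing here changes cluster state, so it is safe to run before reaching for kubectl delete or disableWait.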