I was faced with a failed Helm release.
emilyzall@Emilys-MBP datadog-agent % helm history datadog-agent -n datadog
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
1 Thu May 25 09:45:40 2023 deployed datadog-3.25.1 7 Install complete
2 Tue Jun 13 09:10:18 2023 failed datadog-3.25.1 7 Upgrade "datadog-agent" failed: timed out waiting for the condition
3 Tue Jun 13 09:30:23 2023 failed datadog-3.25.1 7 Release "datadog-agent" failed: timed out waiting for the condition
I want to see if this error is still happening. Being new to Flux CD at first I thought that it might retry this failed Helm release automatically as part of the sync. This didn’t seem to be the case though based on the last date in the history above.
Maybe I could force it to retry the Helm Release using the flux reconcile command. It sounded good at the time!
emilyzall@Emilys-MBP datadog-agent % flux reconcile
The reconcile sub-commands trigger a reconciliation of sources and resources.
So…
emilyzall@Emilys-MBP datadog-agent % flux reconcile helmrelease datadog-agent -n datadog --verbose
► annotating HelmRelease datadog-agent in datadog namespace
✔ HelmRelease annotated
◎ waiting for HelmRelease reconciliation
✗ HelmRelease reconciliation failed: Helm rollback failed: release datadog-agent failed: timed out waiting for the condition
Oh, I guess it’s still failing. But wait why is it timing out within seconds, that seems mighty quick. I found that the default timeout is 5 minutes and my HelmRelease resource was configured with a 20 minute time out. So (???). I searched for something like “flux reconcile helmrelease” and found https://stackoverflow.com/questions/65677606/is-there-a-way-to-manually-retry-a-helmrelease-for-fluxcd-helmoperator.
There is a way to manually retry a Helm Release with Flux but it is not flux reconcile helmrelease. Actually you have to do flux suspend then flux resume.
emilyzall@Emilys-MBP datadog-agent % flux suspend helmrelease datadog-agent -n datadog --verbose
► suspending helmrelease datadog-agent in datadog namespace
✔ helmrelease suspended
emilyzall@Emilys-MBP datadog-agent % flux resume helmrelease datadog-agent -n datadog --verbose
► resuming helmrelease datadog-agent in datadog namespace
✔ helmrelease resumed
◎ waiting for HelmRelease reconciliation
✔ HelmRelease reconciliation completed
✔ applied revision 3.25.1
Pingback: Troubleshooting with Crossplane and FluxCD… could be better | Fault Tolerant