Description
Problem
The gotk_resource_info metric has the label ready="Unknown" while a DaemonSet that takes ~1h to roll out is being deployed.
This triggers a Prometheus alert for a resource being in a non-ready state for a prolonged period, even though the rollout is progressing successfully. See:
status:
  conditions:
  - lastTransitionTime: "2025-05-23T14:30:17Z"
    message: Running health checks for <redacted> with a timeout of 1h0m0s
    observedGeneration: 20
    reason: Progressing
    status: "True"
    type: Reconciling
  - lastTransitionTime: "2025-05-23T14:30:16Z"
    message: Reconciliation in progress
    observedGeneration: 20
    reason: Progressing
    status: Unknown
    type: Ready
  - lastTransitionTime: "2025-05-23T14:30:17Z"
    message: Running health checks for revision <redacted> with a timeout of 1h0m0s
    observedGeneration: 20
    reason: Progressing
    status: Unknown
    type: Healthy
Metric:
gotk_resource_info{
  ... <redacted> ...
  customresource_group="kustomize.toolkit.fluxcd.io",
  customresource_kind="Kustomization",
  customresource_version="v1",
  ready="Unknown",
  suspended="false",
  source_name="flux-system"
  ... <redacted> ...
}
Expected behaviour
The metric has a label such as ready="Progressing", so that alerts can be configured not to fire on resources that are still progressing.
Configuration
- Kustomization:
    apiVersion: kustomize.toolkit.fluxcd.io/v1
    kind: Kustomization
    metadata:
      name: <redacted>
      namespace: <redacted>
    spec:
      interval: 10m
      serviceAccountName: kustomize-controller
      sourceRef:
        kind: GitRepository
        name: flux-system
      path: <redacted>
      prune: true
      wait: true
      suspend: false
      timeout: 60m
      dependsOn: <redacted>
- Alertmanager alert:
    - alert: FluxCDResourceNotReady
      expr: gotk_resource_info{ready!="True"} > 0
      for: 15m
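To illustrate the request: if the metric exposed ready="Progressing" as proposed under Expected behaviour, the alert above could skip resources that are still rolling out. This is a minimal sketch under that assumption only. The "Progressing" label value is hypothetical and is not emitted today, so the rule would currently behave the same as the original.

```yaml
# Hypothetical: assumes gotk_resource_info reports ready="Progressing"
# while a reconciliation is still running. That value does not exist today.
- alert: FluxCDResourceNotReady
  # Fire only for resources that are neither ready nor still progressing.
  # PromQL regex matchers are fully anchored, so this excludes exactly
  # the two values "True" and "Progressing".
  expr: gotk_resource_info{ready!~"True|Progressing"} > 0
  for: 15m
```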