Description
Describe the bug
My installation uses an Istio Virtual Service as the traffic management layer for our Canary deployment.
The Canary status reports a canary traffic weight of 0 while the Canary is in the WaitingPromotion stage, but if you inspect the Virtual Service directly, its configuration still carries the weighted routes. I confirmed that traffic is still being split by polling the host and inspecting the responses, which include the currently deployed version of the application.
Below are the configurations from my test environment, with identifying information and all settings unrelated to the service redacted. I can provide extra details such as autoscaler configuration if needed, but I didn't think they were relevant.
Canary configuration
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: app
  namespace: app-ns
spec:
  analysis:
    canaryReadyThreshold: 95
    interval: 1m
    maxWeight: 5
    primaryReadyThreshold: 95
    stepWeight: 5
    stepWeightPromotion: 5
    threshold: 3
    webhooks:
    - name: check gate for rollout
      type: confirm-rollout
      url: http://flagger-loadtester.flagger.svc.cluster.local/gate/check
    - name: close gate during rollout
      type: pre-rollout
      url: http://flagger-loadtester.flagger.svc.cluster.local/gate/close
    - name: check gate for promotion
      type: confirm-promotion
      url: http://flagger-loadtester.flagger.svc.cluster.local/gate/check
    - name: rollback
      type: rollback
      url: http://flagger-loadtester.flagger.svc.cluster.local/rollback/check
    - name: close rollout gate post promotion
      type: post-rollout
      url: http://flagger-loadtester.flagger.svc.cluster.local/gate/close
    - name: close rollback gate post promotion
      type: post-rollout
      url: http://flagger-loadtester.flagger.svc.cluster.local/rollback/close
  progressDeadlineSeconds: 600
  service:
    apex:
      labels:
        project: app
        team: team
    canary:
      labels:
        project: app
        team: team
    delegation: false
    gateways:
    - istio-system/gateway-public
    hosts:
    - <internal-hostname>
    name: app
    port: 8080
    portDiscovery: false
    portName: http
    primary:
      labels:
        project: app
        team: team
    retries:
      attempts: 2
      perTryTimeout: 300ms
    targetPort: 8080
    timeout: 1s
    trafficPolicy:
      loadBalancer:
        localityLbSetting:
          enabled: true
        simple: ROUND_ROBIN
      outlierDetection:
        baseEjectionTime: 30s
        consecutive5xxErrors: 2
        interval: 5s
        maxEjectionPercent: 10
  skipAnalysis: false
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
status:
  canaryWeight: 0
  conditions:
  - lastTransitionTime: "2025-08-05T16:57:46Z"
    lastUpdateTime: "2025-08-05T17:03:46Z"
    message: Waiting for approval.
    reason: WaitingPromotion
    status: Unknown
    type: Promoted
  failedChecks: 0
  iterations: -1
  lastAppliedSpec: 9bfc9b9b
  lastPromotedSpec: 65bb6456dd
  lastTransitionTime: "2025-08-05T17:25:46Z"
  phase: WaitingPromotion
  trackedConfigs:
    configmap/app-props: 90505591e475d818
    secret/app-secret-props: 068541f56515a00b
Virtual Service config while Canary is in above state
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: app
  namespace: app-ns
  ownerReferences:
  - apiVersion: flagger.app/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Canary
    name: app
spec:
  gateways:
  - istio-system/gateway-public
  hosts:
  - <internal-hostname>
  - app
  http:
  - retries:
      attempts: 2
      perTryTimeout: 300ms
    route:
    - destination:
        host: app-primary
      weight: 95
    - destination:
        host: app-canary
      weight: 5
    timeout: 1s
And finally, the result of a kubectl get canary command:
❯ kc get canary -n app
NAME   STATUS             WEIGHT   LASTTRANSITIONTIME
app    WaitingPromotion   0        2025-08-05T17:34:46Z
To Reproduce
- Create a Canary deployment that uses an Istio Virtual Service and has a confirm-promotion webhook step
- Start a Canary rollout and wait for it to reach the WaitingPromotion step
- Observe that the Canary status shows that the Weight is 0
- Observe that the Virtual Service traffic routing still shows a 95:5 traffic split (or whatever you've configured)
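The mismatch in the last two steps can be expressed as a small consistency check. This is only a sketch: the `route` type and both helper functions are hypothetical stand-ins written for illustration, not Flagger or Istio API types, and the sample values are the ones from the configs above.

```go
package main

import "fmt"

// route is a hypothetical, simplified view of one weighted destination
// in a VirtualService HTTP route. It is not an Istio API type.
type route struct {
	Host   string
	Weight int
}

// routedWeight returns the weight assigned to host, and false if the
// host does not appear in the route list.
func routedWeight(routes []route, host string) (int, bool) {
	for _, r := range routes {
		if r.Host == host {
			return r.Weight, true
		}
	}
	return 0, false
}

// weightMismatch reports whether the weight recorded in the Canary status
// differs from the weight actually routed to the canary host.
func weightMismatch(statusWeight int, routes []route, canaryHost string) bool {
	w, ok := routedWeight(routes, canaryHost)
	return ok && w != statusWeight
}

func main() {
	// Values taken from the VirtualService and Canary status in this report.
	routes := []route{
		{Host: "app-primary", Weight: 95},
		{Host: "app-canary", Weight: 5},
	}
	fmt.Println(weightMismatch(0, routes, "app-canary"))
}
```

With the status weight of 0 and the 95:5 split shown above, the check reports a mismatch. It would stop doing so if the status reported 5.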
Expected behavior
The status should accurately report that the traffic weight is still 5% (or whatever the actual value is).
Additional context
- Flagger version: 1.37.0
- Kubernetes version: 1.29
- Service Mesh provider: Istio
- Ingress provider: Istio
Lines 161 to 167 in 27daa2c
if phase != flaggerv1.CanaryPhaseProgressing && phase != flaggerv1.CanaryPhaseWaiting {
	cdCopy.Status.CanaryWeight = 0
	cdCopy.Status.Iterations = 0
	if phase == flaggerv1.CanaryPhaseWaitingPromotion {
		cdCopy.Status.Iterations = cd.GetAnalysis().Iterations - 1
	}
}
I believe the issue is this block of code, which unconditionally sets the status canary weight to 0 whenever the phase of the canary is neither Progressing nor Waiting, and WaitingPromotion is neither of those. I believe the fix is simply to add the WaitingPromotion phase to the if statement.
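A minimal sketch of what that change might look like, written against local stand-in types rather than Flagger's own (`CanaryPhase`, `status`, and `nextStatus` are all hypothetical names for illustration). One detail to watch: in the quoted block the Iterations adjustment for WaitingPromotion is nested inside the same if, so simply adding the phase to the outer condition would make that inner branch unreachable. The sketch keeps the adjustment and preserves the weight.

```go
package main

import "fmt"

// CanaryPhase is a local stand-in for Flagger's flaggerv1.CanaryPhase.
type CanaryPhase string

const (
	PhaseProgressing      CanaryPhase = "Progressing"
	PhaseWaiting          CanaryPhase = "Waiting"
	PhaseWaitingPromotion CanaryPhase = "WaitingPromotion"
	PhaseSucceeded        CanaryPhase = "Succeeded"
)

// status is a hypothetical subset of the Canary status fields touched
// by the quoted block.
type status struct {
	CanaryWeight int
	Iterations   int
}

// nextStatus mirrors the quoted block, with the proposed change applied:
// WaitingPromotion no longer resets CanaryWeight but still receives the
// Iterations adjustment from the original code.
func nextStatus(phase CanaryPhase, cur status, analysisIterations int) status {
	next := cur
	switch phase {
	case PhaseProgressing, PhaseWaiting:
		// Weight and iterations are left untouched, as in the original.
	case PhaseWaitingPromotion:
		// Proposed: keep the weight, since traffic is still being split.
		next.Iterations = analysisIterations - 1
	default:
		next.CanaryWeight = 0
		next.Iterations = 0
	}
	return next
}

func main() {
	cur := status{CanaryWeight: 5, Iterations: 0}
	fmt.Printf("%+v\n", nextStatus(PhaseWaitingPromotion, cur, 0))
	fmt.Printf("%+v\n", nextStatus(PhaseSucceeded, cur, 0))
}
```

Under these assumptions, a canary waiting for promotion would keep reporting weight 5 with iterations -1, matching the Virtual Service, while terminal phases would still reset both fields to 0.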