Using the untaint controller
If Kubernetes can start pods on a node before the istio-cni agent has configured the node's networking, those pods will not have traffic redirection configured correctly. This can lead to a short period where traffic is not controlled by Istio and can bypass any configured policy.
To avoid this race condition, you can take advantage of node taints: new nodes carry a taint that prevents pods from being scheduled onto them, and the taint is removed by Istio's untaint controller once the istio-cni agent on the node is ready.
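For example, a freshly created node that the controller has not yet processed will show the taint (the node name here is illustrative):
$ kubectl describe node ambient-worker2 | grep Taints
Taints:             cni.istio.io/not-ready:NoSchedule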
Configure Istio
Install Istio with the following values to enable the untaint controller:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    pilot:
      taint:
        enabled: true
      env:
        PILOT_ENABLE_NODE_UNTAINT_CONTROLLERS: "true"
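If you save this manifest as untaint.yaml (the filename is illustrative), you can apply it with istioctl:
$ istioctl install -f untaint.yaml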
Certain environments may require istio-cni to be installed in a different namespace from istiod. You can specify the namespace to watch by setting the pilot.taint.namespace value:
spec:
  values:
    pilot:
      taint:
        enabled: true
        namespace: kube-system
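One way to verify that istiod was deployed with the controller enabled is to read the environment variable back from the deployment (a sketch; adjust the namespace if yours differs):
$ kubectl -n istio-system get deployment istiod \
    -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="PILOT_ENABLE_NODE_UNTAINT_CONTROLLERS")].value}'
true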
Creating your nodes
Configure your node deployment (node pool, auto-scaling group, CI template, etc.) to add the cni.istio.io/not-ready taint to nodes when they are created. This is sometimes called a startup taint. For example, when using Karpenter:
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        billing-team: my-team
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      startupTaints:
        - key: cni.istio.io/not-ready
          effect: NoSchedule
In Google Kubernetes Engine, you can specify node taints with the --node-taints flag on cluster or node pool creation.
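For example, a hypothetical node pool creation (the pool and cluster names are illustrative, and this assumes your gcloud version accepts a taint with an empty value):
$ gcloud container node-pools create ambient-pool \
    --cluster=my-cluster \
    --node-taints=cni.istio.io/not-ready=:NoSchedule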
This taint means that no pods can be scheduled onto the node unless they tolerate the cni.istio.io/not-ready taint. (System add-ons, such as the istio-cni agent itself, are usually configured to tolerate all taints.) When the agent starts, the untaint controller will remove the taint from the node, and pods can then be scheduled.
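Tolerating every taint is typically achieved with a catch-all toleration in the add-on's pod spec, along these lines (a generic Kubernetes sketch, not the literal istio-cni chart values):
tolerations:
  # An empty key with operator Exists matches all taints.
  - operator: Exists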
Debugging the untaint controller
The untaint controller runs as part of istiod. You can see the status of the controller by connecting to a debug page on the istiod instance:
$ kubectl port-forward deployment/istiod -n istio-system 8080:8080
Navigate to http://localhost:8080/debug/krtz?pretty.
You can see that the untaint controller is running and inspect its state; specifically, node-untaint/nodes shows the status of nodes, and node-untaint/ready-cni-nodes shows the CNI agents which are ready.
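Equivalently, you can query the same debug page from a terminal while the port-forward above is running (the grep simply filters for the controller's collections):
$ curl -s 'http://localhost:8080/debug/krtz?pretty' | grep node-untaint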
At the default info log level, you can see the untaint controller start:
$ kubectl logs -f deployment/istiod -n istio-system
2025-07-18T01:23:00.952072Z info krt node-untaint/nodes synced owner=node-untaint/nodes
2025-07-18T01:23:00.952085Z info krt node-untaint/pods synced owner=node-untaint/pods
2025-07-18T01:23:00.954215Z info krt node-untaint/cni-pods synced owner=node-untaint/cni-pods
2025-07-18T01:23:00.956745Z info controllers starting controller=untaint nodes
2025-07-18T01:23:00.956785Z info krt node-untaint/ready-cni-nodes synced owner=node-untaint/ready-cni-nodes
You can set the untaint controller's log level to debug to see events as nodes are created:
$ istioctl admin log --level untaint:debug
If you add the taint to a node, the untaint controller will notice and remove it:
$ kubectl taint nodes ambient-worker2 cni.istio.io/not-ready:NoSchedule
2025-07-18T01:35:11.698525Z debug untaint adding node to queue event: ambient-worker2
2025-07-18T01:35:11.698838Z debug untaint reconciling node ambient-worker2
2025-07-18T01:35:11.698855Z debug untaint removing readiness taint from node ambient-worker2
2025-07-18T01:35:11.705994Z debug untaint removed readiness taint from node ambient-worker2
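You can confirm that the taint was removed (the node name matches the example above):
$ kubectl describe node ambient-worker2 | grep Taints
Taints:             <none>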
Restore the log level to default:
$ istioctl admin log --level untaint:info