Load balancing

When ambient mesh is enabled, traffic will be load balanced between backends by the ztunnel proxy. More complex load balancing can be configured using a waypoint proxy.

Load balancing in the secure overlay

The ztunnel proxy automatically performs client-side load balancing when the destination is a service with multiple endpoints. No additional configuration is needed. The algorithm is a fixed L4 round robin that distributes traffic based on L4 connection state; it is not user configurable.

If the destination is a service with multiple instances or pods and no waypoint is associated with it, the source ztunnel load balances at L4 directly across those backends and sends traffic through the remote ztunnel proxies associated with them. If the destination service is configured to use one or more waypoint proxies, the source ztunnel instead distributes traffic across those waypoint proxies and sends it through the remote ztunnel proxies on the nodes hosting the waypoint instances.

Traffic distribution

By default, ztunnel gives all endpoints equal weight when selecting a backend. It can instead be configured to prefer closer endpoints, taking region, zone, and cluster into consideration.

To prefer closer endpoints for a particular service, set spec.trafficDistribution=PreferClose on the Service. For ServiceEntry resources, or on older Kubernetes clusters that lack the relatively new trafficDistribution field, set the networking.istio.io/traffic-distribution: PreferClose annotation instead.
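
As a minimal sketch of both options, the manifests below set the field on a Service and the annotation on a ServiceEntry. The resource names, namespace, selector, host, and ports (reviews, external-api, api.example.com, and so on) are hypothetical placeholders chosen for illustration; only the trafficDistribution field and the annotation come from the text above.

```yaml
# Hypothetical Service; the relevant line is spec.trafficDistribution.
apiVersion: v1
kind: Service
metadata:
  name: reviews            # placeholder name
  namespace: default       # placeholder namespace
spec:
  selector:
    app: reviews           # placeholder selector
  ports:
  - name: http
    port: 9080
  trafficDistribution: PreferClose
---
# Hypothetical ServiceEntry; the relevant line is the annotation.
apiVersion: networking.istio.io/v1
kind: ServiceEntry
metadata:
  name: external-api       # placeholder name
  namespace: default
  annotations:
    networking.istio.io/traffic-distribution: PreferClose
spec:
  hosts:
  - api.example.com        # placeholder host
  ports:
  - number: 443
    name: tls
    protocol: TLS
  resolution: DNS
```

The same annotation key can be placed on a Service's metadata when the cluster's Kubernetes version does not offer the trafficDistribution field.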

Traffic to waypoint proxies uses PreferClose by default.

Controlling load balancing with waypoint proxies

Advanced load balancing features are supported when a workload is enrolled in the waypoint layer.

By default, a waypoint will distribute traffic across each service’s load balancing pool using a least requests model, where each request is routed to the host with fewer active requests from a random selection of two hosts from the pool. In this way the most heavily loaded host will not receive requests until it is no more loaded than any other host.

Waypoints also support more advanced models, which you can specify in destination rules for requests to a particular service or service subset:

Random: Requests are forwarded at random to instances in the pool.
Weighted: Requests are forwarded to instances in the pool according to a specific percentage.
Round robin: Requests are forwarded to each instance in sequence.
Consistent hash: Provides soft session affinity based on HTTP headers, cookies, or other properties.
Ring hash: Implements consistent hashing to upstream hosts using the Ketama algorithm.
Maglev: Implements consistent hashing to upstream hosts as described in the Maglev paper.

See the Istio load balancing documentation and the Envoy load balancing documentation for more information about these options and how to configure them.
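
For illustration, the following sketch shows how two of these models might be selected in destination rules. It assumes the target services already use a waypoint; the rule names, hosts, and the x-user-id header are hypothetical placeholders, not values from this documentation.

```yaml
# Hypothetical rule: round robin for all traffic to the reviews service.
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews-lb                         # placeholder name
  namespace: default
spec:
  host: reviews.default.svc.cluster.local  # placeholder host
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN                  # other simple values include RANDOM and LEAST_REQUEST
---
# Hypothetical rule: soft session affinity keyed on a request header.
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: ratings-affinity                   # placeholder name
  namespace: default
spec:
  host: ratings.default.svc.cluster.local  # placeholder host
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpHeaderName: x-user-id          # placeholder header used as the hash key
```

With the second rule, requests carrying the same header value tend to reach the same backend as long as the set of endpoints does not change, which is why the affinity is described as soft.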

To favor endpoints based on topological location, Locality Load Balancing is also available.
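
A minimal sketch of a locality-aware destination rule is shown below, again assuming the service uses a waypoint. The rule name, host, region names, and outlier detection thresholds are hypothetical placeholders. Outlier detection is included because Istio's locality failover relies on it to decide when endpoints in a locality are unhealthy.

```yaml
# Hypothetical rule: prefer local endpoints and fail over between two regions.
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews-locality                   # placeholder name
  namespace: default
spec:
  host: reviews.default.svc.cluster.local  # placeholder host
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:
        - from: us-east                    # placeholder region
          to: us-west                      # placeholder region
    outlierDetection:                      # illustrative thresholds only
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 1m
```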
