Request timeouts

Request timeouts are supported at the edge of a cluster using a gateway, or when a workload is enrolled in the waypoint layer.

When an upstream microservice responds slowly, a client can wait indefinitely, become unavailable itself, and so propagate the failure throughout the network. Request timeouts mitigate this problem: the client severs the connection after a set period.

In this guide you will learn how to set up request timeouts in your ambient mesh.


Set up a cluster

You should have a running Kubernetes cluster with Istio installed in ambient mode. Ensure your default namespace is added to the ambient mesh:

$ kubectl label ns default istio.io/dataplane-mode=ambient

Deploy a waypoint

Request timeouts are a Layer 7 feature, applied to HTTP requests, and therefore require the use of waypoints.

If you don’t already have a waypoint installed for the default namespace, install one:

$ istioctl waypoint apply -n default --enroll-namespace --wait

For more information on using waypoints, see Configuring waypoint proxies.

Configure request timeouts in ambient mesh

Deploy sample services

To test request timeouts, you will deploy a service, httpbin, and a client, curl.

$ kubectl apply -f
$ kubectl apply -f

Test latency

The httpbin application has a /delay/{delay} endpoint, which delays its response by the requested number of seconds.

To simulate an endpoint with a 2-second delay:

$ kubectl exec deploy/curl -- curl httpbin:8000/delay/2

In the output, confirm that “Time Total” and “Time Spent” show a value of 2 seconds (0:00:02):

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   430  100   430    0     0    213      0  0:00:02  0:00:02 --:--:--   213

Configure a request timeout

Configure calls to httpbin with a 500ms timeout:

$ kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbin
spec:
  parentRefs:
  - group: ""
    kind: Service
    name: httpbin
    port: 8000
  rules:
  - backendRefs:
    - name: httpbin
      port: 8000
    timeouts:
      request: 500ms
EOF

The Gateway API also defines an optional backendRequest timeout duration. Consult the documentation for the specific meaning of each timeout field.
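
As a hedged illustration of the two timeout fields, the sketch below writes a variation of the route above to a local file. The file name httpbin-timeouts.yaml and the 250ms value are assumptions chosen for the example, not values from this guide. In the Gateway API, request bounds the whole request, while backendRequest bounds a single attempt to reach a backend and must not exceed request. How backendRequest is enforced depends on the implementation.

```shell
# Write an HTTPRoute that sets both timeout fields to a local file.
# The file name and the 250ms value are illustrative assumptions.
cat > httpbin-timeouts.yaml <<'EOF'
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbin
spec:
  parentRefs:
  - group: ""
    kind: Service
    name: httpbin
    port: 8000
  rules:
  - backendRefs:
    - name: httpbin
      port: 8000
    timeouts:
      request: 500ms         # upper bound for the whole request
      backendRequest: 250ms  # upper bound for one attempt to a backend
EOF

# To experiment, apply the file in place of the route above:
#   kubectl apply -f httpbin-timeouts.yaml
```

The rest of this guide assumes the route with only the 500ms request timeout, so treat this variation as an optional experiment.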

Verify the timeout

Make another call to httpbin:

$ kubectl exec deploy/curl -- curl -v httpbin:8000/delay/2

Observe the 504 “Gateway Timeout” response in the output:

* IPv6: (none)
* IPv4:
*   Trying
* Connected to httpbin ( port 8000
* using HTTP/1.x
> GET /delay/2 HTTP/1.1
> Host: httpbin:8000
> User-Agent: curl/8.11.0
> Accept: */*
* Request completely sent off
< HTTP/1.1 504 Gateway Timeout
< content-length: 24
< content-type: text/plain
< date: Wed, 04 Dec 2024 17:47:01 GMT
< server: istio-envoy
< x-envoy-decorator-operation: httpbin.default.svc.cluster.local:8000/*
* Connection #0 to host httpbin left intact
upstream request timeout

You can confirm the timeout using the time command:

$ time kubectl exec deploy/curl -- curl -v httpbin:8000/delay/2

The elapsed time will be slightly above the configured 500ms timeout:

Executed in  618.05 millis...
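
For a more compact check than reading verbose output, curl's --write-out option can print just the status code and the total request time. The helper name measure_request is hypothetical, and the sketch assumes the curl and httpbin deployments from this guide are running.

```shell
# Print "<status code> <total seconds>" for a request to a path on httpbin.
# Assumes the curl client and httpbin service deployed earlier in this guide.
measure_request() {
  path="$1"
  kubectl exec deploy/curl -- \
    curl -s -o /dev/null -w '%{http_code} %{time_total}\n' "httpbin:8000${path}"
}

# Example:
#   measure_request /delay/2
```

With the 500ms timeout in place, a call to /delay/2 should report status 504 and a total time a little over half a second.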

Clean up

Delete the HTTPRoute:

$ kubectl delete httproute httpbin

Deprovision the sample applications:

$ kubectl delete -f
$ kubectl delete -f

Tips on configuring timeouts

Is it important to first remove existing timeouts from my applications, or can I just overlay mesh timeouts on top of them as they are?

It is simpler to remove the logic from your applications, and consistently maintain all your resilience configuration in the mesh layer. That said, leaving them there doesn't necessarily do harm. It's important to reason about which timeout "wins." For example, an application-level timeout of 100ms will preempt a 200ms timeout configured through Istio.
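
The rule of thumb is that whichever timeout fires first decides the outcome. A minimal sketch of that reasoning, using a hypothetical helper effective_timeout_ms that takes both durations in milliseconds:

```shell
# Print the timeout that fires first, given two durations in milliseconds.
# Usage: effective_timeout_ms <application_timeout_ms> <mesh_timeout_ms>
effective_timeout_ms() {
  if [ "$1" -le "$2" ]; then
    echo "$1"
  else
    echo "$2"
  fi
}

effective_timeout_ms 100 200   # prints 100
```

If the application-level value is the larger of the two, the mesh timeout takes effect first and the application timeout never triggers.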

What should I set my timeout to?

In Understanding Distributed Systems, author Roberto Vitillo suggests setting timeouts to the 99.9th percentile latency of a service. That is, if the client has waited longer than the time within which 99.9% of requests normally complete, sever the connection.

This is based on an acceptable false timeout rate of 0.1%: you accept that 1 in every 1,000 requests will time out erroneously.

You can use the observability features of ambient mesh to determine this latency number.
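
If you have exported raw latency samples, the percentile rule above can be approximated with standard shell tools. This is a rough sketch using the nearest-rank method. The helper name percentile and the file latencies_ms.txt (one latency in milliseconds per line) are assumptions for illustration, and a metrics backend may compute percentiles differently.

```shell
# Nearest-rank percentile: print the smallest sample such that at least
# p percent of all samples are less than or equal to it.
# Usage: percentile <p> <file with one latency in ms per line>
percentile() {
  sort -n "$2" | awk -v p="$1" '
    { v[NR] = $1 }                      # samples arrive in ascending order
    END {
      if (NR == 0) exit 1               # no samples, nothing to report
      rank = int(NR * p / 100)
      if (rank < NR * p / 100) rank++   # round up to the next whole rank
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# Example (hypothetical file name):
#   percentile 99.9 latencies_ms.txt
```

The printed value is a starting point for the request timeout. Revisit it as the latency profile of the service changes.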