Migrating to ambient mesh from Istio in sidecar mode
Overview
The Solo ambient mesh migration tool provides a prescriptive path for migrating from Istio’s sidecar mode to ambient mode. It is designed to provide a zero-downtime upgrade path, but can also translate Kubernetes manifests for users of upstream Istio.
While the tool will walk you through each step, having an understanding of ambient mode and its differences from sidecars will be helpful. We recommend going through the documentation to get an overview of the architecture and differences. In particular, understanding how authorization policy works is critical if you use authorization policies.
The tool does not manipulate any configuration in your cluster. It only reads configuration and suggests actions that you must take to adjust configuration. When following the procedures recommended by the tool and using a Solo.io build of Istio, a cluster can be migrated from sidecar mode to ambient without downtime. However, the tool cannot account for all possible cluster setups; operators are expected to review the suggestions and test them before acting on them.
Before you begin
Before you begin, install the gloo CLI client, which contains the migration tool.
$ curl -sL https://storage.googleapis.com/gloo-cli/install.sh | GLOO_VERSION="v0.1.0-beta3" sh
$ export PATH=$HOME/.gloo/bin:$PATH
General usage
The tool will inspect the state of the cluster and make recommendations about next steps. Each phase of the migration runs a series of checks, and validates that the checks are completed before proceeding. If checks in a phase fail, the tool gives recommendations for how to resolve them. Details about each phase are documented below.
The basic tool operation involves running gloo ambient migrate. This will analyze the state of the cluster and recommend next steps.
There are a few useful options to consider:
--from-files: Read configuration from YAML files instead of a live cluster, for example to test the migration first. When providing files, be sure to include the following APIs:
- All Istio APIs
- All Gateway APIs
- Kubernetes APIs:
  - Pod
  - DaemonSet
  - Deployment
  - ValidatingWebhookConfiguration
  - Namespace
  - Service
  - ConfigMap
  - Secret (optional; the tool works without it)
--ignore-failures: Run the tool through each phase, regardless of whether checks fail in any phase. This can be helpful during testing if you are aware of failures that you want to ignore.
--output-dir: Directory to write generated files to, such as the recommended changes to policies and waypoint configuration.
Migration phases
Prerequisites
The prerequisites phase inspects the cluster to ensure that it is ready for ambient mode.
This includes ensuring the Kubernetes environment is compatible, the current Istio version is compatible, and that features used by the existing Istio installation are compatible.
All existing istiod installations in the cluster must run Istio 1.25 or later. Note that zero-downtime migration from sidecar mode to ambient mode requires using the free Solo.io builds of Istio.
An example cluster that is ready for ambient mode:
$ gloo migrate
• Starting phase pre-reqs...
✅ Phase pre-reqs succeeded!
✅ Cluster CNI compatibility passed
✅ Istio version compatibility passed
✅ Multicluster usage compatibility passed
✅ Virtual Machine usage compatibility passed
✅ SPIRE usage compatibility passed
If any checks fail, the tool gives recommendations on how to resolve them.
If an upgrade is needed, you may consider enabling ambient in the same step. However, if you prefer, you may upgrade and then enable ambient later. The next phase will ensure ambient mode is enabled.
Cluster setup
The cluster setup phase ensures that your installation of Istio is successfully configured to support ambient mesh.
First, update your Istio installation to enable ambient mode. Follow the setup guide for more information.
This ensures Istio is installed with ambient mode, which lets you start enrolling workloads into ambient mode and lets the existing data planes (sidecars and gateways) communicate with ambient workloads.
An example cluster that has not been fully configured for ambient mode:
$ gloo migrate
• Starting phase cluster-setup...
❌ Phase cluster-setup failed!
✅ Ambient mode is enabled passed
✅ DaemonSets are deployed passed
✅ Sidecars are updated passed
❌ Required CRDs are installed failed: Gateway API CRDs not found.
Waypoint deployment
Unlike sidecar mode, ambient mesh allows you to choose which functionality you need. In particular, if a service requires rich HTTP-based functionality, a waypoint proxy can be deployed.
In this phase, configuration in the cluster will be analyzed to determine which services require waypoints to be deployed to retain their existing functionality. For each recommendation, the configuration to deploy these waypoints will be provided.
The suggestions from the tool are a best-effort analysis based on the policies in the cluster. Review the recommended waypoint configurations carefully before deploying waypoints for best results. For example, the tool might detect customized load balancing for a service and recommend a waypoint, as waypoints are required to implement custom load balancing policies. However, you may decide that the value of this policy doesn’t justify the waypoint, and choose to skip the recommendation. If you aren’t sure, we recommend deploying the waypoint.
In some cases, the tool might not recommend a waypoint where you actually want apps to use one. Even without any policies configured, sidecars provide a variety of functionalities such as mTLS, HTTP observability, and HTTP request-level load balancing. While ztunnel provides mTLS and HTTP observability, it does not provide request-level load balancing. Applications requiring this can benefit from deploying a waypoint, even when the tool does not recommend one.
An example cluster where a single waypoint is recommended:
$ gloo migrate
• Starting phase deploy-waypoints...
⚠️ Phase deploy-waypoints has recommendations!
🔮 Namespace "application" may require a waypoint for the following services:
* Service "application/hello-world" depends on VirtualService "application/hello-world"
ℹ️ Generated waypoints are written to /tmp/istio-migrate/recommended-waypoints.yaml
The recommended-waypoints.yaml file contains the recommended waypoint configuration:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: waypoint
  namespace: application
spec:
  gatewayClassName: istio-waypoint
  listeners:
  - name: mesh
    port: 15008
    protocol: HBONE
For more information about configuring waypoints, see Configure waypoint proxies.
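To illustrate why the tool flagged the hello-world service in the example output above, the following is a hypothetical VirtualService of the kind that depends on a waypoint. The timeout and retry values are assumptions for illustration and do not come from the example cluster. In ambient mode, HTTP routing behavior like this is applied by a waypoint rather than by ztunnel.

```yaml
# Hypothetical sketch: an HTTP-level policy that needs a waypoint in ambient mode.
# Only the names "hello-world" and "application" come from the example output;
# the timeout and retry settings are assumed for illustration.
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: hello-world
  namespace: application
spec:
  hosts:
  - hello-world
  http:
  - route:
    - destination:
        host: hello-world
    timeout: 5s            # assumed value
    retries:
      attempts: 3          # assumed value
      perTryTimeout: 2s    # assumed value
```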
Policy migration
At this point, all of your waypoints are deployed, but they are not yet fully configured or in use. The next step is to configure them. As in the previous step, traffic is still not routed through the waypoints; that happens in a later phase, once the waypoints are fully configured.
For policies that modify services (VirtualService and DestinationRule), no changes are required. The tool copies the configuration and generates policies to be applied to waypoints. However, policies that modify specific workloads must be modified to apply to the waypoint.
The tool will automatically provide recommendations on how to apply the following resources to the waypoint:
- Telemetry
- WasmPlugin
- RequestAuthentication
- AuthorizationPolicy
- EnvoyFilter
The tool will also detect and warn about policies that are not supported in ambient mode.
An example in which a few policies are migrated to the waypoint and the tool warns about an unsupported configuration:
$ gloo migrate
• Starting phase migrate-policies...
⚠️ Phase migrate-policies has recommendations!
⚠️ Sidecar/legacy/sidecar will be removed in a later step: Found unsupported configurations:
* Configuration scoping is detected. Typically, this is used for scalability purposes, which is not necessary with ambient.
🔮 Add AuthorizationPolicy/app1/echo-from-waypoint: Service/app1/echo must allow traffic from its waypoint
🔮 Apply AuthorizationPolicy/app1/echo-legacy-policies: Existing configuration is copied from policy app1/legacy-policies to be enforced at the waypoint.
🔮 Apply AuthorizationPolicy/app1/echo-policies: Existing configuration is copied from policy app1/policies to be enforced at the waypoint.
🔮 Apply AuthorizationPolicy/app1/waypoint-allow-nothing: Existing configuration is copied from namespace policy app1/allow-nothing to be enforced at the waypoint.
ℹ️ Recommended policies are written to /tmp/istio-migrate/recommended-policies.yaml
The recommended-policies.yaml file contains the recommended policies that are generated. Note that at this phase of the process, you apply these waypoint-based policies in addition to the existing sidecar-based policies. Later phases help clean up the original policies once the migration is complete.
Authorization policy
AuthorizationPolicy in particular requires careful handling due to the changes in how policies are enforced in ambient mode. In ambient mode, Layer 4 and Layer 7 policies can be enforced against services, applied at the waypoint. Additionally, Layer 4 policies can be enforced against workloads.
This requires the following changes:
- If a workload policy contains HTTP attributes, a waypoint is required, as recommended in the previous phase. The policy will then need to be moved to configure the waypoint, rather than the workload.
- If a waypoint is deployed, the workload must allow traffic from the waypoint.
The tool will help move policies to the waypoints as required, and additionally generate policies to allow traffic from the waypoints.
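As a rough sketch of the second change, a workload policy that admits traffic from a waypoint could look like the following. The policy name matches the echo-from-waypoint recommendation in the example output, but the app: echo label and the waypoint principal are assumptions for illustration. The file that the tool generates is the authoritative version for your cluster.

```yaml
# Hypothetical sketch: let the waypoint in app1 reach the echo workload.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: echo-from-waypoint
  namespace: app1
spec:
  selector:
    matchLabels:
      app: echo            # assumed workload label
  action: ALLOW
  rules:
  - from:
    - source:
        principals:
        # Assumed identity: a waypoint named "waypoint" in namespace app1
        # running under a service account of the same name.
        - cluster.local/ns/app1/sa/waypoint
```

Because this is an additional ALLOW policy, it does not remove access granted by existing policies. It only adds the waypoint as a permitted caller.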
Recommendation details
Note that, at this point, all policy recommendations are in addition to the existing policies. This ensures that during the migration process, policies continue to apply to all traffic requests, regardless of whether traffic is routed through a waypoint. Later phases will transition traffic to start flowing through the waypoint, and then optionally recommend cleaning up the unused policies once the migration is complete.
For example, consider a case where an authorization policy allows a single client to access an application:
Figure: Authorization policy allowing a single client to access an application
When a waypoint is deployed, the tool recommends additional policies to support the same behavior for traffic from the waypoint, in addition to traffic that does not go through the waypoint. This ensures a zero-downtime transition.
Figure: Authorization policy allowing access through a waypoint or directly
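In manifest form, this scenario might look roughly like the sketch below. The first policy stands in for the original sidecar-era policy, and the second is a copy attached to the Service so that the waypoint enforces it. The client service account and the app: echo label are hypothetical, and the policy names are borrowed from the example output only for continuity.

```yaml
# Hypothetical original policy: only the "client" identity may call echo.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: policies
  namespace: app1
spec:
  selector:
    matchLabels:
      app: echo            # assumed workload label
  action: ALLOW
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/app1/sa/client   # assumed client identity
---
# Hypothetical waypoint copy: same rule, attached to the Service so that
# the waypoint enforces it for traffic routed through the waypoint.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: echo-policies
  namespace: app1
spec:
  targetRefs:
  - group: ""
    kind: Service
    name: echo
  action: ALLOW
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/app1/sa/client   # assumed client identity
```

Once traffic flows through the waypoint, the workload sees the waypoint's identity rather than the client's, which is why the workload also needs a policy that allows the waypoint. Keeping the original policy in place covers clients that still reach the workload directly during the migration.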
Waypoint enablement
Now that waypoints have been deployed and configured, you can start to use them. The tool will detect which waypoints you need to enable and give steps to enable them.
Once enabled, traffic will start to flow through the waypoint. This can be done on a service-by-service basis to allow you to monitor the health of the system as you progress. There is no need to quickly move everything at once. If a service is behaving unexpectedly, the waypoint enablement can be reverted. See the waypoint configuration documentation for more details.
An example cluster where one waypoint is ready for enablement:
$ gloo migrate
• Starting phase use-waypoints...
⚠️ Phase use-waypoints has recommendations!
⚠️ Warning: detected waypoint app1/waypoint but it is not used by anything
🔮 Detected namespace app1 requires a waypoint, but not configured to use one. Configure it with: kubectl label namespace app1 istio.io/use-waypoint=waypoint
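To enable a waypoint service by service instead of for a whole namespace, the same istio.io/use-waypoint label can be set on an individual Service. The following is a minimal sketch: the Service name, selector, and ports are assumptions for illustration, and only the label key and the waypoint name come from the output above.

```yaml
# Hypothetical sketch: route traffic for one Service through the waypoint.
apiVersion: v1
kind: Service
metadata:
  name: echo               # assumed Service name
  namespace: app1
  labels:
    istio.io/use-waypoint: waypoint
spec:
  selector:
    app: echo              # assumed workload label
  ports:
  - name: http
    port: 80               # assumed port
    targetPort: 8080       # assumed port
```

Removing the label from the Service is one way to revert if the service behaves unexpectedly.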
Additional information
It is recommended that all waypoints are enabled before you start to remove sidecars from workloads. However, this is not strictly required, which can be useful when some workloads take a while to complete migration.
In order for all policies to be correctly applied, the request flow needs either:
- A sidecar on the client and server side
- A waypoint proxy enabled
This means that it is not safe to simply remove sidecars from a namespace once that namespace has a waypoint enabled, because that namespace may still call other namespaces that do not yet have an enabled waypoint.
The following diagram illustrates how a client sidecar successfully communicates with a migrated namespace and with a namespace that is not yet migrated. If you were to remove the client sidecar before the unmigrated namespace is migrated, some policies would not be correctly applied to traffic requests.
Figure: Traffic flow during the migration
Policy simplification
At this point, all traffic will be flowing through waypoints. This means that policies can be simplified by removing any policies that are specific to sidecars, as they will no longer be relevant.
In the policy migration phase, the tool recommended waypoint copies of existing policies. The original policies can now be removed.
$ gloo migrate
• Starting phase policy-simplification...
⚠️ Phase policy-simplification has recommendations!
🔮 AuthorizationPolicy was migrated and can be deleted: kubectl delete authorizationpolicies.security.istio.io -n app1 allow-nothing
🔮 AuthorizationPolicy was migrated and can be deleted: kubectl delete authorizationpolicies.security.istio.io -n app1 legacy-policies
🔮 AuthorizationPolicy was migrated and can be deleted: kubectl delete authorizationpolicies.security.istio.io -n app1 policies
Sidecar removal
Now that all traffic has been configured to use waypoints, and the waypoints have been configured to apply equivalent configurations, it is safe to remove sidecars from workloads.
The sidecar removal phase will detect workloads still using sidecars and recommend moving to ambient mode:
$ gloo migrate
• Starting phase remove-sidecars...
⚠️ Phase remove-sidecars has recommendations!
🔮 namespace app1 can disable its sidecars: kubectl label ns app1 istio-injection- istio.io/dataplane-mode=ambient
⚠️ sidecar found for Pod app1/echo-7fb78cb7c5-plsjl
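If you manage namespaces declaratively, the label change recommended above could be expressed as a manifest along the following lines. This is a sketch that assumes the namespace previously carried an istio-injection label and was already labeled to use the waypoint in the earlier phase.

```yaml
# Hypothetical sketch: namespace app1 after moving to ambient mode.
apiVersion: v1
kind: Namespace
metadata:
  name: app1
  labels:
    # The istio-injection label is intentionally absent so that
    # newly created pods no longer receive a sidecar.
    istio.io/dataplane-mode: ambient
    istio.io/use-waypoint: waypoint   # assumed to be set in the waypoint enablement phase
```

Sidecars are injected when a pod is created, so existing pods typically keep their sidecar until they are recreated. This is consistent with the tool continuing to report pods such as the one in the warning above.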