feat: add hbone_idle_timeout field to MeshConfig API #3611
Add configurable idle timeout for HBONE connections between proxies and ztunnel to address stale connection reuse when pod IPs are recycled.
This is particularly critical in environments that reuse IP addresses aggressively, such as AWS EKS with VPC CNI (default 30s cooldown period). Without an explicit idle timeout, Envoy defaults to 1 hour, so proxies can keep reusing stale pooled connections after the target pod's IP has been recycled, which results in 503 errors and upstream reset failures.
The new hbone_idle_timeout field in MeshConfig allows operators to configure the idle timeout appropriately for their environment. For AWS VPC CNI, a value of 15 seconds is recommended.
See: istio/istio#58389