feat(spanner): add shared endpoint cooldowns for location-aware rerouting#12845
feat(spanner): add shared endpoint cooldowns for location-aware rerouting#12845rahul2393 wants to merge 6 commits intogoogleapis:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces an endpoint cooldown mechanism to handle RESOURCE_EXHAUSTED errors and refactors the KeyRangeCache to use immutable snapshots, replacing per-group locking to improve read performance. The new EndpointOverloadCooldownTracker manages short-lived cooldowns with exponential backoff and jitter, while KeyAwareChannel is updated to exclude endpoints on both RESOURCE_EXHAUSTED and UNAVAILABLE status codes. Feedback is provided to optimize the GroupSnapshot constructor by removing a redundant list copy.
| private GroupSnapshot(ByteString generation, int leaderIndex, List<TabletSnapshot> tablets) { | ||
| this.generation = generation; | ||
| this.leaderIndex = leaderIndex; | ||
| this.tablets = Collections.unmodifiableList(new ArrayList<>(tablets)); | ||
| } |
There was a problem hiding this comment.
The GroupSnapshot constructor performs a redundant copy of the tablets list. Since the only caller (CachedGroup.update) already creates a new ArrayList, we can wrap it directly in an unmodifiable list to avoid unnecessary allocations.
| private GroupSnapshot(ByteString generation, int leaderIndex, List<TabletSnapshot> tablets) { | |
| this.generation = generation; | |
| this.leaderIndex = leaderIndex; | |
| this.tablets = Collections.unmodifiableList(new ArrayList<>(tablets)); | |
| } | |
| private GroupSnapshot(ByteString generation, int leaderIndex, List<TabletSnapshot> tablets) { | |
| this.generation = generation; | |
| this.leaderIndex = leaderIndex; | |
| this.tablets = Collections.unmodifiableList(tablets); | |
| } |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements latency-aware routing for Spanner endpoints by introducing a score-based replica selection mechanism using time-decayed EWMA. Key additions include registries for tracking endpoint latency and inflight requests, a cooldown tracker for overloaded endpoints, and updates to the KeyRangeCache to support score-aware selection. Feedback identifies several high-priority issues in the new static registries, including a memory leak in the latency tracker map due to accumulating operation identifiers, potential key collisions between different client instances sharing a JVM, and a race condition when updating inflight request counts. There is also a recommendation to reduce the maximum size of the request ID cache to prevent excessive memory consumption.
| private static final ConcurrentHashMap<TrackerKey, LatencyTracker> TRACKERS = | ||
| new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
The TRACKERS map is a static ConcurrentHashMap that is never cleared. Since it is keyed by operationUid (which is per-range/group and can change over time) and address, this will lead to a memory leak in long-running applications as old operationUid entries accumulate indefinitely. Consider using a Cache with a maximum size and/or time-based expiration (e.g., com.google.common.cache.Cache with expireAfterAccess) to bound memory usage.
References
- For safety, use bounded resources with a defined maximum size to prevent resource exhaustion.
| static final class TrackerKey { | ||
| private final long operationUid; | ||
| private final String address; |
There was a problem hiding this comment.
Using only operationUid and address as a key in a static registry poses a collision risk if multiple Spanner client instances (e.g., targeting different projects or instances) are used within the same JVM. While operationUid is intended to be unique, it is likely only scoped to a specific database or instance context. Consider including a more globally unique identifier, such as a databaseId or an instance-specific prefix, in the TrackerKey to ensure isolation between different clients.
| int updated = counter.decrementAndGet(); | ||
| if (updated <= 0) { | ||
| INFLIGHT_REQUESTS.remove(address, counter); | ||
| } |
There was a problem hiding this comment.
There is a race condition in finishRequest. INFLIGHT_REQUESTS.remove(address, counter) can remove the AtomicInteger object even if another thread has just incremented it via beginRequest (since beginRequest uses computeIfAbsent which returns the existing object). This leads to lost inflight counts. Since the number of Spanner endpoints is relatively small and stable, it is safer to keep the AtomicInteger in the map indefinitely rather than attempting to remove it when it reaches zero.
| int updated = counter.decrementAndGet(); | |
| if (updated <= 0) { | |
| INFLIGHT_REQUESTS.remove(address, counter); | |
| } | |
| counter.decrementAndGet(); |
|
|
||
| private static final Cache<String, RoutingTarget> TARGETS = | ||
| CacheBuilder.newBuilder() | ||
| .maximumSize(100_000_000L) |
There was a problem hiding this comment.
The maximumSize of 100,000,000 for TARGETS is excessively large and could lead to significant memory consumption if the cache fills up. Since entries are typically removed upon call closure in HeaderInterceptor and have a 10-minute expiration, a much smaller limit (e.g., 1,000,000) should be more than sufficient for even the most high-throughput applications while providing better safety against memory exhaustion.
| .maximumSize(100_000_000L) | |
| .maximumSize(1_000_000L) |
References
- For safety, use a defined maximum size for bounded resources to prevent memory exhaustion.
Summary
This PR improves Java Spanner's location-aware bypass routing when routed replicas are overloaded or unavailable, and extends score-based replica selection
The client now:
operation_uid is available
It also keeps the location-aware read path lock-free via immutable group snapshots.
What changed
returning to the same failed endpoint.