Fix CachedResource virtual-storage discovery on multi-replica/embedded-VW shards by mjudeikis · Pull Request #4190 · kcp-dev/kcp

mjudeikis · 2026-06-04T13:48:19Z

Summary

CachedResource-backed (virtual-storage) resources were intermittently undiscoverable and unservable on a kcp shard running with multiple replicas and/or embedded virtual workspaces. kubectl get <resource> returned:

Warning: Some resources are temporarily unavailable: [<resource>.<group>].
error: the server doesn't have a resource type "<resource>"

and the list/get failed. Whether it failed flapped depending on which shard replica served the request. There are two independent root causes.

1. DynamicRESTMapper controllers only ran on the leader replica

The builtin and dynamic REST mapper controllers populate a per-process, in-memory DynamicRESTMapper that every shard replica's own API server reads — the aggregating CRD-version discovery server and the virtual resources server use it to resolve GroupKinds (e.g. cache.kcp.io/CachedResourceEndpointSlice) to resources. They were registered under the leader-elected kcp-start-controllers hook, so on a shard with more than one replica only the leader populated its mapper. Non-leader replicas served requests against an empty mapper, so RESTMapping for system types failed (no matches for kind "CachedResourceEndpointSlice" in group "cache.kcp.io") and every virtual-storage resource was dropped from discovery there.

Both controllers only mutate the in-memory mapper and never write to storage, so they are safe to run on every replica. They are now started from a dedicated post-start hook that is not gated by leader election, guarded by a sync.Once so a leadership re-acquisition does not re-wire them.

2. Virtual workspace forward client used the loopback SNI/identity with embedded virtual workspaces

The aggregating discovery and virtual resources servers forward requests to the virtual workspace endpoint URLs advertised in CachedResourceEndpointSlice. Each such URL is the serving shard's own external, SNI-routed virtual-workspace URL (Shard.Spec.VirtualWorkspaceURL, e.g. https://<shard>.example:8443/services/replication/...) — reached through the ingress/front-proxy, not the in-cluster Service address and not loopback, even when the virtual workspace runs embedded in that shard.

vwClientConfig was only given a usable TLS config (shard client cert + VW CA) when virtual workspaces were not embedded. With embedded virtual workspaces (the default), it kept the loopback client config, whose ServerName is the loopback name and whose bearer token is only valid on the loopback listener. The forwarded request's TLS SNI was therefore the loopback name, the ingress could not SNI-route the connection, and it was reset:

failed to perform API discovery: ... read: connection reset by peer

The forward client now always authenticates with the shard client certificate, drops the loopback bearer token, and leaves ServerName empty so the SNI is derived from the target host. When --shard-virtual-workspace-ca-file is unset (typical with embedded virtual workspaces) it falls back to the root CA, which signs the shard serving certificate.

What Type of PR Is This?

/kind bug

Related Issue(s)

Fixes #

Release Notes

Fix CachedResource-backed (virtual-storage) resources being intermittently undiscoverable or unservable on kcp shards running with multiple replicas or embedded virtual workspaces.

mjudeikis · 2026-06-04T13:49:56Z

/test pull-kcp-build-image

mjudeikis · 2026-06-04T14:18:17Z

/build-image

mjudeikis · 2026-06-04T15:09:22Z

/build-image

mjudeikis · 2026-06-05T03:59:02Z

/build-image

mjudeikis · 2026-06-05T07:01:52Z

/build-image

mjudeikis · 2026-06-05T07:09:07Z

/build-image

mjudeikis · 2026-06-05T07:47:14Z

/build-image

ntnn · 2026-06-05T09:05:52Z

+	// The DynamicRESTMapper is per-process, in-memory state that THIS shard replica's own
+	// API server reads (the aggregating CRD-version discovery server and the virtual resources
+	// server) to resolve GroupKinds to resources. Both controllers only mutate that in-memory
+	// mapper and never write to storage, so — unlike the leader-elected controllers started by
+	// kcp-start-controllers — they MUST run on every replica. If they only ran on the leader, a
+	// non-leader replica would serve requests against an empty mapper, making built-in/system
+	// types such as cache.kcp.io/CachedResourceEndpointSlice unresolvable and breaking discovery
+	// and serving of virtual-storage (CachedResource-backed) resources on that replica. Start
+	// them from a post-start hook that is not gated by leader election.
+	return s.AddPostStartHook("kcp-start-dynamicrestmapper-controllers", func(hookContext genericapiserver.PostStartHookContext) error {
+		controllerCtx := klog.NewContext(ctx, klog.FromContext(ctx).WithValues("postStartHook", "kcp-start-dynamicrestmapper-controllers"))
+		for _, controller := range controllers {
+			go s.runController(controllerCtx, controller)
+		}
+		return nil
 	})
 }


Would it make sense to move the dynrest controller out to separate between authoritative and information controllers? If we have more controllers with the same error pattern it would make sense to separate the two.

Agreed it would be cleaner to formalize the "authoritative (leader-elected) vs informational (per-replica, in-memory)" split once there is more than one such controller. To keep this PR scoped to the bug I left it as a single dedicated post-start hook for now; happy to file a follow-up issue for the abstraction (or fold it in here if you'd prefer).

Now non-ai comment:
will do in follow-up :D

ntnn · 2026-06-05T09:06:30Z

+	return s.AddPostStartHook("kcp-start-dynamicrestmapper-controllers", func(hookContext genericapiserver.PostStartHookContext) error {
+		controllerCtx := klog.NewContext(ctx, klog.FromContext(ctx).WithValues("postStartHook", "kcp-start-dynamicrestmapper-controllers"))
+		for _, controller := range controllers {
+			go s.runController(controllerCtx, controller)


Suggested change

go s.runController(controllerCtx, controller)

go s.runController(hookContext, controller)

This was correct, no? Just the parent of controllerCtx was wrong and should be hookCtx.

Ah true, yeah

Done — controllerCtx now uses hookContext as its parent (and derives the logger from it), and we keep passing controllerCtx to runController, matching @gman0's point.

The DynamicRESTMapper is per-process, in-memory state that each shard replica's own API server reads — the aggregating CRD-version discovery server and the virtual resources server use it to resolve GroupKinds (e.g. cache.kcp.io/CachedResourceEndpointSlice) to resources. Its builtin and dynamic controllers were registered through the leader-elected kcp-start-controllers hook, so on a multi-replica shard only the leader populated its mapper. Non-leader replicas served API traffic against an empty mapper, so the REST mapping for system types failed there and virtual-storage (CachedResource-backed) resources became undiscoverable and unservable whenever a request landed on a non-leader replica — observed as intermittent "Some resources are temporarily unavailable" and "no matches for kind ... in group cache.kcp.io". Both controllers only mutate the in-memory mapper and never write to storage, so they are safe to run everywhere. Start them from a dedicated post-start hook that is not gated by leader election, guarded by a sync.Once so the wiring happens exactly once per process even though installControllers can run again on leadership re-acquisition.

The aggregating CRD-version discovery server and the virtual resources server forward requests to the virtual workspace endpoint URLs advertised in CachedResourceEndpointSlices. Each such URL is the serving shard's own external virtual-workspace URL (Shard.Spec.VirtualWorkspaceURL), reached through the cluster ingress that routes by TLS SNI directly to that shard — even when virtual workspaces run embedded in the shard. vwClientConfig was only given a usable TLS config (shard client cert + VW CA) when virtual workspaces were NOT embedded. With embedded virtual workspaces (the default) it kept the loopback client config, whose ServerName is the loopback name and whose bearer token is only valid on the loopback listener. The forwarded request's TLS SNI was therefore the loopback name, the ingress could not route it, and the connection was reset ("failed to perform API discovery: ... read: connection reset by peer"), so every virtual-storage (CachedResource- backed) resource was dropped from discovery and could not be served. When the shard is configured with a client certificate, authenticate the forward client with it, leave ServerName empty so the SNI is derived from the target host, and drop the loopback bearer token; fall back to the root CA (which signs the shard serving cert) when --shard-virtual-workspace-ca-file is unset. Without a shard client certificate (e.g. an in-process test server) the endpoint URL points back at this same server, so the loopback config is kept as-is.

ntnn

/lgtm
/approve

kcp-ci-bot · 2026-06-05T09:41:36Z

LGTM label has been added.

Details

Git tree hash: 5c73f0ab08327c24e6a1206bfaf5d9052a633632

kcp-ci-bot · 2026-06-05T09:41:38Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ntnn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [ntnn]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kcp-ci-bot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. dco-signoff: yes Indicates the PR's author has signed the DCO. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 4, 2026

mjudeikis commented Jun 5, 2026

View reviewed changes

Comment thread pkg/server/config.go Outdated

mjudeikis force-pushed the fix/dynamicrestmapper-builtin-established-race branch from e0763e5 to 3b3206f Compare June 5, 2026 07:41

mjudeikis changed the title ~~Fix dynamicrestmapper builtin controller dropping not-yet-established…~~ Fix CachedResource virtual-storage discovery on multi-replica/embedded-VW shards Jun 5, 2026

kcp-ci-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jun 5, 2026

mjudeikis force-pushed the fix/dynamicrestmapper-builtin-established-race branch from 3b3206f to 4370ab9 Compare June 5, 2026 08:18

ntnn reviewed Jun 5, 2026

View reviewed changes

mjudeikis added 2 commits June 5, 2026 12:33

mjudeikis force-pushed the fix/dynamicrestmapper-builtin-established-race branch from 4370ab9 to 41a22b9 Compare June 5, 2026 09:33

ntnn approved these changes Jun 5, 2026

View reviewed changes

kcp-ci-bot assigned ntnn Jun 5, 2026

kcp-ci-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 5, 2026

kcp-ci-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 5, 2026

kcp-ci-bot merged commit fb8629a into main Jun 5, 2026
14 checks passed

kcp-ci-bot deleted the fix/dynamicrestmapper-builtin-established-race branch June 5, 2026 10:19

	go s.runController(controllerCtx, controller)
	go s.runController(hookContext, controller)

Conversation

mjudeikis commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

1. DynamicRESTMapper controllers only ran on the leader replica

2. Virtual workspace forward client used the loopback SNI/identity with embedded virtual workspaces

What Type of PR Is This?

Related Issue(s)

Release Notes

Uh oh!

mjudeikis commented Jun 4, 2026

Uh oh!

mjudeikis commented Jun 4, 2026

Uh oh!

mjudeikis commented Jun 4, 2026

Uh oh!

mjudeikis commented Jun 5, 2026

Uh oh!

mjudeikis commented Jun 5, 2026

Uh oh!

Uh oh!

mjudeikis commented Jun 5, 2026

Uh oh!

mjudeikis commented Jun 5, 2026

Uh oh!

Uh oh!

ntnn Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

mjudeikis Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

mjudeikis Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ntnn Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

gman0 Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

ntnn Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

mjudeikis Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

ntnn left a comment

Choose a reason for hiding this comment

Uh oh!

kcp-ci-bot commented Jun 5, 2026

Uh oh!

kcp-ci-bot commented Jun 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mjudeikis commented Jun 4, 2026 •

edited

Loading