300 changes: 300 additions & 0 deletions docs/compute/development/rfcs/configmap-secret-mounts.md
---
status: proposed
---

# Mounting ConfigMaps and Secrets into Compute Instances (Unikraft Provider)

> Drafted 2026-05-30, revised 2026-05-31. **This is the foundational referenced-data delivery design for compute, and it ships before [Image Pull Credentials](./image-pull-credentials.md)** — it introduces the resolver, companion delivery, and the scheduling gate; pull secrets become a later consumer of the same path.

## Table of Contents

- [Summary](#summary)
- [What this enables for users](#what-this-enables-for-users)
- [End-to-end flow](#end-to-end-flow)
- [The gap: cross-plane delivery](#the-gap-cross-plane-delivery)
- [Design](#design)
  - [The referenced-data resolver](#the-referenced-data-resolver)
  - [Consumption on the provider](#consumption-on-the-provider)
  - [Scheduling gate](#scheduling-gate)
  - [Rotation and restart](#rotation-and-restart)
- [Platform direction](#platform-direction)
- [Security](#security)
- [Alternatives](#alternatives)
- [Failure modes](#failure-modes)
- [What gets built](#what-gets-built)
- [Decisions](#decisions)
- [Open questions](#open-questions)

---

## Summary

A compute `Workload` can already *describe* config and secret mounts: a volume
sourced from a ConfigMap or Secret, a container attachment with a mount path, and
environment variables that reference a key. The runtimes can already *consume* them —
the Unikraft runtime runs instances from Pod specs through its kubelet integration,
which honors ConfigMap/Secret references as both environment variables and volume
mounts, and the GCP provider mounts them as files too. So the API is real and the
runtimes support it.

The one thing missing is in the middle: **the referenced data never reaches the cell
where the instance runs.** It lives in the user's project; the instance runs on an
edge cell; federation propagates only the `WorkloadDeployment`. The instance comes up
referencing data that isn't there.

This RFC closes that gap. It keeps the user contract unchanged (create a
ConfigMap/Secret, reference it by name), resolves the reference in the trusted
management plane, and delivers the data to the edge as a derived companion object —
secret bytes **never enter the Workload or Instance spec**. Both environment
variables and file mounts work, because once the data is present the runtime's
existing Pod-spec consumption handles the rest.

## What this enables for users

Today users can only set literal environment variables, so configuration and
credentials get baked into images or pasted in as plaintext. After this:

- A user creates a `ConfigMap` and `Secret` in their project and references them
from the Workload; the platform delivers that data to every instance in every POP
cell, without the user ever knowing federation exists.
- **Both forms work** — keys surfaced as environment variables (the twelve-factor
case) and ConfigMaps/Secrets mounted as files at a path (config files,
certificates).
- **Secrets stay secret** — values never appear in the Workload or Instance the user
sees; they travel only as Secret objects.

## End-to-end flow

The decided design: management-plane resolution → companion object → federation →
cell → provider. The `WorkloadReconciler`, `ReferencedDataController`, and
`Federator` run in the management plane; the cell-side gate handling and the compute
provider run on the edge (POP) cell.

```mermaid
sequenceDiagram
actor User
participant P as Project plane
participant WC as WorkloadReconciler
participant RDC as ReferencedDataController
participant F as Federator
participant K as Karmada hub
participant C as Edge cell
participant PR as Compute provider
participant U as kraftlet / UKC

User->>P: 1. Create ConfigMap + Secret
User->>P: 2. Create Workload referencing them
Note over P: Admission check —<br/>author may read the referenced objects
WC->>P: 3. Create a WorkloadDeployment per placement
RDC->>P: 4. Read referenced ConfigMap/Secret (scoped, trusted)
RDC->>K: 5. Materialize a companion copy in the project's federation namespace
RDC->>P: 6. Record the expected companion set on the WorkloadDeployment
F->>K: 7. Replicate the WorkloadDeployment + routing policy
K->>C: 8. Propagate the deployment + companions to each matching cell
C->>C: 9. Create the Instance, held by a referenced-data gate
C->>C: 10. Companions present? clear the gate, mark data ready
PR->>C: 11. Translate the Instance into a Pod spec referencing the companions
Note over PR,U: Kubelet integration mounts the volumes and<br/>injects the env vars natively from the present data
PR->>U: 12. Launch the instance with config/secret applied
U-->>User: 13. Instance running with config/secret applied
```

## The gap: cross-plane delivery

The referenced ConfigMap/Secret lives in the user's project namespace; the instance
runs on an edge cell, possibly thousands of miles away. The federation channel
carries only the `WorkloadDeployment`, so the data has no path to the cell. There is
also no gate guarding this — the instance isn't even held back; it simply launches
with the reference unresolved.

Consumption is *not* a gap: the Unikraft runtime's kubelet integration already
resolves ConfigMap/Secret references in a Pod spec — env vars and volume mounts
alike — provided the referenced objects are present where it resolves them. So the
whole problem is getting the data to the cell, and faithfully carrying the
references through into the Pod spec the runtime consumes.

## Design

### The referenced-data resolver

A new management-plane controller, shown as `ReferencedDataController` in the flow
above, is the heart of delivery. For each WorkloadDeployment it:

1. **Collects** every ConfigMap/Secret the template references — environment
references and volume sources today; image pull secrets later.
2. **Reads** them with a scoped, trusted project-plane identity. The management plane
already has legitimate project access, so broad project-secret read never leaves
it.
3. **Materializes** one labeled companion per referenced object in the project's
federation namespace.
4. **Records** the expected companion set on the WorkloadDeployment, so the cell
knows exactly what to wait for rather than guessing.
5. **Routes** companions to cells by extending the existing federation routing policy
to carry the labeled companions alongside the deployment.

One companion exists per referenced object and is replicated to each placed cell — a
single object to create, update, and delete. When several deployments reference the
same object the companion is shared and reference-counted, removed only when the last
reference goes away. In single-cluster mode the same resolver runs and the companion
is simply a local copy.
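
The collect and reference-counting steps can be sketched roughly as follows. This is a minimal illustration, not the controller's actual code: the `Ref` type, the `CompanionIndex` class, and the dictionary-shaped template with `env`, `volumes`, and `source` fields are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A referenced object; kind is "ConfigMap" or "Secret" (hypothetical type)."""
    kind: str
    name: str


def collect_refs(template: dict) -> set[Ref]:
    """Gather every ConfigMap/Secret named by env references and volume sources.

    The template layout used here is an assumption for illustration only.
    """
    refs: set[Ref] = set()
    for env in template.get("env", []):
        source = env.get("source")  # literal env values carry no source
        if source:
            refs.add(Ref(source["kind"], source["name"]))
    for volume in template.get("volumes", []):
        source = volume["source"]
        refs.add(Ref(source["kind"], source["name"]))
    return refs


class CompanionIndex:
    """Tracks which deployments reference each companion (reference counting)."""

    def __init__(self) -> None:
        self._users: dict[Ref, set[str]] = defaultdict(set)

    def sync(self, deployment: str, refs: set[Ref]) -> tuple[set[Ref], set[Ref]]:
        """Record a deployment's current references.

        Returns (companions to create, companions to delete).
        """
        # A companion is created only when nobody references the object yet.
        to_create = {r for r in refs if not self._users.get(r)}
        to_delete: set[Ref] = set()
        for ref in list(self._users):
            if ref not in refs and deployment in self._users[ref]:
                self._users[ref].discard(deployment)
                if not self._users[ref]:  # last reference went away
                    del self._users[ref]
                    to_delete.add(ref)
        for ref in refs:
            self._users[ref].add(deployment)
        return to_create, to_delete
```

Calling `sync` with an empty set for a deleted deployment releases its references, so a shared companion survives until the last deployment that uses it is gone.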

### Consumption on the provider

Once the companions are present on the cell, the provider translates the Instance
into the Pod spec the runtime consumes — carrying the volume sources, volume mounts,
and environment references through faithfully and pointing them at the delivered
companions, with the referenced data present in the namespace the runtime resolves
from. The kubelet integration then mounts the volumes and injects the environment
variables natively; there is no provider-side inlining of secret values. (The
provider does not do this faithful translation today — it drops volumes and copies
only literal env values — so this is the provider-side work this RFC covers. The GCP
provider performs the equivalent translation already, which is why the same Workload
runs on either substrate.)
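
A rough sketch of what "faithful translation" means in practice is shown here. The input `instance` layout (`volumes`, `mounts`, `env`, `source`) is hypothetical and does not reflect the real Instance API; the output uses the standard Kubernetes Pod field names (`configMap`, `secret.secretName`, `valueFrom.secretKeyRef`, `valueFrom.configMapKeyRef`, `volumeMounts`). Other Pod fields are omitted.

```python
def to_pod_spec(instance: dict) -> dict:
    """Carry volumes, mounts, and env references into a Pod-spec-shaped dict.

    Only references are emitted; secret values are never inlined.
    """
    volumes = []
    for vol in instance.get("volumes", []):
        src = vol["source"]
        if src["kind"] == "Secret":
            volumes.append({"name": vol["name"], "secret": {"secretName": src["name"]}})
        else:
            volumes.append({"name": vol["name"], "configMap": {"name": src["name"]}})

    env = []
    for var in instance.get("env", []):
        src = var.get("source")
        if src is None:
            # Literal value: copied through unchanged.
            env.append({"name": var["name"], "value": var["value"]})
            continue
        selector = "secretKeyRef" if src["kind"] == "Secret" else "configMapKeyRef"
        env.append({
            "name": var["name"],
            # A reference to the delivered companion, resolved by the runtime.
            "valueFrom": {selector: {"name": src["name"], "key": src["key"]}},
        })

    mounts = [
        {"name": m["volume"], "mountPath": m["path"]}
        for m in instance.get("mounts", [])
    ]
    return {
        "volumes": volumes,
        "containers": [{
            "name": instance["name"],
            "image": instance["image"],
            "env": env,
            "volumeMounts": mounts,
        }],
    }
```

Because the output holds names and keys only, the provider never needs read access to the secret values themselves.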

### Scheduling gate

An instance that references any ConfigMap/Secret is held by a **referenced-data
scheduling gate**, alongside the existing network and quota gates. The cell clears it
once exactly the expected companion set is present, and surfaces a
`ReferencedDataReady` status with clear reasons — resolving, awaiting propagation,
source not found, source unauthorized, source too large, or ready — backed by events
and metrics so a held instance is diagnosable, not a silent hang. The compute
provider must respect scheduling gates so an instance is never launched with its data
missing; this RFC adds that behavior.
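
The gate decision can be sketched as a small pure function. This is an illustration under stated assumptions: the `Reason` string values, the function name, and the argument shapes are hypothetical, and "the expected set is present" is interpreted here as "every expected companion is present", since other companions may share the namespace.

```python
from enum import Enum
from typing import Optional


class Reason(Enum):
    """Reasons for the ReferencedDataReady status (string values are hypothetical)."""
    RESOLVING = "Resolving"
    AWAITING_PROPAGATION = "AwaitingPropagation"
    SOURCE_NOT_FOUND = "SourceNotFound"
    SOURCE_UNAUTHORIZED = "SourceUnauthorized"
    SOURCE_TOO_LARGE = "SourceTooLarge"
    READY = "Ready"


def evaluate_gate(
    expected: Optional[set[str]],
    present: set[str],
    source_errors: dict[str, Reason],
) -> tuple[bool, Reason]:
    """Decide whether the referenced-data gate can be cleared.

    expected: companion names recorded on the deployment, or None if the
              resolver has not recorded them yet.
    present:  companion names currently on the cell.
    source_errors: per-source failures reported by the resolver.
    """
    if expected is None:
        return False, Reason.RESOLVING
    if source_errors:
        # Report deterministically: the first failing source by name.
        first = min(source_errors)
        return False, source_errors[first]
    if not expected <= present:
        return False, Reason.AWAITING_PROPAGATION
    return True, Reason.READY
```

Keeping the decision a pure function of recorded state makes every held instance explainable from its status alone.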

### Rotation and restart

Decided: **no automatic roll; an explicit restart instead.** When a source changes,
the resolver re-reads it and refreshes the companion, so the latest values are staged
at the edge for the next instance launch. Running instances are not rolled
automatically — a fleet-wide restart on every edit is surprising, and a running
instance's environment isn't mutated in place regardless.

Compute already performs ordered, in-place rolling updates when a Workload's template
changes. The restart reuses that: a conventional restart annotation on the template
rolls the instances, which pick up the refreshed values — no new machinery. An
opt-in automatic roll on content change is a possible future addition, not part of
this RFC.
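
One way to picture the refresh-then-restart behavior is the sketch below. The companion dictionary, its `hash` field, and the annotation key `example.com/restarted-at` are hypothetical; the RFC does not name the annotation or say how a change in the source is detected.

```python
import hashlib

# Hypothetical annotation key, used for illustration only.
RESTART_ANNOTATION = "example.com/restarted-at"


def content_hash(data: dict[str, bytes]) -> str:
    """Stable digest over keys and values, independent of key order."""
    digest = hashlib.sha256()
    for key in sorted(data):
        encoded_key = key.encode("utf-8")
        # Length prefixes keep adjacent keys and values unambiguous.
        digest.update(len(encoded_key).to_bytes(8, "big"))
        digest.update(encoded_key)
        digest.update(len(data[key]).to_bytes(8, "big"))
        digest.update(data[key])
    return digest.hexdigest()


def refresh_companion(companion: dict, source_data: dict[str, bytes]) -> bool:
    """Stage the latest source values in the companion.

    Returns True if the companion changed. Running instances are not touched.
    """
    new_hash = content_hash(source_data)
    if companion.get("hash") == new_hash:
        return False
    companion["data"] = dict(source_data)
    companion["hash"] = new_hash
    return True


def request_restart(template: dict, timestamp: str) -> dict:
    """Return a copy of the template with the restart annotation stamped.

    The changed template is what triggers the existing rolling update.
    """
    annotations = dict(template.get("annotations", {}))
    annotations[RESTART_ANNOTATION] = timestamp
    return {**template, "annotations": annotations}
```

Refreshing and restarting are deliberately separate calls: the first stages data at the edge, and only the second changes the template and therefore rolls instances.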

## Platform direction

The delivery half of this design — follow references, read them in the trusted
plane, materialize derived companions, route them to the cells where the resource is
placed, and signal readiness — is **not specific to compute**. It's a recurring
platform need: image pull credentials want the same thing next, and the network
operator already propagates derived Secrets/ConfigMaps to cells by label today. The
building blocks are already platform-level — the shared namespace-mapping and
downstream-delivery library, the label-based propagation pattern, and the
established policy-driven capabilities (quota, activity, insights) that a delivery
policy would sit naturally beside.

We deliberately **do not** build that generic capability now. With a single consumer
in hand the abstraction's seams aren't yet known, and a cross-cutting platform
capability would slow the first ship and widen the security review. Instead, this RFC
builds toward it on purpose:

- **Build the resolver in compute now, behind a narrow, capability-shaped
interface** — in: a subject, its set of referenced objects, and its placement
targets; out: companions delivered plus a readiness signal. It reuses the existing
platform delivery library rather than inventing its own placement and cleanup.
- **Keep delivery cleanly separable from consumption.** The scheduling gate and the
translation into the runtime's Pod spec stay in compute and depend only on the
readiness signal, so the delivery component carries no compute-specific knowledge.
- **Promote on the second consumer.** When a second user of this pattern appears
(image pull credentials, or another service), lift the delivery component into the
platform as a capability — most likely an admin-authored delivery policy that
declares, per resource kind, which references to follow and where to deliver them,
fitting the existing capability-policy pattern. Only with two real consumers can
the abstraction be shaped correctly.
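
The capability-shaped interface described in the first bullet might look something like this sketch. All names here (`DeliveryRequest`, `DeliveryStatus`, `ReferencedDataDelivery`, `InMemoryDelivery`) are hypothetical; the point is only the shape: a subject, its references, and its placement targets go in, and delivered companions plus a readiness signal come out.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class DeliveryRequest:
    subject: str                 # e.g. a WorkloadDeployment identifier
    references: frozenset[str]   # referenced objects to follow
    targets: frozenset[str]      # cells where the subject is placed


@dataclass(frozen=True)
class DeliveryStatus:
    delivered: frozenset[str]    # companions materialized for the targets
    ready: bool                  # readiness signal consumed by the gate


class ReferencedDataDelivery(Protocol):
    """Narrow seam: nothing compute-specific crosses this boundary."""

    def deliver(self, request: DeliveryRequest) -> DeliveryStatus: ...


class InMemoryDelivery:
    """Toy implementation used to exercise the interface shape."""

    def __init__(self, sources: dict[str, bytes]) -> None:
        self.sources = sources
        self.cells: dict[str, dict[str, bytes]] = {}

    def deliver(self, request: DeliveryRequest) -> DeliveryStatus:
        available = {r for r in request.references if r in self.sources}
        for cell in request.targets:
            store = self.cells.setdefault(cell, {})
            for ref in available:
                store[ref] = self.sources[ref]
        ready = available == set(request.references)
        return DeliveryStatus(frozenset(available), ready)
```

A consumer such as the scheduling gate would depend only on `DeliveryStatus.ready`, which is what would let the delivery half move into the platform later without changing compute.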

This keeps compute shippable and autonomous today while making the design a
deliberate step toward a shared capability, not a one-off to untangle later. A
governance benefit falls out: when the policy lands, *what may be propagated, and
where* becomes an inspectable, access-controlled object rather than logic buried in a
controller.

## Security

- **Bytes never in user-visible specs.** The Workload and the Instance the user sees
carry references only. Values exist as Secret objects in the project's federation
namespace and on the cell where the runtime mounts them — never in anything
projected back to the user.
- **Companion Secrets stay Secrets** end to end; ConfigMap companions carry only
non-secret config. The runtime mounts the companions directly, so the provider
never has to inline secret values itself.
- **Authorization.** Admission verifies the submitting user can read each referenced
object — the same check already used for referenced Networks. A user cannot pull in
an object they couldn't read themselves; the resolver's system identity is never
the authority.
- **Trust boundary at the edge.** Resolving in the management plane is deliberate, so
the shared, lower-trust edge never holds a credential that can read project
ConfigMaps/Secrets. Companions are isolated per project namespace on each cell.
- **At rest.** Companions live in storage on the project plane, the hub, and each
cell; this presumes encryption at rest on every plane.
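
The admission rule in the Authorization bullet can be sketched as below. The function name and the `can_read` callback are hypothetical stand-ins for whatever access check the platform performs; the RFC does not specify the mechanism beyond it being the same check used for referenced Networks.

```python
from typing import Callable


def check_references(
    user: str,
    refs: list[tuple[str, str]],
    can_read: Callable[[str, str, str], bool],
) -> list[str]:
    """Return one denial message per referenced object the user cannot read.

    An empty list means the Workload may be admitted. The check is made with
    the submitting user's identity, never the resolver's system identity.
    """
    return [
        f"{user} cannot read {kind}/{name}"
        for kind, name in refs
        if not can_read(user, kind, name)
    ]
```

Collecting every denial, rather than stopping at the first, lets the rejection name all offending references at once.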

## Alternatives

- **Let the provider read the originals from the edge (no companions).** The leanest
option — it removes the resolver, companions, routing changes, and the data gate,
and it is how the GCP provider already works. **Rejected for secret bytes:** it
requires the shared edge to hold a credential that reads project ConfigMaps and
Secrets, exactly the trust boundary this design keeps in the management plane. (A
config-only hybrid was considered and rejected to avoid maintaining two delivery
paths.)
- **Inline resolved values into the Instance.** Rejected — leaks secret bytes into
storage everywhere and into what the user sees.
- **Propagate the user's original objects directly.** Rejected — couples cell
contents to arbitrary project objects and loses the scoping boundary.
- **A separate controller for pull secrets.** Rejected — same machinery; pull secrets
become a thin consumer of this resolver instead.

## Failure modes

- **Source missing, unauthorized, or too large** → gate held, status names the
offending object; optional sources are skipped.
- **Companion not yet on the cell** → gate held (awaiting propagation); a normal
transient state during placement.
- **Source changed, instances not rolled** → stale by design until restarted;
last-synced state is surfaced so it's observable.
- **Single-cluster mode** → the local-copy path must be exercised so the absence of
federation never silently disables delivery.

## What gets built

- A **referenced-data resolver** in the management plane: collect, read, materialize,
reference-count, and clean up companions.
- A **scoped project-plane read identity** for the resolver (built here; reused later
by image pull credentials).
- **Federation routing** extended to carry companions to the same cells as the
deployment.
- A **referenced-data scheduling gate**, cell-side clearing, and the
`ReferencedDataReady` status with reasons, events, and metrics.
- **API additions**: a bulk "import all keys" env form, and completing volume
validation (secret volumes, key→path selection, file mode).
- **Provider changes**: respect scheduling gates, and faithfully translate the
Instance's volumes, mounts, and env references into the Pod spec the runtime
consumes.
- A **restart** path (a conventional template annotation) so a rotated source can be
picked up on demand.

## Decisions

- **Delivery:** management-plane companions (not edge-read).
- **Rotation:** no auto-roll; explicit restart.
- **Gate contract:** an explicit expected-companion set recorded on the deployment,
not guessed.
- **One resolver, not two:** pull secrets are a later consumer.
- **Platform direction:** build delivery behind a capability-shaped seam in compute
now; promote it to a platform-owned, policy-driven capability when a second
consumer appears — not before.
- **Sequencing:** ships before image pull credentials; owns the scoped read identity
and provider gate-honoring.

## Open questions

1. **Scoped-read granularity:** can the resolver's project read be scoped to specific
object types or labels, or is it broad config/secret read?
2. **Companion size limits** and behavior when exceeded.
3. **Bulk env import in v1**, or per-key references only for the first release?
4. **VM runtime** consumption — out of scope for Unikraft (sandbox-only); confirm
deferral.