Summary
spec.tls.gateway.mode: Disabled causes the DocumentDB gateway sidecar to serve the Mongo wire protocol in plaintext, contradicting the public docs which promise that TLS is always on. This is a silent footgun — clients who trust status.connectionString (which always contains tls=true) are safe, but any attacker with on-cluster pod-network access can bypass TLS entirely by connecting with tls=false.
Recommend removing Disabled from the GatewayTLS.Mode enum so unencrypted traffic is impossible by construction.
Evidence of the contradiction
Docs promise encryption on all modes — docs/operator-public-documentation/preview/configuration/tls.md:43:
Disabled mode means the operator does not manage TLS certificates. However, the gateway still encrypts all connections using an internally generated self-signed certificate. Clients must connect with tls=true&tlsAllowInvalidCertificates=true.
E2E test proves plaintext is actually served — test/e2e/tests/tls/tls_disabled_test.go:17-19, 41-54:
// The gateway still listens but accepts plain-text mongo wire
// protocol. This spec verifies the happy-path: a freshly-created
// DocumentDB with TLS disabled accepts an unencrypted connection
// from the mongo driver.
...
client, err := mongohelper.NewClient(connectCtx, mongohelper.ClientOptions{
	Host:     host,
	Port:     port,
	User:     tlsCredentialUser,
	Password: tlsCredentialPassword,
	TLS:      false,
})
...
Eventually(func() error {
	return mongohelper.Ping(connectCtx, client)
}, ...).Should(Succeed(), "plaintext ping should succeed when TLS is disabled")
Ping succeeds, so the gateway is genuinely listening without TLS on port 10260.
Root cause
operator/cnpg-plugins/sidecar-injector/internal/lifecycle/lifecycle.go:188-232:
- When the gatewayTLSSecret parameter is absent, the sidecar is started without the --cert-path/--key-file CLI args and without the CERT_PATH/KEY_FILE/TLS_CERT_DIR env vars. This is the state Mode=Disabled produces: operator/src/internal/controller/certificate_controller.go:64-74 short-circuits the cert reconciler and leaves status.TLS.Ready=false.
- The upstream gateway binary then falls back to plaintext, not to an internally generated self-signed certificate. The docs' promise is wrong.
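The branching described above can be pictured with a minimal Go sketch. It is not the injector's actual code: gatewayArgs is a hypothetical helper and the /tls/... mount paths are assumptions made for illustration. Only the flag names (--pg-port, --cert-path, --key-file) come from this report.

```go
package main

import "fmt"

// gatewayArgs is a hypothetical helper, not the injector's real code.
// It mirrors the behavior described in this issue: the TLS flags are
// appended only when a gateway TLS secret name is present.
func gatewayArgs(tlsSecret string) []string {
	args := []string{"--pg-port", "5432"}
	if tlsSecret == "" {
		// Mode=Disabled path: no certificate flags are passed, so the
		// gateway is left to its plaintext fallback.
		return args
	}
	// Assumed mount paths; the real injector may use different ones.
	return append(args,
		"--cert-path", "/tls/tls.crt",
		"--key-file", "/tls/tls.key",
	)
}

func main() {
	fmt.Println(gatewayArgs(""))            // no TLS flags
	fmt.Println(gatewayArgs("gateway-tls")) // TLS flags included
}
```

The point of the sketch is that the absence of a secret silently selects the insecure path; nothing in the argument list signals that encryption was dropped.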
Why this matters
- Defense in depth is violated. Mongo authentication exchanges and all query traffic travel on the pod network in the clear, so a compromised workload in the same cluster or VNet that can observe that traffic can capture them.
- Docs encourage the belief that TLS is always on, so users selecting Mode=Disabled for "dev" likely think they're getting a self-signed cert they can skip-verify. They're actually getting plaintext.
- status.connectionString lies. It contains tls=true unconditionally (operator/src/internal/utils/util.go:423) regardless of Mode=Disabled. Clients pasting the published string work; clients who read the CR spec do not. The two contracts diverge silently.
- SCRAM-SHA-256 authentication over plaintext still leaks enough for offline brute force given a modern wordlist.
Proposed fix
Option A (recommended): remove unencrypted traffic as a possibility.
- Drop Disabled from the GatewayTLS.Mode enum in operator/src/api/preview/documentdb_types.go:237 and change validation to Enum=SelfSigned;CertManager;Provided.
- Default an unset Mode to SelfSigned so existing CRs that omit the field keep working (and keep encryption on).
- Update the cert controller's "empty or Disabled" branch in operator/src/internal/controller/certificate_controller.go:64 to treat empty as SelfSigned.
- Remove test/e2e/tests/tls/tls_disabled_test.go and test/e2e/manifests/mixins/tls_disabled.yaml.template.
- Remove the "Disabled" tab from docs/operator-public-documentation/preview/configuration/tls.md.
- Note in CHANGELOG as a breaking change for pre-GA; migration path is "remove mode: Disabled (or the whole tls: block) to get SelfSigned behavior."
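As a rough sketch of what the first three steps of Option A could look like in Go: the type, constant, and function names below are stand-ins rather than the operator's real definitions, and the +kubebuilder:default marker is an assumption about how the default would be expressed. Only the Enum value list is taken from this proposal.

```go
package main

import "fmt"

// GatewayTLSMode is a stand-in for the real API type.
type GatewayTLSMode string

const (
	ModeSelfSigned  GatewayTLSMode = "SelfSigned"
	ModeCertManager GatewayTLSMode = "CertManager"
	ModeProvided    GatewayTLSMode = "Provided"
)

// GatewayTLS sketches the field after Option A: Disabled is no longer
// an accepted value, and an omitted mode defaults to SelfSigned.
type GatewayTLS struct {
	// +kubebuilder:validation:Enum=SelfSigned;CertManager;Provided
	// +kubebuilder:default=SelfSigned
	Mode GatewayTLSMode `json:"mode,omitempty"`
}

// effectiveMode is a hypothetical controller-side guard. It treats an
// empty mode as SelfSigned and rejects anything outside the enum, so
// there is no code path that yields "no TLS".
func effectiveMode(m GatewayTLSMode) (GatewayTLSMode, error) {
	switch m {
	case "":
		return ModeSelfSigned, nil
	case ModeSelfSigned, ModeCertManager, ModeProvided:
		return m, nil
	default:
		return "", fmt.Errorf("unsupported gateway TLS mode %q", m)
	}
}

func main() {
	for _, m := range []GatewayTLSMode{"", "Provided", "Disabled"} {
		mode, err := effectiveMode(m)
		fmt.Printf("input=%q mode=%q err=%v\n", m, mode, err)
	}
}
```

Keeping the guard in the controller as well as in CRD validation means objects stored before the enum change are still handled explicitly instead of falling through to the plaintext path.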
Option B (fallback if Disabled must remain for some out-of-tree user): make it mean what the docs claim.
- Have the sidecar injector generate a self-signed cert in-cluster when Mode=Disabled (or omitted) and mount it. This is effectively what Mode=SelfSigned already does, so at that point Disabled and SelfSigned are synonyms and Option A is cleaner.
Option A is strictly better because it removes the attack surface instead of relying on correct configuration.
Migration impact
- API change pre-GA (preview apiVersion), so the usual "no breaking changes" bar doesn't apply.
- Users with mode: Disabled get a behavior change (plaintext → self-signed TLS). Since the documented contract was already "TLS is always on", this aligns behavior with the documented contract.
Out of scope
- The same audit should confirm whether the Postgres wire protocol inside the pod is TLS'd between the gateway sidecar and the Postgres container. The gateway is launched with --pg-port 5432 against 127.0.0.1, so this is same-pod IPC over loopback: lower risk, but worth a follow-up issue.
Companion doc bug
Independent of this fix, docs/operator-public-documentation/preview/configuration/tls.md:43 must be corrected to match reality. If Option A is taken, the section is removed entirely.