Add Gateway API (HTTPRoute) support to Console Helm chart and CRD #1329

Draft
david-yu wants to merge 1 commit into main from 1308-add-gateway-support-to-console-helm-chart
Contributor

@david-yu david-yu commented Mar 20, 2026

Summary

Adds support for Kubernetes Gateway API HTTPRoute resources to the Console chart and operator, allowing users to expose Console via Gateway API controllers (e.g. Envoy Gateway, Istio, Cilium) as an alternative to classic Ingress.

Closes #1308
Supersedes #1309 (recreated from upstream branch for CI)

Original chart work by @w1ndhunter.

Prerequisites

Gateway API CRDs must be installed in your cluster before enabling this feature. The CRDs are not bundled with the Redpanda Helm chart or operator — they are maintained by the Kubernetes Gateway API project.

Install Gateway API CRDs

kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.5.1/standard-install.yaml

Install a Gateway controller

You also need a Gateway API-compatible controller running in your cluster. Common options include Envoy Gateway, Istio, and Cilium.

Create a Gateway resource that the HTTPRoute will attach to:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-gateway
  namespace: gateway-system
spec:
  gatewayClassName: eg  # varies by controller
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - name: my-tls-cert

Changes

Console Helm Chart

  • New gateway values block alongside existing ingress block
  • New gateway.go with HTTPRoute() rendering function
  • HTTPRoute added to Render() manifest list and Types() scheme registration
  • gatewayv1.Install(Scheme) for Gateway API type serialization
  • Updated notes.go to show Gateway URLs (mutually exclusive with Ingress)
  • Validation in NewRenderState rejects both gateway and ingress enabled simultaneously
  • New test cases: gateway-only, mutual exclusion validation, gateway removal/switch scenarios
  • Regenerated schema, golden files, and templates
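
The mutual-exclusion check in the list above can be pictured with a minimal sketch. The `Values` and `Toggle` types and the `validateExposure` function are hypothetical stand-ins, not the chart's actual `NewRenderState` code; only the error text is taken from this PR's usage note.

```go
package main

import (
	"errors"
	"fmt"
)

// Toggle is a hypothetical stand-in for the chart's ingress/gateway values blocks.
type Toggle struct {
	Enabled bool
}

// Values is a hypothetical, trimmed-down view of the Console chart values.
type Values struct {
	Ingress Toggle
	Gateway Toggle
}

// errBothEnabled carries the message quoted in this PR's usage note.
var errBothEnabled = errors.New("ingress and gateway cannot both be enabled; use one or the other")

// validateExposure rejects configurations that enable both exposure mechanisms.
func validateExposure(v Values) error {
	if v.Ingress.Enabled && v.Gateway.Enabled {
		return errBothEnabled
	}
	return nil
}

func main() {
	both := Values{Ingress: Toggle{Enabled: true}, Gateway: Toggle{Enabled: true}}
	fmt.Println(validateExposure(both))

	gatewayOnly := Values{Gateway: Toggle{Enabled: true}}
	fmt.Println(validateExposure(gatewayOnly))
}
```

Failing at render time, rather than silently preferring one block, keeps a misconfigured release from producing both an Ingress and an HTTPRoute for the same host.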

Operator (Console CRD)

  • New GatewayConfig and GatewayParentReference types in console_types.go
  • Gateway field added to ConsoleValues and RedpandaConsole structs
  • Auto-generated conversion (goverter) from CRD types → chart partial values
  • RBAC: added gateway.networking.k8s.io/httproutes permissions
  • Registered gatewayv1 types in V2 scheme
  • Bumped sigs.k8s.io/gateway-api from v1.4.1 → v1.5.1
  • Regenerated deepcopy, conversion, and CRD manifests
  • HTTPRoute watch is skipped gracefully when Gateway API CRDs are not installed
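
A rough sketch of the "skip the watch when the CRD is absent" idea from the last bullet follows. The `kindChecker` type, the `consoleWatches` function, and the base watch list are all illustrative assumptions; the operator's real discovery lookup and watch set are not shown in this description. Only the group, version, and kind of HTTPRoute are taken from the Gateway API itself.

```go
package main

import "fmt"

// kindChecker is a hypothetical abstraction over API discovery. The operator's
// actual lookup is not shown in this description.
type kindChecker func(group, version, kind string) bool

// consoleWatches returns the kinds a controller would watch. The base list is
// an illustrative assumption; HTTPRoute is appended only if the cluster serves it.
func consoleWatches(hasKind kindChecker) []string {
	watches := []string{"Deployment", "Service", "Ingress"}
	if hasKind("gateway.networking.k8s.io", "v1", "HTTPRoute") {
		watches = append(watches, "HTTPRoute")
	}
	return watches
}

func main() {
	noGatewayAPI := func(group, version, kind string) bool { return false }
	withGatewayAPI := func(group, version, kind string) bool {
		return group == "gateway.networking.k8s.io" && kind == "HTTPRoute"
	}
	fmt.Println(consoleWatches(noGatewayAPI))
	fmt.Println(consoleWatches(withGatewayAPI))
}
```

The point of the pattern is that a cluster without Gateway API CRDs still gets a working controller, just without the HTTPRoute watch.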

Console Controller Tests — Gateway API CRD Loading ⚠️

Reviewer note: The console controller test (operator/internal/controller/console/controller_test.go) now loads Gateway API CRDs from the sigs.k8s.io/gateway-api Go module cache at test time. This is needed because envtest must have the HTTPRoute CRD installed for the controller to reconcile gateway-enabled Console objects (list HTTPRoutes, create/delete them).

The loadGatewayAPICRDs helper resolves the module directory via go list -m -f {{.Dir}} sigs.k8s.io/gateway-api, reads all standard CRD YAMLs from config/crd/standard/, and parses them. Non-CRD files (e.g. ValidatingAdmissionPolicy) are skipped gracefully.

This approach avoids vendoring the ~7K-line HTTPRoute CRD YAML into testdata while keeping the CRD version in sync with go.mod.
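
The helper's flow can be sketched with the standard library alone. This is not the PR's `loadGatewayAPICRDs` implementation: the function names below are hypothetical, and the line-based `kind:` check is an assumption standing in for real YAML parsing (it only inspects the first top-level `kind:` key of a file).

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// gatewayAPIModuleDir asks the Go toolchain where the module is cached.
func gatewayAPIModuleDir() (string, error) {
	out, err := exec.Command("go", "list", "-m", "-f", "{{.Dir}}", "sigs.k8s.io/gateway-api").Output()
	if err != nil {
		return "", fmt.Errorf("resolving module dir: %w", err)
	}
	return strings.TrimSpace(string(out)), nil
}

// isCRDManifest reports whether a YAML document declares a CustomResourceDefinition.
// Assumption: the first unindented `kind:` line identifies the document's kind.
func isCRDManifest(doc []byte) bool {
	for _, line := range bytes.Split(doc, []byte("\n")) {
		if bytes.HasPrefix(line, []byte("kind:")) {
			kind := strings.TrimSpace(strings.TrimPrefix(string(line), "kind:"))
			return kind == "CustomResourceDefinition"
		}
	}
	return false
}

// standardCRDFiles lists the CRD manifests under config/crd/standard, skipping other kinds.
func standardCRDFiles(moduleDir string) ([]string, error) {
	paths, err := filepath.Glob(filepath.Join(moduleDir, "config", "crd", "standard", "*.yaml"))
	if err != nil {
		return nil, err
	}
	var crds []string
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			return nil, err
		}
		if isCRDManifest(data) {
			crds = append(crds, p)
		}
	}
	return crds, nil
}

func main() {
	dir, err := gatewayAPIModuleDir()
	if err != nil {
		fmt.Println("module not resolvable here:", err)
		return
	}
	files, err := standardCRDFiles(dir)
	if err != nil {
		fmt.Println("reading CRDs:", err)
		return
	}
	fmt.Printf("found %d CRD manifests\n", len(files))
}
```

Because the directory comes from `go list`, the manifests always match the version pinned in go.mod, which is the property the paragraph above relies on.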

Bug Fix: InUseServerCerts external TLS cert mounting

  • Fixed InUseServerCerts in charts/redpanda/values.go (and operator/multicluster/values.go) where a continue statement on the internal listener's TLS check would skip registering external sub-listener certs. External certs are now checked independently of internal TLS state.
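
The shape of this fix can be illustrated with a simplified sketch. The `TLS` and `Listener` types and the `inUseServerCerts` function below are hypothetical reductions, not the chart's real types; they only show why an early `continue` on the internal TLS check would hide external certificates.

```go
package main

import (
	"fmt"
	"sort"
)

// TLS and Listener are hypothetical, simplified shapes; the chart's real types differ.
type TLS struct {
	Enabled bool
	Cert    string
}

type Listener struct {
	TLS      TLS
	External map[string]TLS
}

// inUseServerCerts collects every certificate name referenced by a TLS-enabled listener.
func inUseServerCerts(listeners []Listener) []string {
	seen := map[string]bool{}
	for _, l := range listeners {
		// Register the internal cert only when internal TLS is on. The buggy shape
		// used `continue` here when it was off, which also skipped the loop below.
		if l.TLS.Enabled {
			seen[l.TLS.Cert] = true
		}
		// External sub-listeners are checked regardless of the internal TLS state.
		for _, ext := range l.External {
			if ext.Enabled {
				seen[ext.Cert] = true
			}
		}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	listeners := []Listener{{
		TLS:      TLS{Enabled: false, Cert: "default"},
		External: map[string]TLS{"public": {Enabled: true, Cert: "external"}},
	}}
	fmt.Println(inUseServerCerts(listeners))
}
```

With internal TLS disabled and an external sub-listener enabled, the sketch still reports the external certificate, which is the case the original `continue` missed.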

Usage Examples

Helm Chart — Gateway API

# 1. Install Gateway API CRDs (if not already installed)
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.5.1/standard-install.yaml

# 2. Install Console with gateway enabled
helm install console redpanda/console -f values.yaml

# values.yaml
gateway:
  enabled: true
  annotations:
    example.com/owner: my-team
  parentRefs:
    - name: my-gateway
      namespace: gateway-system
      sectionName: https
  hostnames:
    - console.example.com
  path: /
  pathType: PathPrefix

Helm Chart — Classic Ingress (unchanged)

# values.yaml
ingress:
  enabled: true
  hosts:
    - host: console.example.com
      paths:
        - path: /
          pathType: Prefix

Note: Enabling both ingress.enabled: true and gateway.enabled: true will fail with: ingress and gateway cannot both be enabled; use one or the other

Console CRD (Operator) — Gateway API

# 1. Install Gateway API CRDs (if not already installed)
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.5.1/standard-install.yaml

# 2. Apply the Console CR with gateway enabled
kubectl apply -f console.yaml

# console.yaml
apiVersion: cluster.redpanda.com/v1alpha2
kind: Console
metadata:
  name: my-console
  namespace: redpanda
spec:
  cluster:
    clusterRef:
      name: my-cluster
  gateway:
    enabled: true
    annotations:
      example.com/owner: my-team
    parentRefs:
      - name: my-gateway
        namespace: gateway-system
        sectionName: https
    hostnames:
      - console.example.com
    path: /
    pathType: PathPrefix

Console CRD (Operator) — Switching from Gateway to Ingress

To switch, remove the gateway stanza and add ingress:

apiVersion: cluster.redpanda.com/v1alpha2
kind: Console
metadata:
  name: my-console
  namespace: redpanda
spec:
  cluster:
    clusterRef:
      name: my-cluster
  ingress:
    enabled: true
    hosts:
      - host: console.example.com
        paths:
          - path: /
            pathType: Prefix

The operator will remove the HTTPRoute and create an Ingress instead.

Test plan

  • go build ./... passes in operator/ and charts/console/
  • go test ./... passes in charts/console/ — includes:
    • TestIngressGatewayMutualExclusion — both enabled → error
    • TestGatewayRemoval/gateway_removed_from_config — no gateway stanza → no HTTPRoute
    • TestGatewayRemoval/gateway_explicitly_disabled (enabled: false) → no HTTPRoute
    • TestGatewayRemoval/switch_from_gateway_to_ingress — gateway→ingress produces Ingress, no HTTPRoute
    • TestTemplate golden tests (gateway-templating case)
  • Console controller tests pass with Gateway API CRDs loaded into envtest
    • TestController/gateway-enabled — basic gateway with parentRefs
    • TestController/gateway-custom-path — custom path and multiple parentRefs
  • CRD schema includes gateway field with correct OpenAPI validation
  • Deploy Console CRD with gateway.enabled: true → HTTPRoute created
  • Deploy via Helm with gateway.enabled: true → HTTPRoute created
  • Enabling both ingress and gateway → validation error

🤖 Generated with Claude Code


github-actions Bot commented Apr 5, 2026

This PR is stale because it has been open 5 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions Bot added the stale label Apr 5, 2026
@david-yu david-yu removed the stale label Apr 6, 2026
@github-actions github-actions Bot added the stale label Apr 12, 2026
@david-yu david-yu removed the stale label Apr 14, 2026
@david-yu david-yu marked this pull request as draft April 14, 2026 17:17
@david-yu
Contributor Author

Moving to draft for now

@github-actions github-actions Bot added the stale label Apr 20, 2026
@david-yu david-yu removed the stale label Apr 23, 2026
@github-actions github-actions Bot added the stale label May 5, 2026
@david-yu david-yu removed the stale label May 5, 2026
@github-actions github-actions Bot added the stale label May 11, 2026
@david-yu david-yu removed the stale label May 11, 2026
@github-actions github-actions Bot added the stale label May 17, 2026
@david-yu david-yu removed the stale label May 18, 2026
@david-yu
Contributor Author

End-to-end test on EKS 1.34 in tandem with PR #1447

TL;DR: Provisioned a fresh EKS 1.34 cluster, deployed both PRs on a single integration branch (test/1329-1447-integration), ran a k6 mixed-load test against this PR's Console HTTPRoute alongside a 10 Mbps OMB Kafka workload through PR #1447's TLSRoute on the same Envoy Gateway. Console HTTPRoute renders correctly, attaches, terminates TLS, and serves 26,745 requests over 12.5 min with zero 5xx errors and p95 < 8 ms end-to-end.

Stack

Piece | Value
EKS | 1.34
Region | us-west-2
Operator image | 605419575229.dkr.ecr.us-west-2.amazonaws.com/redpanda-operator-pr1329-1447:f6083b79-v4 (PR #1329 + PR #1447 merged on main)
Gateway | Envoy Gateway v1.2.6, TLS:9094 (Passthrough, Kafka) + HTTPS:443 (Terminate, Console) on the same redpanda-gateway Gateway
Console | bundled via the Redpanda chart's console subchart (v3.7.0), gateway block enabled
HTTPS cert | cert-manager self-signed, Secret console-gateway-tls referenced via certificateRefs on the Gateway listener

Console route — what got rendered

Enabled the chart-side Console gateway integration with these values:

console:
  enabled: true
  gateway:
    enabled: true
    parentRefs:
      - name: redpanda-gateway
        namespace: redpanda
        sectionName: console-https
    hostnames:
      - console.example.com
    path: /
    pathType: PathPrefix

The chart rendered exactly one HTTPRoute (rp-console) attaching to the console-https listener:

$ kubectl -n redpanda get httproute
NAME         HOSTNAMES                 AGE
rp-console   ["console.example.com"]   24s

$ kubectl -n redpanda get httproute rp-console -o jsonpath='{.status.parents[*].conditions[?(@.type=="Accepted")].status}'
True

$ kubectl -n redpanda get gateway redpanda-gateway -o json | jq -r '.status.listeners[] | "\(.name): attachedRoutes=\(.attachedRoutes)"'
  kafka: attachedRoutes=4         # PR #1447 TLSRoutes
  console-https: attachedRoutes=1 # this PR's HTTPRoute

k6 workload (mixed UI + API)

Bucket | Weight | Routes
UI / overview | ~50% | GET /, GET /api/cluster/overview
Topic browse | ~30% | GET /api/topics, GET /api/topics/{name}/messages?count=100
Schema / cgroup | ~20% | GET /api/schemas, GET /api/consumer-groups

  • 20 VUs, 2 min ramp + 10 min steady (matches OMB's window for clean overlap).
  • 12m30s total, 26,745 iterations, 35.65 RPS sustained.

k6 summary

Metric | Value
Total requests | 26,745
Avg RPS | 35.65 req/s
http_req_duration p50 | 1.93 ms
http_req_duration p90 | 5.70 ms
http_req_duration p95 | 7.99 ms
http_req_duration avg | 4.46 ms
5xx count | 0
console_errors rate (5xx-only) | 0.00%
http_req_failed rate (any non-2xx incl. 4xx) | 50.27% [*]
Data received | 37 MB
checks pass rate | 100.00% (26,745 / 26,745)

[*] The 50% non-2xx rate is the GET /api/topics/mqtt-0/messages?count=100 request returning 404 because the topic OMB created has a different name (UUID suffix on the prefix). This still exercises the HTTPRoute routing path — the custom console_errors Rate metric (5xx-only) is the real "did the Gateway / Console fail" signal and it stayed at 0% throughout.
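
To make the distinction between the two rates concrete, here is a small sketch that computes both from a list of status codes. It follows the table's description of `http_req_failed` as "any non-2xx" and of `console_errors` as 5xx-only; the `errorRates` function and the sample statuses are illustrative assumptions, not part of the k6 script used in this run.

```go
package main

import "fmt"

// errorRates returns the share of non-2xx responses and the share of 5xx responses.
func errorRates(statuses []int) (non2xx, serverErr float64) {
	if len(statuses) == 0 {
		return 0, 0
	}
	var n2, s5 int
	for _, s := range statuses {
		if s < 200 || s > 299 {
			n2++ // counts 4xx such as the 404s on the missing topic
		}
		if s >= 500 && s <= 599 {
			s5++ // counts only failures attributable to the Gateway or Console
		}
	}
	total := float64(len(statuses))
	return float64(n2) / total, float64(s5) / total
}

func main() {
	// Illustrative sample: half the requests hit a topic name that does not exist.
	non2xx, serverErr := errorRates([]int{200, 404, 200, 404})
	fmt.Printf("non-2xx=%.2f%% 5xx=%.2f%%\n", non2xx*100, serverErr*100)
}
```

A high non-2xx rate with a zero 5xx rate, as in this sample, points at the requests themselves rather than at the routing path.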

Per-bucket latency (k6 custom Trends)

Bucket | avg | p50 (med) | p90 | p95 | max
console_ui_duration (UI + cluster overview) | 1.93 ms | 1.89 ms | 2.05 ms | 2.11 ms | 12.12 ms
console_topics_duration (topic browse) | 3.01 ms | 2.24 ms | 4.38 ms | 4.59 ms | 42.81 ms
console_other_duration (schemas + consumer groups) | 13.03 ms | 4.16 ms | 9.52 ms | 10.33 ms | 4.05 s

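The console_other_duration max of 4.05 s is an outlier (likely a transient slowdown in the Console upstream), which also explains why the bucket's average (13.03 ms) sits above its p95; the p95 itself is still under 11 ms. The UI and topic-browse buckets are both sub-5 ms at p95, so Gateway / TLS-termination overhead is essentially negligible.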

Tandem isolation check

Both routes attached and served traffic concurrently throughout the run. Console k6 sustained 35.65 RPS with 0 5xx; concurrent 10 Mbps Kafka through the sibling TLSRoute did not degrade Console latency (p95 stayed at 7.99 ms for the full window). See the PR #1447 tandem comment for Kafka-side metrics from the same run.

What this validates for PR #1329

  • ✅ The chart's console.gateway block renders a single HTTPRoute with correct parentRefs, hostnames, path, and pathType.
  • ✅ The HTTPRoute attaches successfully to a Gateway listener referenced by parentRefs[].sectionName.
  • ✅ The chart's console.gateway block coexists cleanly with the per-listener gateway-mode TLSRoutes from PR #1447 (charts/redpanda: Gateway API TLSRoute support for external access) on the same Gateway resource. No allowedRoutes.kinds overlap (TLSRoute vs HTTPRoute are distinct kinds), and the Gateway status correctly partitions attached routes per listener (kafka: attachedRoutes=4, console-https: attachedRoutes=1).
  • ✅ cert-manager self-signed cert + certificateRefs on the listener is sufficient for the chart side; no additional Console-side TLS plumbing needed.
  • ✅ Routing works end-to-end through Envoy's HTTPS termination → Console upstream HTTP. 26,745 requests over 12.5 min, 0 5xx, p95=7.99 ms.

Note on the operator (CRD) path of this PR: I also attempted the v2 operator path (Console CR with spec.gateway) but hit a reconcile-time error because the operator's Console controller currently requires a Redpanda CR (Redpanda.cluster.redpanda.com "redpanda" not found), and my test deploys Redpanda via the Helm chart directly (no Redpanda CR). The chart-side validation above is independent and covers the HTTPRoute rendering logic in this PR. A separate validation pass with the operator-managed Redpanda CR would round out the v2 path; happy to add if useful.

Reproducible artifacts

All manifests + the k6 script are committed in eks-api-gateway-tls/ of the test harness repo. The additions for this run:

  • manifests/07-console-gateway-listener.yaml — HTTPS sibling listener on the same Gateway.
  • manifests/08-console-cert.yaml — cert-manager Issuer + Certificate.
  • manifests/10-k6-console-load.yaml — in-cluster k6 Job + script ConfigMap.

Raw artifacts (in test-harness repo)

  • results/2026-05-19-tandem/k6-stdout.log — full k6 run output incl. the summary panel.
  • results/2026-05-19-tandem/omb-results.json — concurrent OMB run for the tandem context.

🤖 Generated with Claude Code

david-yu added a commit that referenced this pull request May 19, 2026
The earlier shape — registering only the chart's lightweight TLSRoute
kind via AddKnownTypeWithName — left the v1alpha2 ListOptions / List
kinds missing from the operator's scheme. controller-runtime's
reflector calls List on every Watch, so every reconcile pass that
included TLSRoute resources errored with:

  Failed to watch *redpanda.TLSRoute:
  no kind "ListOptions" is registered for version
  "gateway.networking.k8s.io/v1alpha2"

Fix: switch the chart's Types() entry to the upstream
`gatewayv1alpha2.TLSRoute` and register `gatewayv1alpha2.Install` on
the operator's v2 scheme. The chart still renders the same wire bytes
via the lightweight struct returned by TLSRoutes(); only the type the
controller cache binds to changes.

Surfaced during the tandem PR #1329 + #1447 e2e on EKS 1.34. See:
#1447 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@david-yu david-yu force-pushed the 1308-add-gateway-support-to-console-helm-chart branch from d9c1cc4 to 2298c12 Compare May 19, 2026 17:41
david-yu added a commit that referenced this pull request May 19, 2026
… and CRD

Closes #1308. Squashed rebase of the original 14-commit branch onto
current main; drops stale golden-file drift and a review-cycle
import-reordering revert that had accumulated through iteration.

Chart side (`charts/console/`):
  - New `gateway` values block alongside `ingress`. Mutual exclusion
    enforced in render: enabling both fails with a clear error.
  - `gateway.go` renders a single HTTPRoute attached to the supplied
    parent Gateway(s) via `parentRefs` + optional `sectionName`.
  - Notes template shows the Gateway URL when gateway is enabled (and
    the Ingress URL when not).
  - Tests: mutual-exclusion, gateway-only, gateway→ingress switch,
    gateway removal scenarios.

Operator side (`operator/`):
  - `Console` CRD gains a `spec.gateway` field with the same shape as
    the chart's values; goverter conversion auto-generated.
  - V2 scheme registers `gatewayv1` so the Console reconciler can
    watch HTTPRoutes.
  - RBAC adds `gateway.networking.k8s.io/httproutes` perms.
  - Console controller's `SetupWithManager` skips the HTTPRoute watch
    if the Gateway API CRDs aren't installed in the cluster (graceful
    degradation; same pattern used for ServiceMonitor).

Bumps `sigs.k8s.io/gateway-api` to v1.5.1 (workspace-wide).

Validated end-to-end on EKS 1.34 in tandem with PR #1447's TLSRoute
support; both routes coexisted cleanly on the same Envoy Gateway. See:
#1329 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@david-yu david-yu force-pushed the 1308-add-gateway-support-to-console-helm-chart branch from 2298c12 to daaca60 Compare May 19, 2026 19:25
david-yu added a commit that referenced this pull request May 19, 2026
Closes #1361. Squashed rebase of the original 14-commit branch onto
current main; consolidates the iterative CI/lint fixes and includes
the v1alpha2 scheme registration fix surfaced during the tandem
PR #1329 + #1447 e2e test on EKS 1.34.

Design:
  - User brings their own Gateway (TLSRoute-capable, e.g. Envoy
    Gateway). The chart only manages TLSRoute + ClusterIP backend
    services.
  - Per-listener `gateway: true` opt-in enables gradual migration.
    Traditional NodePort/LoadBalancer listeners and TLSRoute listeners
    coexist on different ports.
  - SNI-based routing: each broker gets a unique hostname via
    `host` / `hostTemplate` per listener.
  - Bootstrap TLSRoute handles initial client connections; per-broker
    TLSRoutes handle direct broker connections after metadata
    discovery.

Chart side (`charts/redpanda/`):
  - `external.gateway` block with `enabled`, `parentRefs`,
    `advertisedPort`.
  - Per-listener `gateway`, `host`, `hostTemplate` fields on
    `listeners.{kafka,http,admin,schemaRegistry}.external.*`.
  - `tlsroute.go` renders TLSRoute resources (bootstrap + per-broker)
    with proper SNI hostnames.
  - `service.gateway.go` renders ClusterIP backend services.
  - LoadBalancer / NodePort service rendering skips gateway-opted
    listeners so they coexist on different ports.
  - `secrets.go` constructs the per-listener gateway-aware advertised
    address.

Operator side (`operator/`):
  - `Redpanda` CRD gains the `external.gateway` and per-listener
    fields; goverter conversion auto-generated.
  - V2 scheme registers `gatewayv1alpha2` (TLSRoute + TLSRouteList +
    ListOptions) so the controller-runtime cache can List/Watch the
    chart-rendered TLSRoute resources. The chart's lightweight
    TLSRoute struct stays for gotohelm rendering; the type the
    operator watches via `Types()` is the upstream
    `gatewayv1alpha2.TLSRoute`.
  - RBAC adds `gateway.networking.k8s.io/tlsroutes` perms.

Validated end-to-end on EKS 1.34 with Envoy Gateway v1.2.6, TLS
Passthrough mode, OMB at 10 Mbps + Console k6 in tandem:
#1447 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@david-yu david-yu force-pushed the 1308-add-gateway-support-to-console-helm-chart branch from daaca60 to 54e6d17 Compare May 19, 2026 19:35
@david-yu david-yu force-pushed the 1308-add-gateway-support-to-console-helm-chart branch from 54e6d17 to b6d0d18 Compare May 19, 2026 19:59
@david-yu david-yu force-pushed the 1308-add-gateway-support-to-console-helm-chart branch from b6d0d18 to 42a0e50 Compare May 19, 2026 20:21
@david-yu david-yu force-pushed the 1308-add-gateway-support-to-console-helm-chart branch from 42a0e50 to cdc9b28 Compare May 19, 2026 20:39
@david-yu
Contributor Author

V2 operator path validation — closing out the open caveat

Following up on the prior tandem comment, which validated the chart-side HTTPRoute path and noted the v2 operator path was still open. Validated today (2026-05-19) on a fresh EKS 1.34 cluster: the v2 path works end-to-end.

Stack

| Piece | Value |
| --- | --- |
| EKS | 1.34, us-west-2 (rp-gw-eks134) |
| Operator image | 605419575229.dkr.ecr.us-west-2.amazonaws.com/redpanda-operator-pr1329-1447:f6083b79-v8 (integration of PR #1329 + #1447 + scheme fix below) |
| Redpanda | Managed by the operator via a Redpanda CR (not helm-direct), so Console.spec.cluster.clusterRef.name=redpanda resolves |
| Gateway | Envoy Gateway v1.2.6, TLS:9094 (Passthrough, Kafka) + HTTPS:443 (Terminate, Console) on the same redpanda-gateway |

Console CR (this PR's V2 path)

apiVersion: cluster.redpanda.com/v1alpha2
kind: Console
metadata:
  name: my-console
  namespace: redpanda
spec:
  cluster:
    clusterRef:
      name: redpanda
  gateway:
    enabled: true
    parentRefs:
      - name: redpanda-gateway
        sectionName: console-https
    hostnames:
      - console.example.com
    path: /
    pathType: PathPrefix

The Console reconciler from this PR rendered a single HTTPRoute owned by the Console CR:

$ kubectl -n redpanda get httproute
NAME                 HOSTNAMES                 AGE
my-console-console   ["console.example.com"]   4h13m

$ kubectl -n redpanda get httproute my-console-console \
  -o jsonpath='{range .status.parents[*]}{.parentRef.name}/{.parentRef.sectionName} accepted={.conditions[?(@.type=="Accepted")].status}{"\n"}{end}'
redpanda-gateway/console-https accepted=True

$ kubectl -n redpanda get gateway redpanda-gateway -o json | jq '.status.listeners[]'
  kafka: attachedRoutes=4         # TLSRoutes from PR #1447
  console-https: attachedRoutes=1 # HTTPRoute from this PR's Console CR

What this validates (vs the chart-side run)

| | Chart-side (prior comment) | V2 operator path (this run) |
| --- | --- | --- |
| Console deployment | Helm chart console subchart | Console CR → operator rendered |
| HTTPRoute rendered by | Helm template (_console.gateway.tpl) | Console controller (charts/console/gateway.go) |
| cluster.clusterRef resolves? | n/a | ✅ Yes, the operator looks up the Redpanda CR |
| parentRefs[].sectionName honored | | |
| Coexists with PR #1447 TLSRoutes | | |
| Accepted=True | | |

Scheme-registration bug surfaced during V2 validation (fix on PR #1447)

The chart-side path doesn't iterate Types(), so the bug stayed hidden. The V2 operator's sync loop does — it matches rendered objects against Types() by Go type via reflect.TypeOf. PR #1447's fix to charts/redpanda/chart.go had switched Types() to return the upstream *gatewayv1alpha2.TLSRoute, but Render() returns the chart-local lightweight *redpanda.TLSRoute (gotohelm can't transpile the upstream type aliases). Result: every Redpanda reconcile errored with

.Render returned *redpanda.TLSRoute which isn't present in .Types
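
The mismatch can be reproduced in isolation. The sketch below is a simplified model of a type-membership check keyed on `reflect.TypeOf`, not the operator's actual sync loop; `upstreamTLSRoute`, `chartTLSRoute`, and `checkRendered` are hypothetical stand-ins. It shows that two structs with identical fields are still distinct Go types, so a rendered chart-local object is not found when only the upstream type is listed.

```go
package main

import (
	"fmt"
	"reflect"
)

// upstreamTLSRoute and chartTLSRoute stand in for gatewayv1alpha2.TLSRoute and
// the chart-local redpanda.TLSRoute: identical fields, distinct Go types.
type upstreamTLSRoute struct{ Name string }
type chartTLSRoute struct{ Name string }

// checkRendered returns an error for the first rendered object whose dynamic
// Go type does not appear in types.
func checkRendered(rendered, types []any) error {
	known := make(map[reflect.Type]bool, len(types))
	for _, t := range types {
		known[reflect.TypeOf(t)] = true
	}
	for _, obj := range rendered {
		if !known[reflect.TypeOf(obj)] {
			return fmt.Errorf(".Render returned %T which isn't present in .Types", obj)
		}
	}
	return nil
}

func main() {
	rendered := []any{&chartTLSRoute{Name: "bootstrap"}}

	// Only the upstream type is listed: the check fails, as in the reconcile.
	fmt.Println(checkRendered(rendered, []any{&upstreamTLSRoute{}}))

	// The chart-local type is listed: the check passes and prints <nil>.
	fmt.Println(checkRendered(rendered, []any{&chartTLSRoute{}}))
}
```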

Trying to register both types at the same v1alpha2 GVK panics with Double registration of different types. Registering only the chart-local kind without the upstream package also fails: no kind "ListOptions" is registered for version "gateway.networking.k8s.io/v1alpha2" on the cache's List call. And registering gatewayv1alpha2.TLSRouteList (the upstream List type) makes List succeed but its Items []TLSRoute field deserializes the upstream type, which then breaks GVKFor lookups in the sync loop.

Fix (will be backported to PR #1447's squash commit): define a chart-local TLSRouteList so the List type's Items slice stays typed against *redpanda.TLSRoute. Register both kinds at the v1alpha2 GVK alongside metav1.AddToGroupVersion for ListOptions — no upstream package needed.

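// charts/redpanda/tlsroute.go
import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// TLSRouteList is the chart-local list type. Its Items slice stays typed
// against the chart-local TLSRoute, not the upstream gatewayv1alpha2.TLSRoute.
type TLSRouteList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []TLSRoute `json:"items"`
}

// charts/redpanda/chart.go
import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/runtime/schema"
)

// addTLSRouteToScheme registers the chart-local kinds under
// gateway.networking.k8s.io/v1alpha2. AddToGroupVersion adds ListOptions and
// the other shared meta types the cache needs for List/Watch.
func addTLSRouteToScheme(s *runtime.Scheme) {
    gv := schema.GroupVersion{Group: "gateway.networking.k8s.io", Version: "v1alpha2"}
    s.AddKnownTypeWithName(gv.WithKind("TLSRoute"), &TLSRoute{})
    s.AddKnownTypeWithName(gv.WithKind("TLSRouteList"), &TLSRouteList{})
    metav1.AddToGroupVersion(s, gv)
}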

Once this lands in PR #1447, V2 Redpanda CR reconciles run cleanly and the V2 Console CR path becomes usable on top of it.

10 Mbps OMB on the TLSRoute (concurrent with the Console HTTPRoute up and serving)

Same workload shape as the prior tandem run: 1 topic × 12 partitions, 1 KiB records, 4 producers / 4 consumers, target 1,250 msg/s, 2 min warmup + 10 min steady-state.

Kafka throughput (steady-state)

| Metric | Value |
| --- | --- |
| Produced msg/s (avg) | 1,250.0 |
| Produced MB/s (avg) | 1.28 (≈10 Mbps as targeted) |
| Consumed msg/s (avg) | 1,250.0 |
| Backlog peak | 1 |
| Produce errors | 0 |

Publish latency — steady-state aggregate

| Percentile | V2 path (this run) | Chart-direct (prior 10 Mbps tandem run) |
| --- | --- | --- |
| avg | 4.63 ms | 4.65 ms |
| p50 | 4.27 ms | 4.39 ms |
| p75 | 5.16 ms | 5.32 ms |
| p95 | 6.83 ms | 6.85 ms |
| p99 | 8.72 ms | 8.74 ms |
| p99.9 | 18.00 ms | 14.22 ms |
| p99.99 | 21.54 ms | 17.02 ms |

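Through p99 the two runs agree to within 0.16 ms, which is within run-to-run noise. The tail is higher on the V2 path: p99.9 rose by about 3.8 ms (14.22 → 18.00 ms) and p99.99 by about 4.5 ms (17.02 → 21.54 ms), plausibly from the extra reconcile overhead of the V2 operator continuously managing the StatefulSet, but the tail remains well within bounds. The run recorded no produce errors and a backlog peak of 1.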

End-to-end latency — aggregate quantiles

| Percentile | Value (ms) |
| --- | --- |
| p50 | 4.00 |
| p95 | 7.00 |
| p99 | 9.00 |
| p99.9 | 63.00 |
| p99.99 | 342.00 |

Raw artifacts (in eks-api-gateway-tls/results/2026-05-19-v2-tandem/)

  • omb-results.json — full OMB output
  • omb-stdout.log — OMB pod stdout

🤖 Generated with Claude Code

david-yu added a commit that referenced this pull request May 20, 2026
Closes #1361. Squashed rebase of the original 14-commit branch onto
current main; consolidates the iterative CI/lint fixes and includes
the v1alpha2 scheme registration fix surfaced during the tandem
PR #1329 + #1447 e2e test on EKS 1.34.

Design:
  - User brings their own Gateway (TLSRoute-capable, e.g. Envoy
    Gateway). The chart only manages TLSRoute + ClusterIP backend
    services.
  - Per-listener `gateway: true` opt-in enables gradual migration.
    Traditional NodePort/LoadBalancer listeners and TLSRoute listeners
    coexist on different ports.
  - SNI-based routing: each broker gets a unique hostname via
    `host` / `hostTemplate` per listener.
  - Bootstrap TLSRoute handles initial client connections; per-broker
    TLSRoutes handle direct broker connections after metadata
    discovery.

Chart side (`charts/redpanda/`):
  - `external.gateway` block with `enabled`, `parentRefs`,
    `advertisedPort`.
  - Per-listener `gateway`, `host`, `hostTemplate` fields on
    `listeners.{kafka,http,admin,schemaRegistry}.external.*`.
  - `tlsroute.go` renders TLSRoute resources (bootstrap + per-broker)
    with proper SNI hostnames.
  - `service.gateway.go` renders ClusterIP backend services.
  - LoadBalancer / NodePort service rendering skips gateway-opted
    listeners so they coexist on different ports.
  - `secrets.go` constructs the per-listener gateway-aware advertised
    address.

Operator side (`operator/`):
  - `Redpanda` CRD gains the `external.gateway` and per-listener
    fields; goverter conversion auto-generated.
  - V2 scheme registers `gatewayv1alpha2` (TLSRoute + TLSRouteList +
    ListOptions) so the controller-runtime cache can List/Watch the
    chart-rendered TLSRoute resources. The chart's lightweight
    TLSRoute struct stays for gotohelm rendering; the type the
    operator watches via `Types()` is the upstream
    `gatewayv1alpha2.TLSRoute`.
  - RBAC adds `gateway.networking.k8s.io/tlsroutes` perms.

Validated end-to-end on EKS 1.34 with Envoy Gateway v1.2.6, TLS
Passthrough mode, OMB at 10 Mbps + Console k6 in tandem:
#1447 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

Add Gateway API support in Console Helm chart (alongside Ingress)
