Background
The SmartEM Agent runs on Windows EPU workstations alongside microscopes, outside the k8s cluster that hosts smartem-decisions. It needs to reach the backend API to:
Read and write acquisition data over REST (CRUD on sessions, gridsquares, foilholes, etc.)
Receive ML recommendations via SSE
Authenticate with Keycloak using the SmartEM_Agent confidential client (client-credentials grant, per smartem-decisions#284)
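As a rough illustration of the client-credentials step above, the sketch below builds (but does not send) the token request an agent might make. The token path is Keycloak's standard OpenID Connect endpoint; the base URL, realm name, and secret are hypothetical placeholders, and `build_token_request` is an illustrative helper, not part of the agent's actual code.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(
    keycloak_base: str, realm: str, client_id: str, client_secret: str
) -> Request:
    """Build an OAuth2 client-credentials token request without sending it."""
    # Standard Keycloak OIDC token endpoint for a realm.
    token_url = (
        f"{keycloak_base.rstrip('/')}/realms/{realm}"
        "/protocol/openid-connect/token"
    )
    form = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return Request(
        token_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical values: substitute the real Keycloak host, realm and secret.
    req = build_token_request(
        "https://keycloak.example.org", "example-realm", "SmartEM_Agent", "change-me"
    )
    print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` and reading `access_token` from the JSON response would yield the bearer token the agent attaches to its REST and SSE calls.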
Current state
Development (k3s): the agent reaches the backend via the smartem-http-api-service NodePort 30080 (http://<node-ip>:30080). Works only because the dev cluster lives on the same network as the developer machine.
Staging / production: no defined story. The frontend k8s manifests landing in smartem-devtools#205 (feat(k8s): deploy smartem-frontend across dev/staging/production) cover browser traffic only. The SPA pod's own nginx reverse-proxies /api/ to smartem-http-api-service inside the pod, so the existing frontend ingress is not a route the agent can use from outside the cluster.
What needs deciding
A deployment-friendly story for the agent's outside-cluster connectivity to the backend in non-dev environments. Sketch of the option space:
Option A — Separate backend ingress
New k8s/environments/{staging,production}/smartem-http-api-ingress.yaml routing a dedicated host (e.g. smartem-api-staging.diamond.ac.uk / smartem-api.diamond.ac.uk) to smartem-http-api-service.
Pros: clean separation, agent connects to a stable, well-named host; independent failure domain from the frontend; sizing matches workload (SSE + bulk REST, not browser navigations).
Cons: extra TLS cert, extra DNS record, extra ingress rule.
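To make Option A concrete, here is a minimal sketch that generates a `networking.k8s.io/v1` Ingress manifest as JSON (which `kubectl apply -f` accepts alongside YAML). The host and service name come from the description above; the ingress name, TLS secret name, and service port are hypothetical, and controller-specific settings (ingress class, SSE timeouts, buffering) are deliberately left out because the controller has not been decided here.

```python
import json


def backend_ingress(host: str, tls_secret: str, service_port: int) -> dict:
    """Return an Ingress manifest routing `host` to smartem-http-api-service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        # Name is an assumption derived from the proposed manifest filename.
        "metadata": {"name": "smartem-http-api-ingress"},
        "spec": {
            # TLS terminated at the ingress; the secret name is hypothetical.
            "tls": [{"hosts": [host], "secretName": tls_secret}],
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": "smartem-http-api-service",
                                        # Service port is an assumption.
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = backend_ingress(
        "smartem-api-staging.diamond.ac.uk", "smartem-api-staging-tls", 8000
    )
    print(json.dumps(manifest, indent=2))
```

Generating the staging and production variants from one function keeps the two manifests identical apart from host and secret, which is the main appeal of a dedicated backend host.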
Option B — Agent traffic via the frontend ingress
Reuse smartem-staging.diamond.ac.uk / smartem.diamond.ac.uk. Either (i) keep the SPA pod's nginx in the path, or (ii) add a second backend rule alongside / so the cluster ingress controller proxies /api/ to smartem-http-api-service directly.
Pros: one hostname, one cert, one ingress rule (variant ii also one fewer hop).
Cons (variant i): couples agent traffic to the SPA pod's nginx, intertwining failure modes; SPA pod sized for browser traffic, not N concurrent SSE streams. (variant ii): mixes user-facing and machine-facing traffic on the same name; same-origin is irrelevant for the agent (not browser-based).
Option C — LoadBalancer on the backend service
Set smartem-http-api-service.type: LoadBalancer in staging/production (or sit a MetalLB / on-prem LB in front).
Pros: simple, no ingress controller involved.
Cons: on-prem LB scarcity; no TLS termination by default; one LB IP per service.
Option D — Other
E.g. a service mesh, a per-microscope tunnel, or routing the agent through a relay. Probably not warranted for the current shape of the workload but worth a brief mention.
Constraints to factor in
Auth: agent uses SmartEM_Agent client-credentials against the DLS Keycloak realm. The backend already accepts tokens with azp: SmartEM_Agent (added to KEYCLOAK_ALLOWED_AZP in smartem-devtools#205). No CORS concerns since the agent isn't a browser.
TLS: agent traffic should be TLS-terminated at the ingress in non-dev environments. The agent does not need to live on the DLS internal network if a properly-secured public ingress is exposed.
SSE: the agent subscribes to ML recommendations via long-lived SSE streams. Whichever route is chosen must support that (ingress controller timeouts, response buffering off, keep-alives).
Scale: multiple agents per facility, each holding at least one SSE connection plus periodic REST traffic.
Locality: on-prem at DLS the agent and cluster will share the DLS network; the path can be much shorter than a public ingress. Worth deciding whether to design for a single deployment shape or two (DLS-internal vs federated facility).
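The SSE constraint above matters because keep-alives and unbuffered delivery are visible at the client. As a sketch of what the agent side has to cope with, the parser below turns a stream of text lines into events, skipping comment lines (those starting with `:`), which servers commonly send as keep-alives. The event name `recommendation` in the example is hypothetical, and this is an illustrative parser rather than the agent's actual implementation.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {'event', 'data'} dicts from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has any data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # Comment line, commonly used as a keep-alive.
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # A single leading space is stripped.
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
        # Other fields (id, retry) are ignored in this sketch.


if __name__ == "__main__":
    stream = [
        ": keep-alive\n",
        "event: recommendation\n",
        'data: {"a": 1}\n',
        "\n",
    ]
    for ev in parse_sse(stream):
        print(ev)
```

If an intermediate proxy buffers the response, events only reach a parser like this in bursts or not at all, and an idle timeout shorter than the keep-alive interval drops the stream. Both are worth testing against whichever route is chosen.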
Related
smartem-decisions#284 — agent auth strategy (closed: Keycloak client-credentials with SmartEM_Agent)
smartem-devtools#205 — frontend k8s deploy (adds the frontend ingress; explicitly defers this agent connectivity story)
smartem-devtools#181 — broader k8s modernisation (Gateway API, Ingress, ClusterIP); overlapping scope, this issue is the narrower agent-specific slice
smartem-devtools#179 — staging/production manifests vs on-prem reality (where this lands in practice)
Out of scope
Implementing the choice. This issue is to decide; a follow-up tracks the manifest additions.