feat(ec2): k8s NetworkPolicy enforcement for security groups (#1745 phase 4) #1757
Open
vieiralucas wants to merge 4 commits into
Backing containers all shared the default bridge, so instances in different
VPCs could reach each other and there was no L3 segmentation. Attach each
instance's container to a per-subnet daemon network instead:
- RunInstances computes an `InstanceNetwork { subnet_id, internal }` from the
resolved subnet (internal = the subnet has no `0.0.0.0/0 -> igw` route) and
passes it to the runtime.
- The Docker backend ensures `fakecloud-subnet-<id>` exists (idempotent;
`--internal` for private subnets), labels it `fakecloud-subnet=<id>` plus the
shared `fakecloud-instance` ownership label so the startup reaper prunes it,
and attaches the container with `--network`.
- Same-subnet instances share a bridge and can talk; different VPCs/subnets get
different bridges and cannot route to each other. Network creation is
best-effort: on failure the instance still boots on the default bridge (no
regression vs metadata-only).
- k8s pods keep their flat network (isolation there is a NetworkPolicy concern,
phase 4). Subnet placement is captured in the runtime record so persisted
instances recover onto the same network after a restart, and so phase-5
introspection can report the backing network.
Tests: e2e (Docker-gated, hard-fails in CI) proving same-subnet reachability,
cross-VPC isolation (ping passes/fails accordingly), and that private subnets
back onto `--internal` networks while public/default subnets do not.
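A minimal sketch of the subnet-to-network mapping described above, assuming a hypothetical `Route` type and hypothetical helper names (`instance_network`, `network_name`, `network_create_args`). Only `InstanceNetwork { subnet_id, internal }`, the `fakecloud-subnet-<id>` name, and the two labels come from the commit message. The empty value on the `fakecloud-instance` label is an assumption.

```rust
/// Mirrors the `InstanceNetwork { subnet_id, internal }` value named in the
/// commit message; the field types are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InstanceNetwork {
    pub subnet_id: String,
    pub internal: bool,
}

/// Hypothetical route-table entry: destination CIDR plus target id
/// (for example "igw-123" or "local").
pub struct Route {
    pub destination_cidr: String,
    pub target: String,
}

/// A subnet is internal when it has no `0.0.0.0/0 -> igw` route.
pub fn instance_network(subnet_id: &str, routes: &[Route]) -> InstanceNetwork {
    let has_igw_default = routes
        .iter()
        .any(|r| r.destination_cidr == "0.0.0.0/0" && r.target.starts_with("igw-"));
    InstanceNetwork {
        subnet_id: subnet_id.to_string(),
        internal: !has_igw_default,
    }
}

/// Name of the per-subnet daemon network: `fakecloud-subnet-<id>`.
pub fn network_name(net: &InstanceNetwork) -> String {
    format!("fakecloud-subnet-{}", net.subnet_id)
}

/// Arguments for `docker network create`; `--internal` only for private subnets.
pub fn network_create_args(net: &InstanceNetwork) -> Vec<String> {
    let mut args = vec!["network".to_string(), "create".to_string()];
    if net.internal {
        args.push("--internal".to_string());
    }
    args.push("--label".to_string());
    args.push(format!("fakecloud-subnet={}", net.subnet_id));
    // Ownership label used by the startup reaper; the empty value is an assumption.
    args.push("--label".to_string());
    args.push("fakecloud-instance".to_string());
    args.push(network_name(net));
    args
}
```

The container would then be started with `--network` set to the value of `network_name`. Treating creation failure as non-fatal (falling back to the default bridge) is left to the caller in this sketch.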
The per-subnet network arg added for phase-2 changed run_instance's signature; the feature-gated k8s integration test (only compiled in the kind CI job) still called the 3-arg form and would fail to compile there.
Phase 2 isolates subnets at L3 but does nothing within a subnet -- SG/NACL rules still block no traffic. Add a network-driver abstraction that translates the SG/NACL model into an nftables ruleset and applies it on the host, scoped to fakecloud's per-subnet bridges.
- runtime::firewall: pure, exhaustively unit-tested renderer turning a per-subnet model (instances + their flattened SG ingress/egress + subnet NACL denies) into an `inet fakecloud_ec2` nft table -- stateful (established,related accept), per-instance allow-then-default-deny, NACL denies first. Protocols/ports/CIDRs/icmp/referenced-groups all handled.
- FirewallEnforcer: the driver. nftables when capable, else a degraded no-op. Enforcement is opt-in via FAKECLOUD_EC2_SG_ENFORCEMENT and capability-gated by an `nft list ruleset` probe; when requested but unbacked (CI, Docker Desktop, rootless podman) it warns once and degrades to metadata-only -- phase-2 isolation still holds, no regression. Apply is an atomic `nft -f -` swap of fakecloud's own table (never touches docker's rules).
- service::firewall_model: builds the model across every account partition (the host nft table is global) -- referenced security groups expand to member /32s so the default SG's allow-from-self works; only running, subnet-placed instances are enforced.
- Re-applied on RunInstances (once up), Start/Stop/Terminate, Authorize/Revoke ingress/egress, and network-ACL entry/association edits. All reconciles are background + skipped entirely when enforcement is disabled (the default). k8s keeps a disabled enforcer (isolation there is NetworkPolicy, phase 4).
Tests: 11 new unit tests for ruleset rendering + model building; a Docker-gated e2e proving the degrade path (enforcement requested, no NET_ADMIN -> instances still boot and same-subnet reachability is unchanged).
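The renderer's shape can be illustrated with a minimal sketch, ingress only. Every type and function name here (`Proto`, `AllowRule`, `Instance`, `SubnetModel`, `render`) is hypothetical, and the chain name, hook, and priority are assumptions. Only the `inet fakecloud_ec2` table, the `established,related` accept, NACL-denies-first ordering, and per-instance allow-then-default-deny come from the commit message.

```rust
#[derive(Clone, Copy)]
pub enum Proto {
    All,
    Tcp,
    Udp,
    Icmp,
}

/// One flattened SG ingress allow. `ports` is an inclusive range; None = all ports.
pub struct AllowRule {
    pub proto: Proto,
    pub cidr: String,
    pub ports: Option<(u16, u16)>,
}

pub struct Instance {
    pub ip: String,
    pub ingress: Vec<AllowRule>,
}

pub struct SubnetModel {
    pub nacl_deny_cidrs: Vec<String>,
    pub instances: Vec<Instance>,
}

/// Protocol/port part of an nft rule, with a leading space (empty = any protocol).
fn proto_match(rule: &AllowRule) -> String {
    let name = match rule.proto {
        Proto::All => return String::new(),
        Proto::Icmp => return " ip protocol icmp".to_string(),
        Proto::Tcp => "tcp",
        Proto::Udp => "udp",
    };
    match rule.ports {
        None => format!(" ip protocol {}", name),
        Some((lo, hi)) if lo == hi => format!(" {} dport {}", name, lo),
        Some((lo, hi)) => format!(" {} dport {}-{}", name, lo, hi),
    }
}

/// Pure renderer: model in, nft script out. No I/O happens here.
pub fn render(model: &SubnetModel) -> String {
    let mut out = String::from("table inet fakecloud_ec2 {\n  chain forward {\n");
    // Hook and priority are assumptions; the PR does not state them.
    out.push_str("    type filter hook forward priority 0; policy accept;\n");
    // NACL denies come first (their position relative to the ct rule is an assumption).
    for cidr in &model.nacl_deny_cidrs {
        out.push_str(&format!("    ip saddr {} drop\n", cidr));
    }
    out.push_str("    ct state established,related accept\n");
    for inst in &model.instances {
        for rule in &inst.ingress {
            out.push_str(&format!(
                "    ip saddr {} ip daddr {}{} accept\n",
                rule.cidr,
                inst.ip,
                proto_match(rule)
            ));
        }
        // Default deny for anything else addressed to this instance.
        out.push_str(&format!("    ip daddr {} drop\n", inst.ip));
    }
    out.push_str("  }\n}\n");
    out
}
```

Keeping the renderer free of I/O is what makes it unit-testable without `nft` or NET_ADMIN; the capability probe and the `nft -f -` apply step are not shown in this sketch.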
…hase 4)
The Docker backend filters traffic with host nftables (phase 3). k8s Pods share a flat L3 network with no bridge to hook, so isolation there is expressed as NetworkPolicy objects and enforced by the cluster CNI -- if it enforces NetworkPolicy at all.
- runtime::netpolicy: pure, unit-tested translation of the shared per-instance SG model into one NetworkPolicy per instance (podSelector on the instance's `fakecloud-ec2` label; ingress/egress as ipBlock peers + TCP/UDP ports; referenced groups arrive as member /32s; all-protocols/ICMP ride as "all").
- CniDriver: pluggable CNI abstraction (Calico + Cilium known-enforcing, Unknown otherwise), detected from kube-system component names. Calico first, extendable. When the CNI doesn't enforce NetworkPolicy (e.g. kindnet) the policies are still created and a one-time startup warning is logged -- graceful degrade, never blocks Pod creation.
- fakecloud-k8s client: apply_network_policy (delete-then-create), prune_network_policies (reap policies for gone instances), kube_system pod listing for CNI detection.
- The phase-3 reconcile path now dispatches on backend: nftables for Docker, NetworkPolicies for k8s. The shared SG-flatten (InstanceRules) feeds both.
Tests: 5 unit tests for policy translation + CNI detection; a kind integration test (security_groups_become_network_policies) asserting policies are created with the right selector/rules and pruned when the instance is gone.
Also fixes the feature-gated k8s integration test's run_instance call to the phase-2 4-arg signature.
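CNI detection from `kube-system` component names might look like the following sketch. The `CniDriver` name and its three cases come from the commit message; the `detect` and `enforces_network_policy` method names and the `calico-` / `cilium-` name prefixes are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CniDriver {
    Calico,
    Cilium,
    Unknown,
}

impl CniDriver {
    /// Guess the CNI from kube-system pod names. Calico is checked first.
    /// The prefixes matched here are assumptions, not taken from the PR.
    pub fn detect<S: AsRef<str>>(kube_system_pod_names: &[S]) -> CniDriver {
        let has_prefix = |prefix: &str| {
            kube_system_pod_names
                .iter()
                .any(|name| name.as_ref().starts_with(prefix))
        };
        if has_prefix("calico-") {
            CniDriver::Calico
        } else if has_prefix("cilium-") {
            CniDriver::Cilium
        } else {
            CniDriver::Unknown
        }
    }

    /// Only the known-enforcing CNIs report true. For Unknown (e.g. kindnet)
    /// the caller still creates policies and logs a one-time warning.
    pub fn enforces_network_policy(self) -> bool {
        matches!(self, CniDriver::Calico | CniDriver::Cilium)
    }
}
```

Returning `Unknown` instead of an error keeps detection from ever blocking Pod creation, which matches the graceful-degrade behaviour described above.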
37592c0 to
5e4e1c2
Summary
Phase 4 of EC2 real network isolation (#1745). Stacked on #1756 (phase 3); base retargets as the stack merges.
The Docker backend filters with host nftables (phase 3). k8s Pods share a flat L3 network with no bridge to hook, so isolation there is expressed as NetworkPolicy objects, enforced by the cluster CNI.
- `runtime::netpolicy` -- pure, unit-tested translation of the shared per-instance SG model into one NetworkPolicy per instance: `podSelector` on the instance's `fakecloud-ec2` label; ingress/egress as `ipBlock` peers + TCP/UDP ports; referenced security groups arrive as member `/32`s; all-protocols/ICMP ride as "all ports".
- `CniDriver` -- pluggable CNI abstraction (Calico + Cilium known-enforcing, Unknown otherwise), detected from `kube-system` component names. Calico first, extendable. When the CNI doesn't enforce NetworkPolicy (e.g. kind's `kindnet`), policies are still created and a one-time startup warning is logged -- graceful degrade, never blocks Pod creation.
- `fakecloud-k8s` client -- `apply_network_policy` (delete-then-create), `prune_network_policies` (reap policies for terminated instances), `kube_system_pod_names` for CNI detection.
- The phase-3 reconcile path now dispatches on backend: nftables for Docker, NetworkPolicies for k8s. The shared SG-flatten (`InstanceRules`) feeds both.
Test plan
- Unit tests: policy translation (… `/32`, anywhere/all-proto) + CNI detection/enforcement.
- `security_groups_become_network_policies` (in the existing `k8s_integration` feature suite, run by the "K8s backend (kind)" job): a reconcile creates the NetworkPolicy with the right selector + ingress rule, and a reconcile with no instances prunes it.
- Fixes the feature-gated k8s integration test's `run_instance` call to the phase-2 4-arg signature (was only compiled in the kind job).
- `cargo test -p fakecloud-ec2 -p fakecloud-k8s`, clippy, fmt clean.
Real enforcement needs a NetworkPolicy-enforcing CNI; kindnet (CI) doesn't, so the kind test asserts policy creation + pruning while the translation correctness is unit-tested.
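To make the SG-to-NetworkPolicy translation concrete, here is a rough sketch that renders the ingress half of one policy as a YAML manifest. The `InstanceRules` name comes from the PR; its fields, the `SgRule` and `Proto` types, `render_policy`, the `fakecloud-ec2-<id>` policy name, and the use of the instance id as the label value are all assumptions. The real client presumably builds typed API objects instead of YAML text.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Proto {
    All,
    Tcp,
    Udp,
    Icmp,
}

/// One flattened ingress allow; referenced groups are assumed to be
/// already expanded to member /32 CIDRs. `port: None` = every port.
pub struct SgRule {
    pub proto: Proto,
    pub cidr: String,
    pub port: Option<u16>,
}

/// Hypothetical shape of the shared per-instance rule set.
pub struct InstanceRules {
    pub instance_id: String,
    pub ingress: Vec<SgRule>,
}

/// `ports:` fragment for one rule, or empty when the rule covers all ports.
fn ports_yaml(rule: &SgRule) -> String {
    let proto = match rule.proto {
        Proto::Tcp => "TCP",
        Proto::Udp => "UDP",
        // All-protocols and ICMP are expressed by omitting `ports`
        // ("all ports"), as the PR description states.
        Proto::All | Proto::Icmp => return String::new(),
    };
    match rule.port {
        Some(p) => format!("    ports:\n    - protocol: {}\n      port: {}\n", proto, p),
        None => format!("    ports:\n    - protocol: {}\n", proto),
    }
}

/// Render one NetworkPolicy (ingress only) selecting the instance's Pod.
pub fn render_policy(rules: &InstanceRules) -> String {
    let id = &rules.instance_id;
    let mut y = String::from("apiVersion: networking.k8s.io/v1\nkind: NetworkPolicy\n");
    y.push_str(&format!("metadata:\n  name: fakecloud-ec2-{}\n", id));
    y.push_str(&format!(
        "spec:\n  podSelector:\n    matchLabels:\n      fakecloud-ec2: {}\n",
        id
    ));
    y.push_str("  policyTypes:\n  - Ingress\n");
    if rules.ingress.is_empty() {
        // Selected by an Ingress policy with no rules: all ingress is denied.
        y.push_str("  ingress: []\n");
        return y;
    }
    y.push_str("  ingress:\n");
    for rule in &rules.ingress {
        y.push_str(&format!("  - from:\n    - ipBlock:\n        cidr: {}\n", rule.cidr));
        y.push_str(&ports_yaml(rule));
    }
    y
}
```

An instance with no ingress rules still gets a policy, so a NetworkPolicy-enforcing CNI would deny all inbound traffic to it, mirroring a security group with no ingress rules.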
Summary by cubic
Add Kubernetes NetworkPolicy support to enforce EC2 security groups in the k8s backend, creating one policy per instance with graceful CNI detection and fallback. Docker continues to use nftables; the control plane now dispatches by backend.
- `runtime::netpolicy`: pure translation from `InstanceRules` to one `NetworkPolicy` per instance (selector on the `fakecloud-ec2` label; `ipBlock` peers; TCP/UDP ports; referenced SGs as /32s; ICMP/all-proto as “all ports”).
- `CniDriver` with detection via `kube-system` components (Calico/Cilium enforce; unknown logs a startup warning). Policies are always created and never block Pod creation.
- `fakecloud-k8s` client with `apply_network_policy`, `prune_network_policies`, and `kube_system_pod_names`.
- … `network_isolation_enforced()` and prunes policies for terminated instances.
- … `run_instance` call signature.
Written for commit 118e3b5. Summary will update on new commits.
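The pruning step mentioned above reduces to a set difference between the policies that exist and the policies that live instances still need. A minimal sketch, assuming a hypothetical `fakecloud-ec2-<instance-id>` naming convention and hypothetical function names (the PR names only `prune_network_policies`, not how it selects policies):

```rust
use std::collections::HashSet;

/// Hypothetical naming convention for the per-instance policy.
pub fn policy_name(instance_id: &str) -> String {
    format!("fakecloud-ec2-{}", instance_id)
}

/// Names of existing fakecloud policies whose instance is no longer live.
/// Policies without the assumed prefix are never touched.
pub fn policies_to_prune(existing: &[String], live_instance_ids: &[String]) -> Vec<String> {
    let wanted: HashSet<String> = live_instance_ids
        .iter()
        .map(|id| policy_name(id))
        .collect();
    existing
        .iter()
        .filter(|name| name.starts_with("fakecloud-ec2-") && !wanted.contains(*name))
        .cloned()
        .collect()
}
```

With no live instances, every fakecloud-owned policy is returned for deletion, which is the case the kind integration test is described as exercising.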