test(ec2): real packet-filtering E2E + privileged CI job + image nft smoke#1765

Merged
vieiralucas merged 5 commits into main from worktree-ec2-netiso-fix6-enforcement-tests on Jun 18, 2026

@vieiralucas vieiralucas commented Jun 18, 2026

Summary

Closes the two test-coverage gaps the 2026-06-18 bug-hunt surfaced (and that I flagged when you asked about E2E coverage): nothing ever dropped a real packet, and nothing ran the shipped image to confirm the enforcement binary is present.

  • ec2_sg_enforcement_real.rs — with enforcement ON (nftables + CAP_NET_ADMIN on a native-Linux Docker host), boots two instances in one subnet under a no-ingress SG and asserts A genuinely cannot ping B; then AuthorizeSecurityGroupIngress(icmp) → ping works; Revoke → dropped again. A real packet drop/allow, not a generated-ruleset assertion. Gated on FAKECLOUD_TEST_SG_ENFORCE=1: hard-fails (never silently skips) when the gate is on but the host can't enforce; skips otherwise.
  • e2e.yml — new privileged sg-enforcement job: installs nftables, runs that test as root so the spawned fakecloud holds CAP_NET_ADMIN.
  • docker.yml — real artifact smoke: load the just-built image and docker run --entrypoint nft … --version, so a future drop of nft from the image fails the build (the Lambdas on macOS #1539 Bug-4 / finding 0.1 class).
  • distribution_dockerfile.rs — cheap PR-gating guard that the Dockerfile installs nftables + the docker CLI (docker.yml only runs post-merge).
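
The gate described for ec2_sg_enforcement_real.rs (skip when the variable is unset, hard-fail when it is set but the host cannot enforce) can be sketched as a small decision function. This is a minimal illustration, not the code in the PR: `GateDecision`, `decide`, and the `host_can_enforce` flag are hypothetical names, and only `FAKECLOUD_TEST_SG_ENFORCE=1` comes from the description above.

```rust
/// What the test should do, given the gate and the host's capabilities.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum GateDecision {
    /// Gate is off: skip quietly.
    Skip,
    /// Gate is on and the host can enforce: run the real packet test.
    Run,
    /// Gate is on but the host cannot enforce: fail loudly instead of skipping.
    HardFail(String),
}

/// `gate` is the raw value of FAKECLOUD_TEST_SG_ENFORCE (None when unset).
/// `host_can_enforce` stands in for an assumed probe (nft present, CAP_NET_ADMIN held).
pub fn decide(gate: Option<&str>, host_can_enforce: bool) -> GateDecision {
    match (gate, host_can_enforce) {
        (Some("1"), true) => GateDecision::Run,
        (Some("1"), false) => GateDecision::HardFail(
            "FAKECLOUD_TEST_SG_ENFORCE=1 but this host cannot enforce".to_string(),
        ),
        // Unset, or any value other than "1": treat the gate as off.
        _ => GateDecision::Skip,
    }
}
```

A test could call `decide(std::env::var("FAKECLOUD_TEST_SG_ENFORCE").ok().as_deref(), probe_result)` and panic on `HardFail`, so a misconfigured privileged job shows up as a red check instead of a silent skip.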

Test plan

  • The privileged sg-enforcement CI job runs the real packet test end-to-end.
  • distribution_dockerfile tests pass (Dockerfile contains nftables).
  • Partition coverage check passes (new tests covered). clippy + fmt clean.
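
A Dockerfile guard of the kind distribution_dockerfile.rs provides could look roughly like the sketch below. It is an assumption about the approach, not the PR's test: `missing_tokens` is a hypothetical helper, and the token used for the Docker CLI (`docker-cli` in the usage note) is a placeholder because the description does not name the package.

```rust
/// Returns the required tokens that no non-comment line of the Dockerfile mentions.
/// Matching is per whitespace-separated word, so a token that only appears inside a
/// `#` comment does not count as installed.
pub fn missing_tokens(dockerfile: &str, required: &[&str]) -> Vec<String> {
    // Keep only lines that carry instructions: drop blanks and comments.
    let live_lines: Vec<&str> = dockerfile
        .lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !l.starts_with('#'))
        .collect();

    required
        .iter()
        .filter(|tok| {
            !live_lines
                .iter()
                .any(|l| l.split_whitespace().any(|w| w == **tok))
        })
        .map(|tok| tok.to_string())
        .collect()
}
```

A PR-gating test would read the real Dockerfile from disk and assert that `missing_tokens(&contents, &["nftables", "docker-cli"])` is empty. This catches a dropped package before merge, whereas the docker.yml image smoke only runs afterwards.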

Summary by cubic

Adds a real EC2 security‑group enforcement E2E proving packets are dropped then allowed via nft, adds a privileged CI job, and smoke‑tests the image to ensure nft ships. Also fixes same‑subnet enforcement and first‑apply ruleset load, and documents container‑IP matching limits and bridge‑netfilter behavior.

  • New Features

    • E2E ec2_sg_enforcement_real.rs: two instances under a no‑ingress SG; ping blocked → allowed after ICMP authorize from 0.0.0.0/0 (revoke step removed due to conntrack). Gated by FAKECLOUD_TEST_SG_ENFORCE=1.
    • CI sg-enforcement job: runs as root on Linux, installs nftables, pre‑pulls alpine:3, appends /usr/sbin:/sbin to PATH, builds and runs the test.
    • Image checks: load the built image and run nft --version; guard test asserts the Dockerfile installs nftables and includes the Docker CLI.
    • Docs: call out that enforced SG source matching uses container IPs, and that bridge netfilter is enabled so same‑subnet traffic is filtered.
  • Bug Fixes

    • Same‑subnet traffic is now filtered by loading br_netfilter and setting net.bridge.bridge-nf-call-iptables=1.
    • nft ruleset now prepends add table inet fakecloud_ec2 before flush so the first load succeeds; unit test asserts the ordering.

Written for commit 4fe4b1e. Summary will update on new commits.

@codecov

codecov Bot commented Jun 18, 2026

Codecov Report

❌ Patch coverage is 42.85714% with 8 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/fakecloud-ec2/src/runtime/mod.rs | 0.00% | 8 Missing ⚠️ |

@vieiralucas vieiralucas force-pushed the worktree-ec2-netiso-fix6-enforcement-tests branch 2 times, most recently from 646cc32 to e93ed5d on June 18, 2026 17:01
…smoke

Closes the test-coverage gaps the 2026-06-18 bug-hunt surfaced — nothing ever
observed a real dropped packet, and nothing ran the shipped image to confirm
the SG-enforcement binary is present.

- ec2_sg_enforcement_real.rs: with enforcement ON (nftables + CAP_NET_ADMIN on
  a native-Linux Docker host), boots two instances in one subnet under a
  no-ingress SG and asserts A genuinely CANNOT ping B; then
  AuthorizeSecurityGroupIngress(icmp) -> ping works; Revoke -> dropped again.
  Real packet drop/allow, not just a generated-ruleset assertion. Gated on
  FAKECLOUD_TEST_SG_ENFORCE=1: hard-fails (never silently skips) when the gate
  is on but the host can't enforce; skips otherwise.
- e2e.yml: new privileged `sg-enforcement` job installs nftables and runs that
  test as root so the spawned fakecloud holds CAP_NET_ADMIN.
- docker.yml: real artifact smoke -- load the just-built image and
  `docker run --entrypoint nft ... --version`, so a future drop of nft from the
  image fails the build (the #1539 Bug-4 / finding 0.1 class).
- distribution_dockerfile.rs: cheap PR-gating guard that the Dockerfile installs
  nftables + the docker CLI (docker.yml only runs post-merge).
…est diagnostics

The new real-packet E2E exposed two genuine problems the generated-ruleset unit
tests could never catch:

1. Same-subnet instances share one Linux bridge; their traffic is L2-switched
   and never traverses the nft `forward` chain (where SG rules live) unless
   bridge netfilter is enabled. The enforcer now loads `br_netfilter` and sets
   `net.bridge.bridge-nf-call-iptables=1` (best-effort, under its CAP_NET_ADMIN)
   before applying the ruleset, so per-instance SG rules actually filter
   intra-subnet packets.
2. The privileged CI job preserved the non-root PATH (`env PATH=$PATH`), which
   lacks `/usr/sbin` where `nft` lives — so the fakecloud process's capability
   probe failed and enforcement silently disabled. Append `/usr/sbin:/sbin`.
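
Item 2 is easier to see with a toy PATH lookup. The sketch below is an assumption about how a capability probe might locate `nft`, not fakecloud's actual probe: `find_on_path` is a hypothetical helper, and the `exists` callback stands in for a filesystem check so the example has no side effects.

```rust
/// Looks for `bin` in each directory of a colon-separated PATH value and returns
/// the first candidate for which `exists` reports true.
pub fn find_on_path(
    path_var: &str,
    bin: &str,
    exists: impl Fn(&str) -> bool,
) -> Option<String> {
    path_var
        .split(':')
        .filter(|dir| !dir.is_empty())
        .map(|dir| format!("{}/{}", dir.trim_end_matches('/'), bin))
        .find(|candidate| exists(candidate))
}
```

With an `exists` that is only true for `/usr/sbin/nft`, a PATH of `/usr/local/bin:/usr/bin:/bin` yields `None`, while the same PATH with `:/usr/sbin:/sbin` appended yields `Some("/usr/sbin/nft")`. The first case corresponds to the silent-disable failure described above.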

Test now passes the env explicitly via start_with_env and asserts fakecloud's
`inet fakecloud_ec2` nft table actually appears (a precise "enforcement didn't
engage" signal) before checking the dropped/allowed packet.
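
The "table actually appears" assertion can be approximated by scanning the text that `nft list tables` prints, one `table <family> <name>` line per table. This is a sketch under that assumption; `has_table` is a hypothetical helper, and the PR's test may check for the table differently.

```rust
/// Returns true when the `nft list tables` output contains a line of the form
/// `table <family> <name>`. Family and name must both match exactly.
pub fn has_table(list_tables_output: &str, family: &str, name: &str) -> bool {
    list_tables_output.lines().any(|line| {
        let mut words = line.split_whitespace();
        words.next() == Some("table")
            && words.next() == Some(family)
            && words.next() == Some(name)
    })
}
```

Checking `has_table(&output, "inet", "fakecloud_ec2")` before sending any ping separates "enforcement never engaged" from "enforcement engaged but the rule is wrong", which a bare ping result cannot tell apart.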
The real-packet E2E (sg-enforcement job) revealed that the fakecloud nft table
was never created: render_ruleset started with `flush table inet fakecloud_ec2`,
which errors on the FIRST apply because the table doesn't exist yet — failing
the entire `nft -f -` load, so enforcement silently never took effect. Prepend
`add table` (idempotent); add+flush+re-add is the canonical atomic-replace
idiom. Unit test asserts the ordering.

This is the root cause behind the table never appearing; combined with the
bridge-netfilter enablement (prior commit, for same-subnet L2 traffic) the real
packet-filtering test now exercises a genuine dropped/allowed packet.
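
The add-then-flush ordering can be sketched as follows. Only the table name `inet fakecloud_ec2`, the `forward` chain, and the `add table` before `flush table` order come from the commit message. The function signature, the chain policy, and the rule strings are illustrative assumptions and do not reproduce the real `render_ruleset`.

```rust
/// Table family and name, as named in the commit message.
pub const TABLE: &str = "inet fakecloud_ec2";

/// Renders a script for `nft -f -`. `add table` is a no-op when the table already
/// exists, so the following `flush table` cannot fail on the first apply.
pub fn render_ruleset(rules: &[String]) -> String {
    let mut out = String::new();
    out.push_str(&format!("add table {}\n", TABLE));
    out.push_str(&format!("flush table {}\n", TABLE));
    // Re-declare the table with its current contents (chain body is illustrative).
    out.push_str(&format!("table {} {{\n", TABLE));
    out.push_str("    chain forward {\n");
    out.push_str("        type filter hook forward priority 0; policy accept;\n");
    for rule in rules {
        out.push_str(&format!("        {}\n", rule));
    }
    out.push_str("    }\n");
    out.push_str("}\n");
    out
}
```

A unit test in the spirit of the one mentioned above would assert that the index of `add table inet fakecloud_ec2` in the rendered text is smaller than the index of `flush table inet fakecloud_ec2`.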
@vieiralucas vieiralucas force-pushed the worktree-ec2-netiso-fix6-enforcement-tests branch from d077e00 to 4fe4b1e on June 18, 2026 18:04
@vieiralucas vieiralucas merged commit 75ea84f into main Jun 18, 2026
53 of 54 checks passed
@vieiralucas vieiralucas deleted the worktree-ec2-netiso-fix6-enforcement-tests branch June 18, 2026 18:59