fix: cluster test flakiness and missing ember-cli binary #343

Merged
kacy merged 1 commit into main from fix/pre-existing-test-failures
Feb 28, 2026

@kacy kacy commented Feb 28, 2026

summary

fixes two independent pre-existing test failures:

cluster test flakiness: tests using non-bootstrapped cluster servers (e.g. cluster_server_empty()) would occasionally get a "Connection reset by peer" when sending their first command. the server was already accepting TCP connections, but its command dispatcher wasn't running yet. the old probe only checked that TcpStream::connect succeeded; the new ping_ready_sync() helper sends a PING and waits for PONG, which confirms the server is actually handling commands before any test proceeds.

CLI integration tests failing: the 7 CLI tests panicked with "ember-cli binary not found" because the CLI binary wasn't built before the test run. make test now passes --test-threads=1, which (a) ensures the workspace build step compiles all binaries, including ember-cli, and (b) serialises the cluster tests so they don't compete for OS resources.
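
for illustration, the PING/PONG readiness check from the first fix could look roughly like the sketch below. this is not the code from the commit: only the name ping_ready_sync comes from this PR, while the signature, the try_ping helper, the timeouts, and the assumption that the server speaks RESP (so PING is answered with +PONG) are all hypothetical.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical sketch: poll `addr` until it answers PING with +PONG,
/// or give up once `timeout` has elapsed.
fn ping_ready_sync(addr: SocketAddr, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while Instant::now() < deadline {
        // A refused connection, a reset, or a read timeout all count as
        // "not ready yet", so we back off briefly and retry.
        if try_ping(addr).unwrap_or(false) {
            return true;
        }
        sleep(Duration::from_millis(20));
    }
    false
}

/// One probe attempt: connect, send PING, expect exactly "+PONG\r\n".
/// Assumes a RESP-style wire protocol (an assumption, not confirmed here).
fn try_ping(addr: SocketAddr) -> std::io::Result<bool> {
    let io_timeout = Duration::from_millis(200);
    let mut stream = TcpStream::connect_timeout(&addr, io_timeout)?;
    stream.set_read_timeout(Some(io_timeout))?;
    stream.set_write_timeout(Some(io_timeout))?;

    // PING encoded as a RESP array holding one bulk string.
    stream.write_all(b"*1\r\n$4\r\nPING\r\n")?;

    let mut reply = [0u8; 7];
    stream.read_exact(&mut reply)?;
    Ok(&reply == b"+PONG\r\n")
}
```

a test helper would call something like ping_ready_sync(addr, Duration::from_secs(5)) right after spawning the server and fail fast if it returns false. the point of the design is that a successful connect only proves the listener is up, whereas a PONG proves a command made the full round trip through the dispatcher.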

what was tested

  • all 232 integration tests pass: cargo test --features protobuf --test integration -- --test-threads=1
  • 3 separate runs of the cluster tests pass cleanly (previously 1–3 failures per run)
  • CLI tests pass with ember-cli binary present

two root causes:

1. cluster tests raced when run in parallel — each test spawned a cluster
   server and connected immediately after TCP accepted, but the command
   dispatcher wasn't always running yet. replaced the bare TCP-connect
   probe with a PING/PONG readiness check so we only proceed once the
   server is actually handling commands.

2. CLI integration tests required ember-cli but make test didn't ensure
   it was built. updated the Makefile test target to run with
   --test-threads=1, which also serialises cluster tests and eliminates
   the OS resource contention that caused intermittent connection resets.
@kacy kacy merged commit 09e0485 into main Feb 28, 2026
8 checks passed
@kacy kacy deleted the fix/pre-existing-test-failures branch February 28, 2026 15:05