Skip to content

Fix atomic commit: prevent silent data loss on merge failure (closes #29)#34

Merged
congwang-mk merged 5 commits into
mainfrom
fix-atomic-commit
May 23, 2026
Merged

Fix atomic commit: prevent silent data loss on merge failure (closes #29)#34
congwang-mk merged 5 commits into
mainfrom
fix-atomic-commit

Conversation

@congwang-mk
Copy link
Copy Markdown
Contributor

Summary

Fixes #29. BranchManager::commit previously swallowed copy errors (let _ = fs::create_dir_all(...) / let _ = storage::copy_entry(...)) and then deleted the branch's delta directory unconditionally, so a failed merge (ENOSPC, EACCES, EIO, quota) was reported as Ok while the only copy of the data was destroyed — contradicting the "Atomic Commit" claim in the README.

  • Add a StagedMerge RAII helper that copies each delta file to a temp sibling of its final destination (the only fallible step), then publishes them with atomic same-filesystem fs::rename only after all copies succeed. If dropped without commit() (e.g. an error propagates), it removes the staged temps and leaves destination data untouched.
  • Make walk_files' callback return Result<()> so copy failures propagate instead of being structurally un-reportable.
  • Rewrite both commit paths (parent-is-main → base filesystem; nested → parent delta dir) to stage-then-publish. On failure the branch and its delta are preserved for retry or abort — never a false success.

Scope: this delivers error-atomicity within a running daemon. Crash-atomicity is out of scope because branches are in-memory only (no on-disk recovery), so a daemon crash discards the branch map regardless. A rolled-back merge may leave benign empty directories in the destination (documented in code).

Test Plan

  • cargo test --lib — 3 new StagedMerge unit tests (publish-all, rollback-on-drop, failure-cleanup) pass
  • cargo test --test test_ioctl -- --ignored — 7/7 pass (incl. commit + nested commit)
  • cargo test --test test_integration -- --ignored — 22/22 pass (commit/deletion/symlink/rename happy paths unchanged)

🤖 Generated with Claude Code

Signed-off-by: Cong Wang <cwang@multikernel.io>
Signed-off-by: Cong Wang <cwang@multikernel.io>
Copy delta files to temp siblings first and only publish via atomic
rename once all copies succeed. Propagate copy errors instead of
swallowing them, and preserve the branch + delta on failure so a failed
merge can never be reported as success or destroy the only delta copy.

Signed-off-by: Cong Wang <cwang@multikernel.io>
Re-review of the first cut found that publishing copies before applying
deletions broke replacing a base directory with a file: the publish
rename hit EISDIR and the whole commit failed with EIO (regression vs
the pre-fix behavior). Two changes:

- StagedMerge now stages deletions too: each tombstoned entry is renamed
  aside to a .branchfs-trash sibling and only discarded on commit, so a
  failed merge restores it on Drop. Deletions are staged before copies,
  and stage_copy clears a directory sitting at a copy's destination
  (reachable via rename, which leaves no tombstone) so publishing a file
  over it succeeds. Commit is now all-or-nothing for deletions as well as
  copies.
- Build temp/trash side-paths from the raw OsStr instead of a lossy
  String so distinct non-UTF8 filenames never collide on one temp path.

Adds unit tests for staged deletions, dir->file replacement (with and
without an explicit tombstone) and its rollback, nested paths, and
non-UTF8 naming; adds an ioctl integration test that replaces a base
directory with a file and commits (fails on the previous code).

Signed-off-by: Cong Wang <cwang@multikernel.io>
The project had no dev-dependencies and the integration tests already
manage their own temp directories with manual Drop cleanup. Replace
tempfile::tempdir() in the StagedMerge unit tests with a small RAII
TmpDir helper that reuses the uuid crate already in [dependencies],
removing tempfile and its transitive deps from the lockfile.

Signed-off-by: Cong Wang <cwang@multikernel.io>
@congwang-mk congwang-mk merged commit 2b6c1e2 into main May 23, 2026
4 checks passed
@congwang-mk congwang-mk deleted the fix-atomic-commit branch May 23, 2026 02:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Critical: “atomic commit” is not atomic and can silently drop data

1 participant