Skip to content

Emit named IN_CREATE/IN_DELETE for inotify directory watches#58

Open
chenhunghan wants to merge 2 commits into
sysprog21:mainfrom
chenhunghan:fix-inotify-named-events
Open

Emit named IN_CREATE/IN_DELETE for inotify directory watches#58
chenhunghan wants to merge 2 commits into
sysprog21:mainfrom
chenhunghan:fix-inotify-named-events

Conversation

@chenhunghan
Copy link
Copy Markdown
Collaborator

@chenhunghan chenhunghan commented May 29, 2026

Fixes #55.

Synthesizes named IN_CREATE/IN_DELETE for directory watches by snapshotting the directory's entry names and diffing on each NOTE_WRITE (kqueue EVFILT_VNODE reports only that the directory changed, not which child). Full rationale in the commit message.

Adds a regression test (test-inotify Test 6) that watches a fresh directory, creates a child, and asserts a named IN_CREATE is delivered.

Validation on Apple Silicon: make elfuse and make check pass; the new test fails before this change (event arrives with an empty name) and passes after.


Summary by cubic

Emit named IN_CREATE/IN_DELETE for directory watches by diffing a per-watch snapshot on each change and keeping the previous baseline if a snapshot fails to avoid spurious events. This fixes nameless directory events so fsnotify-based tools can match child names (fixes #55).

  • Bug Fixes
    • Maintain a per-watch dir snapshot; on NOTE_WRITE diff to emit named IN_CREATE/IN_DELETE; skip redundant bare events.
    • Only advance the baseline on a successful snapshot; keep the old baseline on failure; the initial add-watch snapshot is best-effort to avoid bogus deletes/creates.
    • Use one event translator for blocking and non-blocking paths; on overflow, queue IN_Q_OVERFLOW and advance the snapshot; add a regression test that asserts a named IN_CREATE is delivered.

Written for commit 7fd5187. Summary will update on new commits.

Review in cubic

EVFILT_VNODE reports that a watched directory changed but not which child, so
directory events were queued with no name. fsnotify-based consumers (notably
the k0s manifest applier, which re-applies only when a *.yaml entry appears)
filter on the entry name and silently drop nameless events, so manifests
written after the watch was established were never picked up.

Keep a per-watch snapshot of the directory's entry names; on each NOTE_WRITE
re-list the directory and diff against the snapshot to emit a named IN_CREATE
per added child and IN_DELETE per removed one, matching real inotify
semantics. The blocking-read and non-blocking collect paths share one
process_vnode_event() helper. The snapshot is allocated on add_watch and
freed on rm_watch and inotify_close.

Add a regression test (test-inotify Test 6) that watches a fresh directory,
creates a child, and asserts a named IN_CREATE for it is delivered; this
fails before the fix (the event arrives without a name).

Validated with make check on Apple Silicon.

(cherry picked from commit 2a5fa28b5257ceb5ed116c231aecc7c4a2ef54ab)
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 2 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/syscall/inotify.c">

<violation number="1" location="src/syscall/inotify.c:379">
P2: Filesystem I/O (opendir/readdir) under global inotify_lock causes potential lock contention and latency regression</violation>
</file>

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread src/syscall/inotify.c
Comment thread src/syscall/inotify.c
int queued = 0;
bool overflow = false;

if (w->is_dir && (fflags & NOTE_WRITE) && w->path) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Filesystem I/O (opendir/readdir) under global inotify_lock causes potential lock contention and latency regression

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/syscall/inotify.c, line 379:

<comment>Filesystem I/O (opendir/readdir) under global inotify_lock causes potential lock contention and latency regression</comment>

<file context>
@@ -296,8 +303,133 @@ static void pipe_drain(inotify_instance_t *inst)
+    int queued = 0;
+    bool overflow = false;
+
+    if (w->is_dir && (fflags & NOTE_WRITE) && w->path) {
+        char **now = NULL;
+        int now_n = 0;
</file context>

@jserv jserv requested a review from Max042004 May 29, 2026 10:08
A directory watch snapshots the child-name set on each change and diffs it
against the previous baseline to recover the name kqueue omits. dir_snapshot
returned an empty list on any failure (opendir error, a mid-read readdir
error, or an allocation failure), which the diff could not tell apart from a
genuinely empty directory.

On a transient failure the IN_DELETE pass then saw every known child as
missing and queued a spurious IN_DELETE for each, and the baseline was
overwritten with the empty list, so every later change re-reported the whole
directory as IN_CREATE. The corruption was permanent.

Make dir_snapshot return bool: on failure it frees any partial result and
reports false, and readdir read errors are now detected by clearing errno
before each call. process_vnode_event only diffs against and advances to a
snapshot that succeeded; on failure it keeps the previous baseline so the next
successful snapshot reconciles whatever changed in between. The watch-add path
documents that a failed initial snapshot is a best-effort empty baseline.

Validated with make check on Apple Silicon; test-inotify still passes 6/6.
@chenhunghan chenhunghan force-pushed the fix-inotify-named-events branch from 4aa4da1 to 7fd5187 Compare May 30, 2026 10:56
Comment thread src/syscall/inotify.c
* On failure keep the previous baseline; the next successful snapshot
* reconciles whatever changed in between.
*/
if (dir_snapshot(w->path, &now, &now_n)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale w->path reads the wrong directory after rename

The snapshot/diff design rests on w->path (captured by strdup(path) at inotify_add_watch time, src/syscall/inotify.c:594) being a valid handle to the directory whose entries the diff should compare. The watch's identity, however, is w->host_fd — an O_EVTONLY fd that follows the inode across renames. Once the two diverge, dir_snapshot(w->path, ...) at src/syscall/inotify.c:409 no longer reads the directory the watch is registered for.

I reproduced both failure modes with two small static aarch64 binaries cross-built and run against build/elfuse at this PR's tip (7fd5187).

Scenario A — old path no longer exists

mkdtemp /tmp/A
inotify_add_watch(fd, /tmp/A, IN_CREATE | IN_DELETE)
rename /tmp/A -> /tmp/A-renamed    /* watched inode moves with the fd */
touch /tmp/A-renamed/child         /* in the watched dir */

mv raises NOTE_RENAME on the watched fd (delivered as IN_MOVE_SELF, correct). The subsequent touch raises NOTE_WRITE on the same fd; process_vnode_event calls dir_snapshot("/tmp/A", ...), opendir returns ENOENT, the function returns false, the keep-baseline branch fires — and no IN_CREATE for child is ever emitted.

Actual output:

watching /tmp/elfuse-renA-KRUJg3 (wd=1)
renamed /tmp/elfuse-renA-KRUJg3 -> /tmp/elfuse-renA-KRUJg3-renamed; old path is now nonexistent
created /tmp/elfuse-renA-KRUJg3-renamed/child in the renamed (watched) dir

IN_CREATE name="child" count: 0 (expected 1 on Linux)
verdict: BUG REPRODUCED (named event lost after rename)

From the caller's perspective the watch silently stops naming children after a rename — exactly the failure mode #55 was opened to fix.

Scenario B — old path is taken by a different inode

mkdtemp /tmp/A
inotify_add_watch(fd, /tmp/A, IN_CREATE | IN_DELETE)
rename /tmp/A -> /tmp/A-renamed    /* watched inode moves with the fd */
mkdir /tmp/A                        /* fresh dir, different inode, unwatched */
touch /tmp/A/decoy                  /* in the new unwatched dir */
touch /tmp/A-renamed/child          /* in the watched dir */

Now opendir("/tmp/A") succeeds, but it reads the new /tmp/A, not the directory the watch tracks. The diff compares the new directory's entries (["decoy"]) to the baseline ([]) and synthesises an IN_CREATE for the decoy. The real child event in the renamed directory never surfaces because the watched dir is never sampled.

Actual output:

step 1: watching /tmp/elfuse-rename-A-nydrPl (wd=1)
step 2: rename /tmp/elfuse-rename-A-nydrPl -> /tmp/elfuse-rename-A-nydrPl-renamed (watched inode moves with the fd)
step 3: mkdir /tmp/elfuse-rename-A-nydrPl again (NEW inode at old path -- unwatched)
step 4a: created /tmp/elfuse-rename-A-nydrPl/decoy in the NEW (unwatched) dir
step 4b: created /tmp/elfuse-rename-A-nydrPl-renamed/child in the WATCHED (renamed) dir
step 5: draining events for ~1s
  event: wd=1 mask=0x00000100 (IN_CREATE) len=8 name=decoy

summary:
  total IN_CREATE: 1
  IN_CREATE name="decoy" (BUG if >0): 1
  IN_CREATE name="child" (expected: 1): 0
verdict: BUG REPRODUCED

The emitted event is objectively wrong on both axes: it names a file in a directory the caller never watched, and it suppresses the named event for the actual child of the watched directory. This isn't theoretical — build systems (yarn/pnpm/Cargo) and test runners routinely mv build → build.old; mkdir build; … for atomic swap; the watch on the old build/ will start fabricating events for the unrelated replacement directory after the swap.

Reproducers

Both binaries are small static aarch64 ELFs cross-built with aarch64-linux-gnu-gcc -static -O2 -D_GNU_SOURCE. Sources attached as Scenario A / Scenario B above; they exit non-zero on the buggy path. A correct fix should make both exit zero (Scenario A: emit IN_CREATE name="child"; Scenario B: emit IN_CREATE name="child" and not name="decoy").

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

inotify directory watches deliver IN_CREATE/IN_DELETE events with an empty name[] field

2 participants