Lazy-load parent cache content when saving a child fixture#61
Lazy-load parent cache content when saving a child fixture#61navidemad wants to merge 5 commits into
Conversation
When a child fixture extends a parent, `Cache#evaluate` reads the
parent's coder data with:
fixture.parent.cache.content.data_for(coder.class)
But `@content` is only populated by `Cache#load` (full mount) or
`Cache#save`. If the parent was generated in a previous process (cache
file on disk) but never mounted in the current one — which happens when
the child cache is missing while the parent cache exists — the child
goes straight to `save` and `parent.cache.content` is `nil`, raising:
NoMethodError: undefined method 'data_for' for nil
This is reproducible by clearing only the anonymous child cache (or
modifying the child definition so its identifier changes) while
preserving the parent cache. It also hits any setup where the test DB
has been reset (e.g. `db:test:prepare`) and the parent cache survives,
which is common when `FIXTURE_KIT_PRESERVE_CACHE=1` is used to support
parallel workers.
Extract the lazy read into a new `Cache#read_content` method so
`evaluate` (and `load`) can populate `@content` on demand without
triggering a full mount.
Add a spec that reproduces the original crash: save a parent, call
`clear_memory` to simulate a fresh process, then save a child that
depends on the parent. Without the fix the spec raises `NoMethodError`.
Refs yespark/yespark-rails#21339 for the
downstream workaround that motivated this upstream fix.
Fresh Eyes ReviewFound 1 issue in this PR. PR Description Issues
Download findings.json — drag the file into Claude or use 🙏🏽 Please 👍🏽 👎🏽 if you found this useful. Generated by Fresh Eyes Reviewer. Get help in #ai-code-reviews |
The safe-nav chain `parent&.cache&.read_content&.data_for(...)` was over-defensive: `Cache` is never nil when `parent` exists, and `read_content` either returns a `MemoryCache` or raises (it cannot return nil). Reverting to the original explicit `parent ? ... : nil` preserves the pre-fix semantics — nil only when there's no parent; any other failure surfaces loudly.
- Rename `read_content` to `ensure_content`. The `read_` prefix suggested a pure read, hiding the memoization side effect. The new name reflects the actual contract: ensure `@content` is populated, reading from disk only if necessary. Docstring also calls out that the method raises when the cache file is absent — callers must guarantee existence. - Strengthen the cross-process reproducer spec. The previous stub of `parent.mount` recreated `User` inside the child save block, so `ActiveRecordCoder` captured `User` from the event stream regardless of `parent_data`. A faux fix that returned `parent_data: nil` would still satisfy the assertion. The stub now returns an empty repository (User already lives in DB from `parent_cache.save`), so `User` only ends up in the child cache via `parent_data.keys` — making the assertion load-bearing. Also adds an explicit `expect(parent_cache.content).to be_nil` precondition right after `clear_memory`. - Add three focused specs for `#ensure_content`: returns memoized `@content` without touching disk, re-reads disk after `clear_memory`, and raises `Errno::ENOENT` when the cache file is absent. Formalizes the contract instead of inheriting it implicitly from `File.read`.
The maintainer reads English, so French inline comments in the cache spec are noise. Translates the cross-process reproducer comments and the new #ensure_content memoization comment. No behavior change.
| # needs the parent's coder data, while the parent itself was already cached | ||
| # to disk in a previous process and has not yet been mounted). Raises if the | ||
| # cache file is absent or unreadable — callers must guarantee existence. | ||
| def ensure_content |
There was a problem hiding this comment.
Can this be just be named content?
There was a problem hiding this comment.
content is already an attr_reader (line 9) with the opposite contract: it returns @content directly — nil when nothing is mounted, never touching disk and never raising. exists? depends on that (content || file_cache.exists?), and a few specs assert it directly (expect(cache.content).to be_nil after clear_memory).
ensure_content is the lazy sibling — @content ||= file_cache.read — which memoizes and raises Errno::ENOENT when the file is missing. I named it for that side effect so the pure reader stays distinguishable from the disk-loading one.
I can collapse the two into a single lazy content if you'd prefer one method (pointing exists? at @content and reworking the nil-after-clear_memory specs), just flagging it's a behavior merge rather than only a rename. Happy to go either way — let me know.
Per review feedback, fold the separate `ensure_content` method into
`content` itself, making `content` lazily read from disk and memoize:
def content
@content ||= file_cache.read
end
This shrinks the public surface to a single accessor and restores the
original, natural call site in `evaluate`:
fixture.parent.cache.content.data_for(coder.class)
The only place that relied on `content` returning nil without touching
disk is `exists?`, which now reads the `@content` ivar directly so it
never triggers a disk read. Every other reader is already guarded by
`exists?` (load) or sets `@content` first (save), so making `content`
raise on a missing/corrupt file keeps failures loud exactly where the
old code surfaced them.
Specs that asserted the in-memory state via `content` (the four
`content == nil` checks after `clear_memory`, plus the cross-process
precondition) now assert `@content` directly, since `content` would
lazily re-read from disk. The `#ensure_content` examples move under
`#content` unchanged in intent.
No behavior change for callers: load/save/evaluate semantics are
identical; only the method name and the exists? internals change.
Summary
Cache#evaluatereads the parent's coder data viafixture.parent.cache.content.data_for(...), but@contentis only populated byCache#load(full mount) orCache#save. When a child fixture is generated in a fresh process where the parent has a cache file on disk but has never been mounted in this process, the child jumps straight tosaveandparent.cache.contentisnil, raising:This happens in any setup where the test DB is reset (e.g.
db:test:prepare) while preserving the fixture cache. Common whenFIXTURE_KIT_PRESERVE_CACHE=1is used to support parallel workers.Repro
The spec
loads parent content from disk when child saves without parent in memoryreproduces it: save a parent fixture, callclear_memoryto simulate a fresh process, then triggerchild_cache.save. Without the patch, the spec raisesNoMethodError.Fix
Extract the lazy
@content ||= file_cache.readinto a new publicCache#ensure_contentmethod.evaluatenow usesfixture.parent ? fixture.parent.cache.ensure_content.data_for(...) : nil, so it transparently loads the parent's content from disk when needed.loadis refactored to use the same helper.The method is named
ensure_contentrather thanread_contentbecause it has a memoization side effect;read_would have suggested a pure read. It raises when the cache file is absent, so callers must guarantee existence.Review iteration
A reviewer flagged a safe-nav chain (
parent&.cache&.ensure_content&.data_for(...)) on an earlier revision, arguing it could silently swallow a missing or corrupt parent cache intoparent_data: nil. The premise didn't hold:FileCache#readalways raises (Errno::ENOENT,JSON::ParserError) rather than returning nil. The chain was still over-defensive, so I tightened it back to the ternary form to match the pre-fix semantics: nil only when there is no parent, anything else surfaces loudly.I also strengthened the cross-process reproducer. Previously, the
mountstub recreated the parent'sUserinside the child save block, soActiveRecordCodercapturedUserfrom the event stream regardless ofparent_data. A faux fix returningparent_data: nilwould still have satisfied the assertion. The stub now returns an empty repository (the user already lives in the DB fromparent_cache.save), soUsercan only appear in the child cache viaparent_data.keys.I added three focused unit specs for
#ensure_content: returns memoized@contentwithout touching disk, re-reads disk afterclear_memory, raisesErrno::ENOENTwhen the file is absent.Notes
Cache#contentandCache#loadsemantics are unchanged.ensure_contenton the parent cache double.Test plan
bundle exec rspec spec/unit/fixture_cache_spec.rb— 28/28 passbundle exec rspec(full suite) — 178/178 pass