Skip to content

DRAFT: retire parent_data via replayed-model recording#67

Draft
navidemad wants to merge 1 commit into
Gusto:mainfrom
navidemad:retire-parent-data-draft
Draft

DRAFT: retire parent_data via replayed-model recording#67
navidemad wants to merge 1 commit into
Gusto:mainfrom
navidemad:retire-parent-data-draft

Conversation

@navidemad
Copy link
Copy Markdown

Draft / for discussion. I lean against merging this; see "Trade-offs".

What this explores

An alternative to #61. Instead of fixing the data_for for nil crash by lazily reading the parent's cached content, this removes the reason the child reads the parent's content during save at all: the parent_data parameter.

Today Cache#evaluate looks up fixture.parent.cache.content.data_for(coder.class) and passes it to coder.generate(parent_data:). The coder merges parent_data.keys into its captured model set, because the parent's rows are replayed (via parent.mount) under the log tag "FixtureKit Insert", which the sql.active_record subscriber can't attribute to a model.

This branch drops parent_data and has the coder record what it replays instead:

  • ActiveRecordCoder#mount adds data.keys to a per-generate @replayed_models set.
  • ActiveRecordCoder#generate folds @replayed_models into the captured set after the block runs.

It works because the runner reuses one coder instance and the parent is mounted inside the child's generate block. One side effect worth noting: the child no longer reads parent.cache.content during save, so the #61 crash can't occur on this path at all.

Trade-offs (why I lean against it)

  1. It swaps an explicit parameter for hidden, ordering-dependent state. parent_data is a visible input to generate. @replayed_models is an instance variable populated as a side effect of mount, with an implicit "mount runs inside generate's block" contract. Harder to reason about, easier to break.
  2. It changes the public Coder contract. generate(parent_data:) becomes generate(&block), so any custom coder that used parent_data now has to implement the record-on-mount, fold-on-generate dance itself.
  3. It doesn't actually simplify anything. The version that would delete the mechanism outright (tag each replayed INSERT with its model name so the subscriber recaptures it) means de-batching the one execute_batch per connection that active_record_coder_spec.rb pins. Worse trade.

So parent_data isn't cruft, it's the clean version. #66 documents why it stays; this branch is here so the alternative is concrete rather than hypothetical.

Test matrix

spec/unit/fixture_cache_spec.rb gets a propagation matrix that drives the real parent.mount to @replayed_models path (the existing inherited specs stub mount, so they don't):

  • a basic model carried across a "fresh process" (parent on disk, child saved)
  • STI parent models collapsed to their base table and carried (Car to Vehicle)
  • parent models from a second connection pool (ActivityLog on the analytics DB)
  • a delete-only parent model whose cached sql is nil, still carried forward

The old parent_data merge test in active_record_coder_spec.rb is rewritten against the new mechanism.

Test plan

  • bundle exec rspec (183/183)

Exploration / alternative to Gusto#61. Instead of lazily reading the parent's
cached content when a child is saved, remove the reason the child reads
it: the parent_data parameter.

ActiveRecordCoder#mount now records the models it replays into a
per-generate @replayed_models set; ActiveRecordCoder#generate folds that
set into its captured models after the block runs. This works because
the runner reuses one coder instance and the parent is mounted inside
the child's generate block. Cache#evaluate no longer reads
parent.cache.content, and the Coder contract drops the parent_data
keyword.

I lean against merging this — it trades an explicit parameter for hidden,
ordering-dependent instance state and changes the public Coder contract,
without buying a real simplification (per-model INSERT tagging would
require de-batching the one-execute_batch-per-connection call a spec
pins). Gusto#66 documents why parent_data stays. Opened as a draft so the
alternative is concrete.

Includes a propagation test matrix (basic cross-process, STI collapse,
second connection pool, delete-only nil sql) that drives the real
parent.mount -> @replayed_models path, and rewrites the coder's old
parent_data merge spec against the new mechanism.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant