Skip to content

Auditing a dirty, *uninitialized* to-many collection initializes it mid-flush, corrupting subsequent collection reads in the same request#102

Merged
rcsofttech85 merged 3 commits into
rcsofttech85:mainfrom
nikola-jovanovic-php:fix/uninitialized-collection-init-during-audit
Jun 4, 2026
Merged

Auditing a dirty, *uninitialized* to-many collection initializes it mid-flush, corrupting subsequent collection reads in the same request#102
rcsofttech85 merged 3 commits into
rcsofttech85:mainfrom
nikola-jovanovic-php:fix/uninitialized-collection-init-during-audit

Conversation

@nikola-jovanovic-php

@nikola-jovanovic-php nikola-jovanovic-php commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Problem

When an #[Auditable] entity owns a lazy (EXTRA_LAZY) to-many association that is mutated while still uninitialized, the audit pass iterates the live PersistentCollection to compute the recorded ids. Iterating an uninitialized collection forces Doctrine's initialize() to run a SELECT against the pre-commit database state during flush(). The collection is hydrated from stale data and that stale snapshot sticks in the UnitOfWork, so later reads in the same request (->count(), iteration, lazy navigation) return incorrect pre-write values. It is silent in-memory corruption that only appears with auditing enabled. See #101 for the full write-up and reproduction.

Fix

The audit pass must never initialize a collection as a side effect. This turned out to affect two code paths, not one:

  1. CollectionChangeResolver::resolveCurrentCollectionIds() — the path reported in Auditing a dirty, *uninitialized* to-many collection initializes it mid-flush, corrupting subsequent collection reads in the same request #101. An already initialized PersistentCollection is iterated as before; an uninitialized one has its current ids computed from the snapshot plus the pending insert/delete diffs, so it is never forced to load.

  2. PendingAuditPlanMaterializer::materialize() — the deferred-collection post-flush path. It also iterated the live collection (extractFromIterable($currentValue)) to materialize collection ids, re-introducing the exact same initialization side effect for entities that route through a deferred audit plan. It now uses the same snapshot/diff resolution when the value is an uninitialized PersistentCollection.

The snapshot/diff logic is shared through a new CollectionIdExtractor::extractIdsWithoutInitializing() helper so both call sites stay consistent. Insert/delete diffs are still reflected in the recorded ids, so collection auditing behaviour is unchanged.

Tests

  • Unit (CollectionChangeResolverTest): drives buildCollectionTransition() with a real uninitialized PersistentCollection and asserts the ids come from snapshot + diffs and that the collection is left uninitialized. Fails on main, passes with the fix.
  • Functional (CollectionInitializationSafetyTest with LazyParent/LazyChild): an end-to-end regression on an EXTRA_LAZY inverse one-to-many. Children are added across several flushes and the test asserts count() keeps issuing a fresh SELECT COUNT(*) and reports the true total instead of a value frozen mid-flush. Exercises the deferred-plan path and is green only with the fix.

Verification

  • Full suite green (composer test): 918 tests.
  • composer stan (PHPStan) and composer cs (php-cs-fixer) report no issues.
  • Confirmed each new test fails without its corresponding fix and passes with it.

@codacy-production

codacy-production Bot commented Jun 4, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 10 complexity · 0 duplication

Metric Results
Complexity 10
Duplication 0

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

@nikola-jovanovic-php nikola-jovanovic-php marked this pull request as ready for review June 4, 2026 12:04
@rcsofttech85

Copy link
Copy Markdown
Owner

I am going to adjust the implementation before merging. The main reason is that, for an uninitialized PersistentCollection, Doctrine’s collection snapshot is not always a reliable source for the existing database rows. In the reported scenario, the collection may already have existing rows in the DB, while getSnapshot() can still be empty. So rebuilding the current IDs from snapshot + insertDiff - deleteDiff can miss existing IDs.

The safer approach is to reuse the already resolved old IDs as the base, then apply the collection insert/delete diffs:

currentIds = oldIds - deleteDiffIds + insertDiffIds

I will also keep the change scoped to CollectionChangeResolver and its unit coverage. That means I’m going to remove the added CollectionIdExtractor, PendingAuditPlanMaterializer, and functional fixture/test changes from this PR for now. Those broaden the change into deferred audit materialization and test-only entities, but the reported bug can be fixed cleanly in the resolver path without moving Doctrine collection diff logic into the ID extractor.

@rcsofttech85 rcsofttech85 force-pushed the fix/uninitialized-collection-init-during-audit branch from cc8b34d to 4749e34 Compare June 4, 2026 18:32
@rcsofttech85 rcsofttech85 merged commit bd887ee into rcsofttech85:main Jun 4, 2026
11 checks passed
@rcsofttech85

Copy link
Copy Markdown
Owner

@nikola-jovanovic-php

Thank You.

My earlier comment was about keeping #102 scoped and avoiding the snapshot/diff implementation for uninitialized PersistentCollection, because getSnapshot() is not always a reliable base.

After rechecking the deferred audit materialization path you pointed out, I agree that this is a separate real issue. I wanted to handle this as a follow-up bug for the deferred materialization path with the correct implementation, rather than mixing it into the resolver fix.

The follow-up fix will use Doctrine criteria loading after flush, so we can read the flushed collection IDs without hydrating the original PersistentCollection. I’ll make sure to credit you for identifying this path. Cheers!

rcsofttech85 added a commit that referenced this pull request Jun 5, 2026
…tion

Follow-up to #102.

Suggested-by: Nikola Jovanovic
Co-authored-by: Rahul Chavan
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Auditing a dirty, *uninitialized* to-many collection initializes it mid-flush, corrupting subsequent collection reads in the same request

2 participants