fix(attractor): correct inverted community hierarchy (#33) #37
Merged
The fine partition was returning fewer (and larger) clusters than coarse because three settings were shared across β levels: top-K=50 neighbours, always-on singleton merging, and identical softmax initialisation. At high β the shared top-K funnelled notes through the same hub attractors and merge_singletons swallowed the narrow basins that fine is supposed to expose.

Changes:
- Per-level top-K: TOP_K_COARSE=80, TOP_K_FINE=20; plumbed through _attractor_convergence so each pass has an appropriate neighbourhood.
- Singleton-merge gate: _assign_communities(merge_singletons=bool); the fine pass calls it with False so 1-note basins survive.
- Modularity + size stats: _modularity() computes weighted Newman Q on the pre-sparsification S; _size_stats() reports size distribution. detect_communities logs a warning if Q(fine) <= Q(coarse) so a future regression is visible at index time.
- Per-level stats persisted in new community_level_stats table (migration v13 -> v14). vault_stats now emits a community_levels array with level, label, n_communities, min/max/mean_size, modularity.
- Tests cover top_k override, merge gate, modularity (2-block positive, single-community zero, degenerate zero), size stats.
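For readers unfamiliar with the metric, here is a minimal sketch of weighted Newman modularity, the size summary, and the Q(fine) <= Q(coarse) check described above. It is not the code from this PR: the function names `modularity`, `size_stats`, and `hierarchy_looks_inverted`, the logger name, and the list-of-rows matrix format are all assumptions chosen for illustration, and S is assumed symmetric.

```python
import logging
from collections import Counter

log = logging.getLogger("attractor_sketch")  # hypothetical logger name


def modularity(S, labels):
    """Weighted Newman Q for a symmetric weight matrix S (list of rows)."""
    k = [sum(row) for row in S]  # weighted degree of each node
    two_m = sum(k)               # total weight, each edge counted twice
    if two_m <= 0:
        return 0.0               # degenerate graph with no edge weight
    q = 0.0
    for i, ci in enumerate(labels):
        for j, cj in enumerate(labels):
            if ci == cj:         # only same-community pairs contribute
                q += S[i][j] - k[i] * k[j] / two_m
    return q / two_m


def size_stats(labels):
    """Size distribution of the communities named in labels."""
    sizes = list(Counter(labels).values())
    if not sizes:
        return {"n_communities": 0, "min_size": 0, "max_size": 0, "mean_size": 0.0}
    return {
        "n_communities": len(sizes),
        "min_size": min(sizes),
        "max_size": max(sizes),
        "mean_size": sum(sizes) / len(sizes),
    }


def hierarchy_looks_inverted(q_coarse, q_fine):
    """Return True (and log a warning) when fine does not beat coarse on Q."""
    inverted = q_fine <= q_coarse
    if inverted:
        log.warning("Q(fine)=%.4f <= Q(coarse)=%.4f", q_fine, q_coarse)
    return inverted
```

The three modularity cases named in the test bullet map onto this sketch directly: two disconnected triangles labelled as two communities give a positive Q, the same graph labelled as one community gives zero, and an all-zero matrix hits the degenerate early return.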
Summary
Fixes the inverted community hierarchy reported in #33, where `vault_stats` returned `communities_coarse=12, communities_fine=8` (fine had fewer, larger basins than coarse).

Three compounding issues drove the inversion: (1) a shared `TOP_K_NEIGHBORS=50` at both β levels funnelled notes through the same hub attractors at high β, (2) singleton merging absorbed the narrow 1-note basins that fine is supposed to expose, (3) only β differed between levels.

Changes
- Per-level top-K (`TOP_K_COARSE=80`, `TOP_K_FINE=20`) plumbed through `_attractor_convergence` so each pass uses an appropriate neighbourhood.
- `_assign_communities(merge_singletons=bool)`; the fine pass calls it with `False` so narrow basins survive.
- `_modularity()` computes weighted Newman Q on pre-sparsification S; `detect_communities` logs a warning if `Q(fine) <= Q(coarse)` so a future regression is visible at index time.
- `community_level_stats` table (schema v13 → v14) stores n, min/max/mean size, modularity per level.
- `vault_stats` output: adds `community_levels: [...]` array (legacy `communities_coarse/fine/summarized` preserved).

Results on production vault (539 notes)
`n_fine > n_coarse`

The modularity sanity warning fires honestly: `Q(fine)=0.0527 < Q(coarse)=0.0628`. The count hierarchy is correct but weighted modularity still favours coarse, a known property with heavy-tailed edge weights. Surfaced for future investigation rather than silently hidden.

Test plan
- `uv run pytest tests/`
- `vault_stats` over MCP returns new block, `vault_communities` at both levels returns coherent results with narrow basins (3-note, 7-note) surviving at fine level

Notes
Rename (`coarse/fine` → `broad/narrow`) deferred. The modularity warning will signal at index time if the hierarchy tuning fails to deliver on any given vault; revisit the rename only if the warning persists.

Closes #33