Problem
parser.py (1,178 lines) has 5 distinct responsibilities mixed together:
- Event parsing
- Session discovery (filesystem scanning, plan.md probing)
- Multi-layer caching (3 cache types, 5 mutable module globals, complex invalidation)
- Summary building (first pass, resume detection, active/completed paths)
- Orchestration (get_all_sessions: 179 lines doing 7 things)
The caching and discovery concerns are self-contained and separable.
Proposed Extraction
cache.py (~150 lines)
- Cache dataclasses: _DiscoveryCache, _CachedSession, _CachedEvents, _SortedSessionsCache
- Module-level state: _SESSION_CACHE, _EVENTS_CACHE, _DISCOVERY_CACHE, _sorted_sessions_cache, _config_file_id
- Insert/evict/fingerprint helpers
- get_cached_events()
discovery.py (~250 lines)
- _full_scandir_discovery(): filesystem scanning
- _discover_with_identity(): identity tracking + plan.md probing
- discover_sessions(): public API wrapper
parser.py stays as public façade (~700 lines)
- Event parsing, config reading, summary building pipeline
- build_session_summary() and get_all_sessions() re-exported
- Public API unchanged — no import changes for consumers
Risk
get_all_sessions() orchestrates caching + discovery + parsing. It will need to call into both extracted modules — that's the trickiest seam.
Testing
All existing tests must pass unchanged. No public API changes.
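Alongside the existing suite, a small guard could confirm that the façade still exposes the public names after the move. The helper below is a hypothetical addition, not part of the current tests; in the repository it would be called with the imported parser module rather than the stand-in module used here.

```python
import types


def missing_public_names(module, expected):
    """Return the expected names that module does not expose, in order."""
    return [name for name in expected if not hasattr(module, name)]


# Stand-in for the real parser module, for demonstration only.
fake_parser = types.ModuleType("parser_facade")
fake_parser.build_session_summary = lambda *args, **kwargs: None
fake_parser.get_all_sessions = lambda *args, **kwargs: []

EXPECTED = ["build_session_summary", "get_all_sessions"]
missing = missing_public_names(fake_parser, EXPECTED)
```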
Note
Do not attempt via pipeline — this is a structural refactor that requires manual coordination.