Fix: drop misaligned version guards in the L2 swimlane pipeline #856
Open
indigo1973 wants to merge 1 commit into
Code Review
This pull request updates the dependency graph generation schema from version 2 (v2) to version 3 (v3), introducing a strided tensor representation. Key changes include adding an args array with detailed tensor slice geometry to tasks, replacing raw_shapes with buffer_numel in the tensors schema, and replacing simple offsets with explicit start offsets and strides for both consumers and producers in the edges schema. Downstream tools, documentation, and tests have been updated to support and validate the new v3 schema. There are no review comments, so I have no feedback to provide.
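For readers unfamiliar with the strided layout the summary mentions, here is a minimal sketch of how a start offset plus per-dimension strides can address elements of a flat buffer. The function name `flat_index` is hypothetical, and treating `start_offset`, `strides`, and `buffer_numel` as element counts is an assumption made for illustration; the v3 schema itself defines the real semantics.

```python
def flat_index(start_offset, strides, index, buffer_numel):
    """Map a multi-dimensional index to a position in a flat buffer.

    Assumes all quantities are expressed in elements, not bytes.
    """
    if len(index) != len(strides):
        raise ValueError("index and strides must have the same rank")
    # Standard strided addressing: base offset plus index-weighted strides.
    pos = start_offset + sum(i * s for i, s in zip(index, strides))
    if not 0 <= pos < buffer_numel:
        raise IndexError("position falls outside the underlying buffer")
    return pos
```

With a 16-element buffer, strides `[4, 1]`, and a start offset of 2, index `[1, 3]` maps to position 9, while index `[3, 3]` would land at 17 and is rejected.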
deps.json and l2_perf_records.json both carry a "version" field, and both have grown out of sync with the consumers that gate on it:

- deps.json: PR hw-native-sys#808 (Refactor: a2a3 + a5 Tensor to strided (stride + start_offset) model) bumped v2 → v3 (strided Tensor: buffer_numel replaces raw_shapes, start_offset + strides[] replace multi-dim offset[]). swimlane_converter.load_deps_json kept the `if version != 2` guard, silently rejected every fresh capture, and fell back to L2PerfRecord::fanout[], losing the race-window edges that dep_gen replay exists to recover.
- l2_perf_records.json: its "version" field is actually L2PerfLevel (0=DISABLED, 1=AICORE_TIMING, ..., 3=SCHED_PHASES, 4=ORCH_PHASES), not a JSON schema version. swimlane_converter._print_verbose_data_info and sched_overhead_analysis.parse_scheduler_from_json_phases both gated phase output on `version == 2` / `version < 2`, which is wrong on both ends: phase blocks only exist at level >= 3, so level 3/4 captures silently skipped phase summaries while level 2 captures probed for fields that aren't there.

Bug fixes (semantics)

- swimlane_converter.load_deps_json: drop the deps.json version guard entirely. This consumer only reads edges[].pred / .succ, both stable across every deps.json schema to date. Malformed individual edges are still skipped per-iteration.
- swimlane_converter._print_verbose_data_info: drop the `version != 2` short-circuit; .get() already handles missing phase fields gracefully.
- sched_overhead_analysis.parse_scheduler_from_json_phases: drop the `version < 2` short-circuit; the `if not phases_by_thread: return {}` guard immediately below was already the right check.

Doc / comment cleanup (keep code, comments, and docs in sync per .claude/rules/doc-consistency.md):

- swimlane_converter: 3 inline "(version 2)" annotations → explicit "(l2_perf_level >= 3)" / "(>= 4)".
- sched_overhead_analysis: module docstring + run_analysis docstring + Pop fallback message replace "v2" / "version >= 2" wording with explicit l2_perf level.
- {a2a3,a5}/runtime/tensormap_and_ringbuffer/docs/profiling_levels.md: the "v2 JSON" paragraph now spells out which file, which field, and which level is required.
- dep_gen_replay.{h,cpp}: schema block + file brief + INFO log call out v3 fields (tasks[].args[], buffer_numel, start_offset, strides).
- deps_to_graph.py / docs/dfx/dep_gen.md / simpler_setup/tools/README.md / test_dep_gen.py: residual "v2" labels updated to v3. (The load-bearing `version != 3` guard in deps_to_graph stays, because that consumer reads strided-only fields.)
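The load_deps_json change can be pictured with a small sketch. This is not the code from swimlane_converter: the function name `load_dep_edges` and the exact skip conditions are assumptions, and only the `edges[].pred` / `.succ` field names come from the description above. The point is that a consumer reading only those two fields can ignore "version" and skip bad edges one at a time.

```python
import json


def load_dep_edges(text):
    """Return (pred, succ) pairs from a deps.json payload.

    Deliberately ignores the "version" field: only edges[].pred and
    edges[].succ are read, so v2 and v3 payloads are treated alike.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        return []
    edges = []
    for edge in data.get("edges", []):
        # Skip a malformed entry rather than rejecting the whole file.
        if not isinstance(edge, dict):
            continue
        pred, succ = edge.get("pred"), edge.get("succ")
        if pred is None or succ is None:
            continue
        edges.append((pred, succ))
    return edges
```

A consumer that does read strided-only fields, as deps_to_graph is described as doing, would still want an explicit version check; dropping the guard is only safe when every field read is stable across schemas.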
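The l2_perf_records.json fix amounts to checking for phase data directly instead of comparing "version" against a schema number. A minimal sketch of that idea follows; the function name `summarize_phases` and the keys `phases_by_thread` and `duration` are hypothetical placeholders, not the real field names in l2_perf_records.json.

```python
def summarize_phases(records):
    """Total phase duration per scheduler thread, or {} if none was captured."""
    # Hypothetical key; the real layout of l2_perf_records.json may differ.
    phases_by_thread = records.get("phases_by_thread") or {}
    if not phases_by_thread:
        # Covers captures taken below the SCHED_PHASES level without
        # inspecting "version", which holds an L2PerfLevel rather than
        # a schema number.
        return {}
    return {
        thread: sum(phase.get("duration", 0) for phase in phases)
        for thread, phases in phases_by_thread.items()
    }
```

Because the emptiness check keys off the data itself, the same code path handles captures at any profiling level, which is the property the dropped `version < 2` short-circuit failed to provide.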