diff --git a/.kilo/command/benchmarks-affected.md b/.kilo/command/benchmarks-affected.md new file mode 100644 index 0000000..158cb9e --- /dev/null +++ b/.kilo/command/benchmarks-affected.md @@ -0,0 +1,34 @@ +--- +description: Scan current branch and report impacted benchmark targets/functions. +--- + +# Benchmarks Affected + +Identify which benchmark binaries and benchmark functions are affected by changes on the current branch. + +Use the `benchmarks-affected` skill as the single source of truth for workflow details and guardrails. +Do not duplicate or override the skill instructions in this command. + +## Inputs + +- Optional `--baseline ` (default: `main`) +- Optional `--compile-commands ` +- Optional `--no-include-working-tree` +- Optional `--format ` (default: `text`) + +## Workflow + +1. Execute the `benchmarks-affected` skill workflow. +2. Pass through command inputs to the analyzer invocation defined by the skill. +3. Report results with these sections: + - Changed files + - Affected benchmark targets + - Affected benchmark functions + - Suggested `--benchmark_filter` regex + - Any warnings/failures + +## Output rules + +1. If `affected_benchmarks` is non-empty, prioritize those names. +2. If `affected_benchmarks` is empty but benchmark targets are affected, mark result as partial and include target-level impact. +3. Do not run full benchmark suites in this command; this command is for impact discovery only. diff --git a/.kilo/command/perf-review.md b/.kilo/command/perf-review.md index 3fff142..4b1ab95 100644 --- a/.kilo/command/perf-review.md +++ b/.kilo/command/perf-review.md @@ -19,119 +19,43 @@ Non-negotiable requirements: If arguments are omitted: - Default target branch to PR base branch from `gh pr view --json baseRefName` when available. - Fall back target branch to `main`. -- Default filter must be **targeted**, not full-suite: - - Derive from changed files and changed symbols. 
- - If `include/pixie/bitvector.h` changed in select path, default to `BM_Select` and add `BM_RankNonInterleaved` as control. - - Run full selected suites only as last resort when mapping fails. - -## Step 1 - Resolve Branches and Revisions - -1. Identify contender branch and hash: - - Contender branch: current checked-out branch (or `HEAD` if detached). - - Contender hash: `git rev-parse --short HEAD`. -2. Identify baseline branch: - - Use `--target` if provided. - - Else use PR base branch from GitHub CLI when available. - - Else use `main`. -3. Resolve baseline hash with `git rev-parse --short `. -4. Print branch and hash mapping before running benchmarks. - -## Step 2 - Select Relevant Benchmark Binaries - -Inspect changed files with: - -`git diff --name-only ...HEAD` - -Map file paths to benchmark binaries: - -| Changed path pattern | Benchmark binary | Coverage | -|---|---|---| -| `include/pixie/bitvector*`, `include/*bit_vector*`, `include/interleaved*` | `benchmarks` | BitVector rank/select | -| `include/rmm*` | `bench_rmm` | RmM tree operations | -| `include/louds*` | `louds_tree_benchmarks` | LOUDS traversal | -| `include/simd*`, `include/aligned*` | `alignment_comparison` | SIMD and alignment | -| `include/misc/*` | all relevant | Differential helpers | -| `CMakeLists.txt`, benchmark infra, broad/unknown changes | all benchmarks | Conservative full run | - -Available benchmark binaries: -- `benchmarks` -- `bench_rmm` -- `bench_rmm_sdsl` -- `louds_tree_benchmarks` -- `alignment_comparison` - -If the mapping is ambiguous, run all benchmark binaries but still apply a focused filter first. -If `--filter` is provided, pass it through as `--benchmark_filter`. -Print selected binaries and why they were selected. + +Filter handling: +- If `--filter` is provided, pass it through. +- Else use the filter produced by `benchmarks-affected` through `benchmarks-compare-revisions`. 
+- If no filter can be derived, run conservative full-binary compare for impacted binaries. + +## Step 1 - Resolve branches and hashes + +1. Resolve contender from current checkout (`HEAD`) and compute short hash. +2. Resolve baseline branch using precedence: `--target` -> PR base from `gh pr view --json baseRefName` -> `main`. +3. Resolve baseline short hash. +4. Print branch/hash mapping before benchmark execution. + +## Step 2 - Run timing comparison via skill (single source of truth) + +Use `benchmarks-compare-revisions` as the single source of truth for revision builds, benchmark scope, compare.py flow, retry policy, and guardrails. + +Do not duplicate or override its internal build/run steps in this command. + +Pass-through inputs: +- Baseline ref/hash from Step 1. +- Contender ref/hash from Step 1. +- Optional `--filter` override. + +Consume outputs from `benchmarks-compare-revisions`: +- Baseline and contender benchmark JSON artifacts. +- compare.py output per binary. +- Effective filter used. +- Scope metadata from `benchmarks-affected` (`affected_benchmark_targets`, `affected_benchmarks`) when available. Execution guardrails: -- Do not use background jobs (`nohup`, `&`) for benchmark runs in CI. -- Do not interleave multiple benchmark runs into one shell command stream. -- Run one benchmark command at a time and wait for completion. - -## Step 3 - Build Both Revisions (Timing and Profiling Builds) - -Use isolated build directories per short hash. - -1. Capture original ref (`git rev-parse --abbrev-ref HEAD` or detached `HEAD`). -2. If worktree is dirty, stash safely with untracked files: - - `git stash push -u -m "perf-review-auto-stash"` -3. 
Build baseline revision: - - `git checkout ` - - Timing build (required): - - `cmake -B build/benchmarks-all_bench_ -DCMAKE_BUILD_TYPE=Release -DPIXIE_BENCHMARKS=ON` - - `cmake --build build/benchmarks-all_bench_ --config Release -j` - - Profiling build (Linux only, recommended): - - `cmake -B build/benchmarks-diagnostic_bench_ -DCMAKE_BUILD_TYPE=RelWithDebInfo -DPIXIE_BENCHMARKS=ON -DBENCHMARK_ENABLE_LIBPFM=ON -DPIXIE_DIAGNOSTICS=ON` - - `cmake --build build/benchmarks-diagnostic_bench_ --config RelWithDebInfo -j` -4. Build contender revision: - - `git checkout ` - - Repeat timing and profiling build with contender hash suffix. -5. Restore original ref and restore stashed state if a stash was created. - -Critical guardrails: -- Never use Debug binaries for timing review. -- Timing comparisons must use `benchmarks-all` Release builds. -- Profiling counters should use `benchmarks-diagnostic` RelWithDebInfo builds. - -## Step 4 - Resolve Binary Paths - -Support both generator layouts: - -- Multi-config: `build//Release/` or `build//RelWithDebInfo/` -- Single-config: `build//` - -For each needed binary, detect the existing executable path before running. -If a required binary is missing, report failure and stop with a blocked verdict. - -## Step 5 - Run Timing Comparison (Primary Judgment) - -Use a deterministic JSON-first workflow. Do not rely on long-running `compare.py` binary-vs-binary mode. - -1. Verify Python benchmark tooling once before runs: - - `python3 -c "import numpy, scipy"` -2. For each selected benchmark binary, run baseline then contender sequentially, each with explicit JSON out: - - `--benchmark_filter=""` - - `--benchmark_format=json` - - `--benchmark_out=.json` - - `--benchmark_report_aggregates_only=true` - - `--benchmark_display_aggregates_only=true` -3. Suppress benchmark stdout/stderr noise when generating JSON artifacts so files stay valid: - - `> .log 2>&1` -4. 
Validate both JSON files before comparison: - - `python3 -m json.tool .json > /dev/null` -5. Compare using one of: - - `python3 -a benchmarks ` - - or a deterministic local Python diff script over aggregate means. -6. Keep raw JSON files and comparison output for auditability. - -Timeout and retry policy: -- Use command timeouts that match benchmark scope. -- If a run times out once, narrow filter immediately and retry once. -- Maximum retry count per benchmark group: 1. -- If still timing out, produce a blocked/partial verdict with explicit scope limitations. - -## Step 6 - Collect Hardware Counter Profiles (Linux Only) +- Run benchmarks sequentially. +- No background jobs (`nohup`, `&`). +- Use Release timing builds only. +- If timing comparison fails, return blocked verdict with exact failure points. + +## Step 3 - Collect hardware counter profiles (Linux only, optional) Run a preflight first to avoid wasted attempts: 1. Execute one tiny benchmark with perf counters (e.g. one benchmark case) and inspect output for counter availability. @@ -157,7 +81,7 @@ Compute derived metrics when denominators are non-zero: If profiling is unavailable (non-Linux, libpfm missing, or perf permissions blocked), continue with timing-only review and explicitly mark profiling as unavailable in the report. -## Step 7 - Analyze Timing and Counter Data +## Step 4 - Analyze timing and counter data Timing classification per benchmark entry: - Improvement: time delta < -5% @@ -181,7 +105,7 @@ Noise-control expectations: - Include at least one control benchmark family expected to be unaffected by the code change. - Treat isolated swings without pattern as noise unless reproduced across related sizes/fill ratios. 
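The timing classification above can be applied mechanically to the JSON artifacts kept from the timing runs. A minimal sketch, assuming Google Benchmark's aggregate JSON layout (`run_type == "aggregate"`, `aggregate_name == "mean"`) and a symmetric ±5% band; the +5% regression cutoff mirrors the stated -5% improvement cutoff and is an assumption here:

```python
import json

# Sketch: classify per-benchmark timing deltas between baseline and contender
# Google Benchmark JSON artifacts. Thresholds: < -5% improvement, > +5%
# regression (assumed symmetric), otherwise neutral.

def classify_deltas(baseline_path: str, contender_path: str) -> dict[str, str]:
    def load_means(path: str) -> dict[str, float]:
        with open(path, encoding="utf-8") as handle:
            report = json.load(handle)
        return {
            entry["name"]: entry["real_time"]
            for entry in report.get("benchmarks", [])
            if entry.get("run_type") == "aggregate"
            and entry.get("aggregate_name") == "mean"
        }

    base = load_means(baseline_path)
    cont = load_means(contender_path)
    verdicts: dict[str, str] = {}
    # Only benchmarks present in both runs are comparable.
    for name in sorted(base.keys() & cont.keys()):
        delta_pct = (cont[name] - base[name]) / base[name] * 100.0
        if delta_pct < -5.0:
            verdicts[name] = "improvement"
        elif delta_pct > 5.0:
            verdicts[name] = "regression"
        else:
            verdicts[name] = "neutral"
    return verdicts
```

Benchmarks missing from either artifact are skipped rather than guessed at, which keeps a partial run from producing spurious verdicts.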
-## Step 8 - Produce Final Markdown Report +## Step 5 - Produce final markdown report Return a structured markdown report with this shape: diff --git a/.kilo/skills/benchmarks-affected/SKILL.md b/.kilo/skills/benchmarks-affected/SKILL.md new file mode 100644 index 0000000..e886c92 --- /dev/null +++ b/.kilo/skills/benchmarks-affected/SKILL.md @@ -0,0 +1,77 @@ +--- +name: benchmarks-affected +description: Analyze current branch versus a baseline and extract affected benchmark targets and benchmark functions using compile_commands and clang AST. +--- + +# Benchmarks Affected Skill + +Use this skill to identify exactly which benchmark binaries and benchmark functions are affected by code changes on the current branch. + +It implements a two-stage workflow: + +1. `compile_commands.json` analysis to determine affected compile targets. +2. Clang AST analysis to determine affected benchmark functions. + +## Goal + +Given `HEAD` and a baseline branch (default `main`), produce: + +- Changed files. +- Affected targets (with emphasis on benchmark targets). +- Exact benchmark functions impacted by the changes. +- A ready-to-use Google Benchmark filter regex. + +## Prerequisites + +1. Build tree with benchmarks enabled and compile database exported: + +```bash +BUILD_SUFFIX=local +cmake -B build/benchmarks-all_${BUILD_SUFFIX} \ + -DCMAKE_BUILD_TYPE=Release \ + -DPIXIE_BENCHMARKS=ON \ + -DCMAKE_EXPORT_COMPILE_COMMANDS=ON +cmake --build build/benchmarks-all_${BUILD_SUFFIX} --config Release -j +``` + +2. `clang++` must be available on `PATH` (used for AST dump). + +## Run + +```bash +python3 .kilo/skills/benchmarks-affected/analyze_benchmarks_affected.py \ + --baseline main \ + --compile-commands build/benchmarks-all_local/compile_commands.json \ + --format json +``` + +If `--compile-commands` is omitted, the script auto-selects the most recently modified `build/**/compile_commands.json`. +Working tree changes are included by default. 
Use `--no-include-working-tree` to restrict analysis to `...HEAD` only. + +## Output + +The analyzer reports: + +- `affected_targets`: impacted CMake targets inferred from compile dependency analysis. +- `affected_benchmark_targets`: subset of benchmark binaries impacted. +- `affected_benchmarks`: precise benchmark function names from AST-level call analysis. +- `suggested_filter_regex`: regex to pass as `--benchmark_filter`. + +## How to Use Findings + +1. Build only impacted benchmark binaries where feasible. +2. Run benchmark binaries with the suggested filter: + +```bash +FILTER='^(BM_RankNonInterleaved|BM_SelectNonInterleaved)(/|$)' +build/benchmarks-all_local/benchmarks --benchmark_filter="${FILTER}" +``` + +3. If impact mapping is broad/uncertain, run full binary for selected benchmark target(s). + +## Guardrails + +1. Keep baseline comparison at merge-base style diff: `...HEAD`. +2. Use Release binaries for timing runs. +3. If AST parse fails for a TU, still trust compile target impact and mark benchmark-function scope as partial. +4. If benchmark infra (`CMakeLists.txt`, benchmark source layout) changed, fall back to conservative benchmark selection. 
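The analyzer's JSON report can be consumed programmatically by downstream commands. A minimal sketch, assuming only the output keys documented above (`affected_benchmark_targets`, `suggested_filter_regex`); the build-directory path is illustrative, not prescribed:

```python
import json

# Sketch: turn the analyzer's JSON report into one benchmark invocation per
# impacted benchmark target. BUILD_DIR is an assumption (single-config
# layout); adjust to the actual build tree.

BUILD_DIR = "build/benchmarks-all_local"

def benchmark_commands(report_json: str) -> list[list[str]]:
    report = json.loads(report_json)
    filter_regex = report.get("suggested_filter_regex")
    commands: list[list[str]] = []
    for target in report.get("affected_benchmark_targets", []):
        cmd = [f"{BUILD_DIR}/{target}"]
        if filter_regex:
            cmd.append(f"--benchmark_filter={filter_regex}")
        # If no filter could be derived, the command falls back to the full
        # binary for this target (the "partial" case in the output rules).
        commands.append(cmd)
    return commands
```

Running each returned command sequentially matches the execution guardrails (no background jobs, one benchmark run at a time).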
diff --git a/.kilo/skills/benchmarks-affected/analyze_benchmarks_affected.py b/.kilo/skills/benchmarks-affected/analyze_benchmarks_affected.py new file mode 100644 index 0000000..1aa3b4a --- /dev/null +++ b/.kilo/skills/benchmarks-affected/analyze_benchmarks_affected.py @@ -0,0 +1,988 @@ +#!/usr/bin/env python3 + +from __future__ import annotations + +import argparse +import json +import re +import shlex +import shutil +import subprocess +import sys +from collections import defaultdict +from dataclasses import dataclass, field +from pathlib import Path +from typing import Any + + +KNOWN_BENCHMARK_TARGETS = { + "benchmarks", + "bench_rmm", + "bench_rmm_sdsl", + "louds_tree_benchmarks", + "alignment_comparison", +} + +BUILD_INFRA_FILES = { + "CMakeLists.txt", + "CMakePresets.json", +} + +DIFF_HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@") + +CPP_FUNCTION_START_RE = re.compile( + r"^\s*" + r"(?:template\s*<[^>]*>\s*)?" + r"(?:(?:inline|constexpr|consteval|constinit|static|friend|virtual|explicit)\s+)*" + r"[A-Za-z_~][\w:<>,\s\*&\[\]]*\s+" + r"([~A-Za-z_][A-Za-z0-9_]*)\s*" + r"\([^;{}]*\)\s*" + r"(?:const\s*)?" + r"(?:noexcept\s*)?" 
+ r"(?:->\s*[^\{]+)?\{" +) + + +@dataclass +class CompileCommandEntry: + directory: Path + source: Path + arguments: list[str] + output: Path | None + target: str | None + dependencies: set[Path] = field(default_factory=set) + dep_error: str | None = None + + +@dataclass +class AstImpactResult: + benchmark_names: set[str] = field(default_factory=set) + affected_names: set[str] = field(default_factory=set) + ast_error: str | None = None + + +def run_command( + args: list[str], + cwd: Path, + check: bool = True, +) -> subprocess.CompletedProcess[str]: + return subprocess.run( + args, + cwd=str(cwd), + text=True, + capture_output=True, + check=check, + ) + + +def parse_args() -> argparse.Namespace: + parser = argparse.ArgumentParser( + description=( + "Analyze benchmark impact between baseline and HEAD via " + "compile_commands dependency mapping and clang AST analysis." + ) + ) + parser.add_argument( + "--baseline", + default="main", + help="Baseline ref used as ...HEAD (default: main).", + ) + parser.add_argument( + "--head", + default="HEAD", + help="Contender ref (default: HEAD).", + ) + parser.add_argument( + "--compile-commands", + default=None, + help=( + "Path to compile_commands.json. If omitted, auto-discovers most " + "recent build/**/compile_commands.json." + ), + ) + parser.add_argument( + "--clangxx", + default=None, + help="clang++ executable for AST dump (auto-detected if omitted).", + ) + parser.add_argument( + "--format", + choices=["text", "json"], + default="text", + help="Output format (default: text).", + ) + parser.add_argument( + "--include-working-tree", + dest="include_working_tree", + action="store_true", + default=True, + help=( + "Include local unstaged/staged changes in changed-files set, " + "in addition to ... (default: enabled)." 
+ ), + ) + parser.add_argument( + "--no-include-working-tree", + dest="include_working_tree", + action="store_false", + help="Disable working-tree inclusion and only analyze ....", + ) + return parser.parse_args() + + +def git_repo_root() -> Path: + proc = run_command(["git", "rev-parse", "--show-toplevel"], cwd=Path.cwd()) + return Path(proc.stdout.strip()).resolve() + + +def resolve_compile_commands(repo_root: Path, explicit_path: str | None) -> Path: + if explicit_path: + compile_path = Path(explicit_path) + if not compile_path.is_absolute(): + compile_path = (repo_root / compile_path).resolve() + if not compile_path.exists(): + raise FileNotFoundError(f"compile_commands.json not found: {compile_path}") + return compile_path + + candidates = sorted( + repo_root.glob("build/**/compile_commands.json"), + key=lambda path: path.stat().st_mtime, + reverse=True, + ) + if not candidates: + raise FileNotFoundError( + "No compile_commands.json found under build/**. " + "Configure with -DCMAKE_EXPORT_COMPILE_COMMANDS=ON first." 
+ ) + return candidates[0].resolve() + + +def load_compile_commands(compile_commands_path: Path) -> list[CompileCommandEntry]: + entries: list[CompileCommandEntry] = [] + data = json.loads(compile_commands_path.read_text(encoding="utf-8")) + for raw_entry in data: + directory = Path(raw_entry["directory"]).resolve() + + raw_source = Path(raw_entry["file"]) + if raw_source.is_absolute(): + source = raw_source.resolve() + else: + source = (directory / raw_source).resolve() + + if "arguments" in raw_entry: + arguments = [str(arg) for arg in raw_entry["arguments"]] + else: + arguments = shlex.split(raw_entry["command"]) + + output = infer_output_path(arguments, directory) + target = infer_cmake_target_from_output(output) + + entries.append( + CompileCommandEntry( + directory=directory, + source=source, + arguments=arguments, + output=output, + target=target, + ) + ) + return entries + + +def infer_output_path(arguments: list[str], directory: Path) -> Path | None: + output_token: str | None = None + for idx, arg in enumerate(arguments): + if arg == "-o" and idx + 1 < len(arguments): + output_token = arguments[idx + 1] + elif arg.startswith("-o") and len(arg) > 2: + output_token = arg[2:] + elif arg.startswith("/Fo") and len(arg) > 3: + output_token = arg[3:] + + if output_token is None: + return None + + out_path = Path(output_token) + if out_path.is_absolute(): + return out_path.resolve() + return (directory / out_path).resolve() + + +def infer_cmake_target_from_output(output: Path | None) -> str | None: + if output is None: + return None + parts = output.parts + for index, part in enumerate(parts): + if part == "CMakeFiles" and index + 1 < len(parts): + target_part = parts[index + 1] + if target_part.endswith(".dir"): + return target_part[: -len(".dir")] + return target_part + return None + + +def git_changed_files(repo_root: Path, baseline: str, head: str) -> set[Path]: + diff_range = f"{baseline}...{head}" + proc = run_command(["git", "diff", "--name-only", 
diff_range], cwd=repo_root) + changed_files: set[Path] = set() + for line in proc.stdout.splitlines(): + line = line.strip() + if not line: + continue + changed_files.add((repo_root / line).resolve()) + return changed_files + + +def git_working_tree_changed_files(repo_root: Path) -> set[Path]: + changed_files: set[Path] = set() + commands = [ + ["git", "diff", "--name-only"], + ["git", "diff", "--name-only", "--cached"], + ] + for cmd in commands: + proc = run_command(cmd, cwd=repo_root) + for line in proc.stdout.splitlines(): + line = line.strip() + if not line: + continue + changed_files.add((repo_root / line).resolve()) + return changed_files + + +def parse_changed_lines_from_diff_text( + diff_text: str, + repo_root: Path, +) -> dict[Path, set[int]]: + changed_lines: dict[Path, set[int]] = defaultdict(set) + + current_file: Path | None = None + in_hunk = False + new_line = 0 + + for raw_line in diff_text.splitlines(): + if raw_line.startswith("+++ "): + file_token = raw_line[4:].strip() + if file_token == "/dev/null": + current_file = None + in_hunk = False + continue + if file_token.startswith("b/"): + file_token = file_token[2:] + current_file = (repo_root / file_token).resolve() + in_hunk = False + continue + + hunk_match = DIFF_HUNK_RE.match(raw_line) + if hunk_match: + in_hunk = current_file is not None + new_line = int(hunk_match.group(1)) + continue + + if not in_hunk or current_file is None: + continue + + if raw_line.startswith("+") and not raw_line.startswith("+++"): + changed_lines[current_file].add(new_line) + new_line += 1 + continue + + if raw_line.startswith("-") and not raw_line.startswith("---"): + continue + + if raw_line.startswith(" "): + new_line += 1 + continue + + return changed_lines + + +def git_changed_line_map( + repo_root: Path, + baseline: str, + head: str, + include_working_tree: bool, +) -> dict[Path, set[int]]: + changed_lines: dict[Path, set[int]] = defaultdict(set) + + proc = run_command( + ["git", "diff", "--unified=0", 
f"{baseline}...{head}"], + cwd=repo_root, + ) + baseline_map = parse_changed_lines_from_diff_text(proc.stdout, repo_root) + for path, lines in baseline_map.items(): + changed_lines[path].update(lines) + + if include_working_tree: + for cmd in ( + ["git", "diff", "--unified=0"], + ["git", "diff", "--cached", "--unified=0"], + ): + wt_proc = run_command(cmd, cwd=repo_root) + wt_map = parse_changed_lines_from_diff_text(wt_proc.stdout, repo_root) + for path, lines in wt_map.items(): + changed_lines[path].update(lines) + + return changed_lines + + +def extract_changed_symbol_names_from_file( + file_path: Path, + changed_lines: set[int], +) -> set[str]: + if not changed_lines or not file_path.exists(): + return set() + + lines = file_path.read_text(encoding="utf-8", errors="replace").splitlines() + symbols: set[str] = set() + + line_index = 1 + max_line = len(lines) + while line_index <= max_line: + line = lines[line_index - 1] + match = CPP_FUNCTION_START_RE.match(line) + if not match: + line_index += 1 + continue + + symbol_name = match.group(1) + start_line = line_index + brace_depth = line.count("{") - line.count("}") + end_line = start_line + + while brace_depth > 0 and end_line < max_line: + end_line += 1 + body_line = lines[end_line - 1] + brace_depth += body_line.count("{") - body_line.count("}") + + if any(start_line <= line_no <= end_line for line_no in changed_lines): + symbols.add(symbol_name) + + line_index = end_line + 1 + + return symbols + + +def collect_changed_symbol_names( + changed_line_map: dict[Path, set[int]], +) -> set[str]: + symbol_names: set[str] = set() + for file_path, changed_lines in changed_line_map.items(): + symbol_names.update( + extract_changed_symbol_names_from_file(file_path, changed_lines) + ) + return symbol_names + + +def clean_command_for_dependency_scan(arguments: list[str]) -> list[str]: + cleaned: list[str] = [] + skip_next = False + flags_with_value = { + "-o", + "-MF", + "-MT", + "-MQ", + "-MJ", + "-Xclang", + } + 
standalone_drop = { + "-c", + "-MD", + "-MMD", + "-MP", + "-MM", + "-M", + "-E", + "-S", + } + + index = 0 + while index < len(arguments): + arg = arguments[index] + if skip_next: + skip_next = False + index += 1 + continue + + if arg in flags_with_value: + skip_next = True + index += 1 + continue + if arg in standalone_drop: + index += 1 + continue + if arg.startswith("-o") and len(arg) > 2: + index += 1 + continue + if arg.startswith("-MF") and len(arg) > 3: + index += 1 + continue + if arg.startswith("-MT") and len(arg) > 3: + index += 1 + continue + if arg.startswith("-MQ") and len(arg) > 3: + index += 1 + continue + if arg.startswith("-MJ") and len(arg) > 3: + index += 1 + continue + + cleaned.append(arg) + index += 1 + + return cleaned + + +def parse_makefile_dependencies(stdout_text: str) -> list[str]: + flattened = stdout_text.replace("\\\n", " ").replace("\n", " ") + if ":" not in flattened: + return [] + dep_payload = flattened.split(":", 1)[1].strip() + if not dep_payload: + return [] + return shlex.split(dep_payload) + + +def compute_tu_dependencies(entry: CompileCommandEntry) -> None: + dep_cmd = clean_command_for_dependency_scan(entry.arguments) + if not dep_cmd: + entry.dep_error = "Empty compile command after sanitization" + entry.dependencies = {entry.source} + return + + dep_cmd.extend(["-MM", "-MF", "-", "-MT", "__pixie_tu__"]) + source_arg = str(entry.source) + if source_arg not in dep_cmd: + dep_cmd.append(source_arg) + + try: + proc = run_command(dep_cmd, cwd=entry.directory, check=False) + except FileNotFoundError as exc: + entry.dep_error = str(exc) + entry.dependencies = {entry.source} + return + + dependencies: set[Path] = {entry.source} + if proc.returncode != 0: + stderr = proc.stderr.strip() + entry.dep_error = ( + stderr if stderr else f"Dependency scan failed ({proc.returncode})" + ) + entry.dependencies = dependencies + return + + for dep in parse_makefile_dependencies(proc.stdout): + dep_path = Path(dep) + resolved = ( + 
dep_path.resolve() + if dep_path.is_absolute() + else (entry.directory / dep_path).resolve() + ) + dependencies.add(resolved) + + entry.dependencies = dependencies + + +def is_build_infra_change(repo_root: Path, changed: set[Path]) -> bool: + for path in changed: + if path.name in BUILD_INFRA_FILES: + return True + try: + rel = path.relative_to(repo_root) + except ValueError: + continue + rel_text = rel.as_posix() + if rel_text.startswith("cmake/"): + return True + return False + + +def identify_benchmark_targets( + entries: list[CompileCommandEntry], repo_root: Path +) -> set[str]: + benchmark_targets: set[str] = set() + for entry in entries: + if entry.target is None: + continue + try: + rel = entry.source.relative_to(repo_root) + rel_text = rel.as_posix() + except ValueError: + rel_text = entry.source.as_posix() + + if rel_text.startswith("src/benchmarks/"): + benchmark_targets.add(entry.target) + + benchmark_targets.update(KNOWN_BENCHMARK_TARGETS) + return benchmark_targets + + +def discover_clangxx(explicit: str | None) -> str: + if explicit: + return explicit + + candidates = [ + "clang++", + "clang++-19", + "clang++-18", + "clang++-17", + "clang++-16", + ] + for candidate in candidates: + resolved = shutil.which(candidate) + if resolved: + return resolved + raise FileNotFoundError( + "clang++ was not found on PATH. Provide --clangxx to select a clang compiler." 
+ ) + + +def clean_command_for_ast(arguments: list[str], clangxx: str) -> list[str]: + cleaned = clean_command_for_dependency_scan(arguments) + if not cleaned: + return [] + cleaned[0] = clangxx + cleaned.extend(["-Xclang", "-ast-dump=json", "-fsyntax-only"]) + return cleaned + + +def normalize_path_candidate(path_text: str | None, working_dir: Path) -> Path | None: + if not path_text: + return None + path = Path(path_text) + if path.is_absolute(): + return path.resolve() + return (working_dir / path).resolve() + + +def file_from_loc(loc: dict[str, Any] | None, working_dir: Path) -> Path | None: + if not isinstance(loc, dict): + return None + if "file" in loc: + return normalize_path_candidate(str(loc["file"]), working_dir) + for nested_key in ("spellingLoc", "expansionLoc", "includedFrom"): + nested_loc = loc.get(nested_key) + if isinstance(nested_loc, dict): + resolved = file_from_loc(nested_loc, working_dir) + if resolved is not None: + return resolved + return None + + +def iter_ast_nodes(node: Any): + if isinstance(node, dict): + yield node + inner = node.get("inner", []) + if isinstance(inner, list): + for child in inner: + yield from iter_ast_nodes(child) + elif isinstance(node, list): + for item in node: + yield from iter_ast_nodes(item) + + +def referenced_decl_file(node: dict[str, Any], working_dir: Path) -> Path | None: + referenced = node.get("referencedDecl") + if not isinstance(referenced, dict): + return None + return file_from_loc(referenced.get("loc"), working_dir) + + +def node_references_changed_symbol( + node: dict[str, Any], + changed_symbol_names: set[str], +) -> bool: + if not changed_symbol_names: + return False + + for subnode in iter_ast_nodes(node): + if not isinstance(subnode, dict): + continue + + kind = subnode.get("kind") + if kind == "MemberExpr": + member_name = subnode.get("name") + if isinstance(member_name, str) and member_name in changed_symbol_names: + return True + + if kind == "DeclRefExpr": + ref_decl = 
subnode.get("referencedDecl") + if not isinstance(ref_decl, dict): + continue + ref_name = ref_decl.get("name") + if isinstance(ref_name, str) and ref_name in changed_symbol_names: + return True + + return False + + +def call_expr_callee_name(call_expr: dict[str, Any]) -> str | None: + for node in iter_ast_nodes(call_expr): + if not isinstance(node, dict): + continue + if node.get("kind") != "DeclRefExpr": + continue + referenced = node.get("referencedDecl") + if isinstance(referenced, dict) and isinstance(referenced.get("name"), str): + return referenced["name"] + return None + + +def string_literals_in_node(node: dict[str, Any]) -> list[str]: + values: list[str] = [] + for cur in iter_ast_nodes(node): + if not isinstance(cur, dict): + continue + if cur.get("kind") != "StringLiteral": + continue + value = cur.get("value") + if isinstance(value, str): + if len(value) >= 2 and value[0] == '"' and value[-1] == '"': + value = value[1:-1] + values.append(value) + return values + + +def benchmark_names_from_source(source: Path) -> set[str]: + names: set[str] = set() + if not source.exists(): + return names + text = source.read_text(encoding="utf-8", errors="replace") + for match in re.finditer(r"BENCHMARK\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*\)", text): + names.add(match.group(1)) + for match in re.finditer(r"register_op\(\s*\"([^\"]+)\"", text): + names.add(match.group(1)) + return names + + +def ast_analyze_entry( + entry: CompileCommandEntry, + changed_files: set[Path], + changed_symbol_names: set[str], + clangxx: str, +) -> AstImpactResult: + result = AstImpactResult() + + ast_cmd = clean_command_for_ast(entry.arguments, clangxx) + if not ast_cmd: + result.ast_error = "Failed to build AST command" + return result + + try: + proc = run_command(ast_cmd, cwd=entry.directory, check=False) + except FileNotFoundError as exc: + result.ast_error = str(exc) + return result + + if proc.returncode != 0: + stderr = proc.stderr.strip() + result.ast_error = ( + stderr if stderr else 
f"AST command failed ({proc.returncode})" + ) + return result + + try: + ast_root = json.loads(proc.stdout) + except json.JSONDecodeError as exc: + result.ast_error = f"Invalid AST JSON: {exc}" + return result + + for node in iter_ast_nodes(ast_root): + if not isinstance(node, dict): + continue + + kind = node.get("kind") + if kind in {"FunctionDecl", "CXXMethodDecl"}: + name = node.get("name") + if isinstance(name, str) and name.startswith("BM_"): + result.benchmark_names.add(name) + function_loc = file_from_loc(node.get("loc"), entry.directory) + if function_loc in changed_files: + result.affected_names.add(name) + continue + + if node_references_changed_symbol(node, changed_symbol_names): + result.affected_names.add(name) + continue + + for subnode in iter_ast_nodes(node): + if not isinstance(subnode, dict): + continue + ref_file = referenced_decl_file(subnode, entry.directory) + if ref_file in changed_files: + result.affected_names.add(name) + break + + if kind == "CallExpr": + callee = call_expr_callee_name(node) + if callee != "register_op": + continue + literal_values = string_literals_in_node(node) + if not literal_values: + continue + benchmark_name = literal_values[0] + result.benchmark_names.add(benchmark_name) + + call_loc = file_from_loc(node.get("loc"), entry.directory) + if call_loc in changed_files: + result.affected_names.add(benchmark_name) + continue + + if node_references_changed_symbol(node, changed_symbol_names): + result.affected_names.add(benchmark_name) + continue + + for subnode in iter_ast_nodes(node): + if not isinstance(subnode, dict): + continue + ref_file = referenced_decl_file(subnode, entry.directory) + if ref_file in changed_files: + result.affected_names.add(benchmark_name) + break + + return result + + +def regex_for_benchmarks(names: set[str]) -> str | None: + if not names: + return None + ordered = sorted(names) + body = "|".join(re.escape(name) for name in ordered) + return rf"^({body})(/|$)" + + +def relpath_or_abs(path: 
Path, root: Path) -> str: + try: + return path.relative_to(root).as_posix() + except ValueError: + return path.as_posix() + + +def main() -> int: + cli = parse_args() + + try: + repo_root = git_repo_root() + changed_files = git_changed_files(repo_root, cli.baseline, cli.head) + if cli.include_working_tree: + changed_files.update(git_working_tree_changed_files(repo_root)) + changed_line_map = git_changed_line_map( + repo_root, + cli.baseline, + cli.head, + cli.include_working_tree, + ) + changed_symbol_names = collect_changed_symbol_names(changed_line_map) + compile_commands_path = resolve_compile_commands( + repo_root, cli.compile_commands + ) + entries = load_compile_commands(compile_commands_path) + except FileNotFoundError as exc: + print(f"error: {exc}", file=sys.stderr) + return 2 + except subprocess.CalledProcessError as exc: + stderr = (exc.stderr or "").strip() + if stderr: + print(f"error: {stderr}", file=sys.stderr) + else: + print(f"error: command failed: {' '.join(exc.cmd)}", file=sys.stderr) + return 2 + + target_to_entries: dict[str, list[CompileCommandEntry]] = defaultdict(list) + for entry in entries: + if entry.target: + target_to_entries[entry.target].append(entry) + + for entry in entries: + compute_tu_dependencies(entry) + + affected_targets: set[str] = set() + for entry in entries: + has_changed_dependency = any(dep in changed_files for dep in entry.dependencies) + if has_changed_dependency: + if entry.target: + affected_targets.add(entry.target) + + benchmark_targets = identify_benchmark_targets(entries, repo_root) + all_targets = {entry.target for entry in entries if entry.target} + + infra_change = is_build_infra_change(repo_root, changed_files) + if infra_change: + affected_targets.update(all_targets) + + affected_benchmark_targets = sorted( + affected_targets.intersection(benchmark_targets) + ) + + benchmark_entries: list[CompileCommandEntry] = [] + for entry in entries: + if entry.target in benchmark_targets: + 
benchmark_entries.append(entry) + + impacted_benchmark_entries: list[CompileCommandEntry] = [] + for entry in benchmark_entries: + has_changed_dependency = any(dep in changed_files for dep in entry.dependencies) + if has_changed_dependency or entry.target in affected_benchmark_targets: + impacted_benchmark_entries.append(entry) + + ast_errors: dict[str, str] = {} + benchmark_target_to_names: dict[str, set[str]] = defaultdict(set) + benchmark_target_to_affected: dict[str, set[str]] = defaultdict(set) + warnings: list[str] = [] + + if impacted_benchmark_entries: + try: + clangxx = discover_clangxx(cli.clangxx) + except FileNotFoundError as exc: + clangxx = "" + warnings.append(str(exc)) + + for entry in impacted_benchmark_entries: + target_name = entry.target or "" + + if not clangxx: + fallback_names = benchmark_names_from_source(entry.source) + benchmark_target_to_names[target_name].update(fallback_names) + if changed_symbol_names: + benchmark_target_to_affected[target_name].update(fallback_names) + continue + + ast_result = ast_analyze_entry( + entry, + changed_files, + changed_symbol_names, + clangxx, + ) + if ast_result.ast_error: + ast_errors[relpath_or_abs(entry.source, repo_root)] = ( + ast_result.ast_error + ) + + if ast_result.benchmark_names: + benchmark_target_to_names[target_name].update( + ast_result.benchmark_names + ) + else: + benchmark_target_to_names[target_name].update( + benchmark_names_from_source(entry.source) + ) + + if ast_result.affected_names: + benchmark_target_to_affected[target_name].update( + ast_result.affected_names + ) + else: + source_is_changed = entry.source in changed_files + if source_is_changed: + benchmark_target_to_affected[target_name].update( + benchmark_target_to_names[target_name] + ) + + if infra_change and affected_benchmark_targets: + for target_name in affected_benchmark_targets: + for entry in target_to_entries.get(target_name, []): + benchmark_target_to_names[target_name].update( + 
benchmark_names_from_source(entry.source) + ) + benchmark_target_to_affected[target_name].update( + benchmark_names_from_source(entry.source) + ) + + all_affected_benchmarks: set[str] = set() + for names in benchmark_target_to_affected.values(): + all_affected_benchmarks.update(names) + + dep_scan_failures = { + relpath_or_abs(entry.source, repo_root): entry.dep_error + for entry in entries + if entry.dep_error + } + + report: dict[str, Any] = { + "baseline": cli.baseline, + "head": cli.head, + "include_working_tree": cli.include_working_tree, + "changed_symbols": sorted(changed_symbol_names), + "compile_commands": relpath_or_abs(compile_commands_path, repo_root), + "changed_files": sorted( + relpath_or_abs(path, repo_root) for path in changed_files + ), + "affected_targets": sorted(affected_targets), + "affected_benchmark_targets": affected_benchmark_targets, + "affected_benchmarks": { + target: sorted(names) + for target, names in sorted(benchmark_target_to_affected.items()) + if names + }, + "suggested_filter_regex": regex_for_benchmarks(all_affected_benchmarks), + "dependency_scan_failures": dep_scan_failures, + "ast_failures": ast_errors, + "warnings": warnings, + } + + if cli.format == "json": + json.dump(report, sys.stdout, indent=2) + sys.stdout.write("\n") + return 0 + + print(f"Baseline: {cli.baseline}") + print(f"Head: {cli.head}") + print(f"Compile commands: {report['compile_commands']}") + print("") + + print(f"Changed files ({len(report['changed_files'])}):") + for item in report["changed_files"]: + print(f"- {item}") + if not report["changed_files"]: + print("- none") + + print("") + print(f"Affected targets ({len(report['affected_targets'])}):") + for item in report["affected_targets"]: + print(f"- {item}") + if not report["affected_targets"]: + print("- none") + + print("") + print(f"Affected benchmark targets ({len(report['affected_benchmark_targets'])}):") + for item in report["affected_benchmark_targets"]: + print(f"- {item}") + if not 
report["affected_benchmark_targets"]: + print("- none") + + print("") + print("Affected benchmark functions:") + if report["affected_benchmarks"]: + for target, names in report["affected_benchmarks"].items(): + print(f"- {target}:") + for name in names: + print(f" - {name}") + else: + print("- none") + + print("") + print("Suggested --benchmark_filter regex:") + print(report["suggested_filter_regex"] or "none") + + if dep_scan_failures: + print("") + print("Dependency scan failures:") + for source, error in dep_scan_failures.items(): + print(f"- {source}: {error}") + + if ast_errors: + print("") + print("AST failures:") + for source, error in ast_errors.items(): + print(f"- {source}: {error}") + + if warnings: + print("") + print("Warnings:") + for warning in warnings: + print(f"- {warning}") + + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/.kilo/skills/benchmarks-compare-revisions/SKILL.md b/.kilo/skills/benchmarks-compare-revisions/SKILL.md index 23996a5..49b0d6e 100644 --- a/.kilo/skills/benchmarks-compare-revisions/SKILL.md +++ b/.kilo/skills/benchmarks-compare-revisions/SKILL.md @@ -5,7 +5,12 @@ description: Compare benchmark performance between two git revisions by building # Benchmarks Compare Revisions Skill -Use this skill to compare performance between two git revisions. It focuses on the compare workflow and relies on the existing benchmarks skill for build/run details (see .kilo/skills/benchmarks/SKILL.md). +Use this skill to compare performance between two git revisions. + +This workflow now depends on: + +1. `.kilo/skills/benchmarks-affected/SKILL.md` to determine affected benchmark targets/functions and produce a benchmark filter. +2. `.kilo/skills/benchmarks/SKILL.md` for build/run operational details. 
## Goal @@ -21,7 +26,27 @@ BASELINE=abc1234 CONTENDER=def5678 ``` -## Step 1 — Build both revisions (Release only) +## Step 1 — Compute affected benchmark scope first + +Run `benchmarks-affected` from the contender checkout to derive the compare scope. + +Do not duplicate `benchmarks-affected` internals here (compile database selection, AST analysis, or fallback heuristics). Follow that skill directly and consume only its outputs. + +Inputs to pass through: + +- `--baseline ${BASELINE}` +- optional compile-commands path if auto-detection is not desired +- optional output format (`json` recommended for parsing) + +Consume these outputs from `benchmarks-affected`: + +- `suggested_filter_regex` -> set `FILTER` +- `affected_benchmark_targets` -> optionally constrain which benchmark binary/binaries to run +- `affected_benchmarks` -> function-level scope for validation/reporting + +If `FILTER` is empty, fall back to full benchmark binary compare (conservative mode). + +## Step 2 — Build both revisions (Release only) Use the existing benchmarks skill build steps, but set the build suffix to include the short hash for each revision: @@ -37,7 +62,7 @@ git checkout ${CONTENDER} # Follow .kilo/skills/benchmarks/SKILL.md build instructions with this suffix ``` -## Step 2 — Compare using compare.py +## Step 3 — Compare using compare.py Use Google Benchmark compare tooling with a JSON-first flow to avoid long-running binary-vs-binary retries. @@ -80,13 +105,16 @@ Run the comparison: python3 ${COMPARE_PY} -a benchmarks ${BASE_JSON} ${CONT_JSON} ``` -### Optional: filter to reduce noise and runtime - -Pass filter when generating JSON files: +Use the affected filter from Step 1 when generating JSON files: ```bash -FILTER="BM_Rank" -build/benchmarks-all_bench_${BASELINE}/benchmarks --benchmark_filter="${FILTER}" --benchmark_report_aggregates_only=true --benchmark_display_aggregates_only=true ... 
-build/benchmarks-all_bench_${CONTENDER}/benchmarks --benchmark_filter="${FILTER}" --benchmark_report_aggregates_only=true --benchmark_display_aggregates_only=true ... +if [ -n "${FILTER}" ]; then + FILTER_ARG="--benchmark_filter=${FILTER}" +else + FILTER_ARG="" +fi + +build/benchmarks-all_bench_${BASELINE}/benchmarks ${FILTER_ARG} --benchmark_report_aggregates_only=true --benchmark_display_aggregates_only=true ... +build/benchmarks-all_bench_${CONTENDER}/benchmarks ${FILTER_ARG} --benchmark_report_aggregates_only=true --benchmark_display_aggregates_only=true ... ``` ## Retry and Timeout Policy @@ -96,7 +124,7 @@ build/benchmarks-all_bench_${CONTENDER}/benchmarks --benchmark_filter="${FILTER} 3. Maximum retries per benchmark group: 1. 4. If still failing, emit blocked/partial findings instead of repeated attempts. -## Step 3 — Record findings +## Step 4 — Record findings Capture the compare.py output (terminal transcript or redirected file) and note any statistically significant regressions or wins. @@ -105,5 +133,5 @@ Capture the compare.py output (terminal transcript or redirected file) and note 1. **Release only**: never compare Debug binaries. 2. **Short hash suffixes**: keep build dirs isolated per revision (example: `bench_`). 3. **Same host, same conditions**: do not compare across different machines or power profiles. -4. **Filter for focus**: narrow `--benchmark_filter` to the changed area. +4. **Filter from analysis**: use `benchmarks-affected` output instead of hand-crafted filters whenever possible. 5. **Pin frequency**: for stable numbers, follow benchmark skill guidance on CPU governor.