8 changes: 8 additions & 0 deletions bootstrap.md
@@ -350,6 +350,14 @@ If no naming mechanism is available, skip session naming.
for a code review of C code, suggest adding the `memory-safety-c` protocol.
- **Suggest taxonomies** when the task involves classification. For example,
if investigating stack corruption, suggest the `stack-lifetime-hazards` taxonomy.
- **Evaluate taxonomy relevance** before including template-declared
taxonomies. If a template declares a default taxonomy that is clearly
irrelevant to the user's specific investigation (e.g., `stack-lifetime-hazards`
for a power trace analysis, or a CWE taxonomy for a non-security task),
ask the user whether to include it. Omit irrelevant taxonomies rather
than wasting context window on classification schemes that do not apply.
When omitting a template-declared taxonomy, note the omission briefly
so the user understands the deviation from defaults.
- **Ask for the audit domain** when the selected template is
`investigate-security`, `review-code`, `review-cpp-code`, or
`exhaustive-bug-hunt`. The library includes CWE-derived per-domain
5 changes: 5 additions & 0 deletions formats/investigation-report.md
@@ -27,6 +27,11 @@ Before writing the report, **enumerate and classify all findings first**

If the invoking template or workflow explicitly requires the full
9-section structure, use the full format regardless of finding count.
Templates that include a "use the full investigation report format"
instruction (e.g., `investigate-bug`, `investigate-trace`) always
require the full format because the causal chain, prevention, and open
questions sections contain the most actionable content for root cause
investigations.

## Abbreviated Format

12 changes: 12 additions & 0 deletions manifest.yaml
@@ -1333,6 +1333,18 @@ templates:
taxonomies: [stack-lifetime-hazards]
format: investigation-report

- name: investigate-trace
path: templates/investigate-trace.md
description: >
Investigate a performance, power, or behavioral issue using
profiling traces, ETW/ETL captures, or telemetry data. Apply
root cause analysis with iterative deepening, call stack
analysis, energy-vs-metric divergence, and cross-process
amplification detection.
persona: systems-engineer
protocols: [anti-hallucination, self-verification, operational-constraints, root-cause-analysis]
format: investigation-report

- name: find-and-fix-bugs
path: templates/find-and-fix-bugs.md
description: >
9 changes: 9 additions & 0 deletions protocols/guardrails/anti-hallucination.md
@@ -28,6 +28,15 @@ Every claim in your output MUST be categorized as one of:
- **ASSUMED**: Not established by context. The assumption MUST be flagged
with `[ASSUMPTION]` and a justification for why it is reasonable.

**Data-driven tasks**: When the source data is authoritative machine
telemetry or tool output (e.g., profiler results, trace queries, compiler
diagnostics, monitoring metrics), direct observations and measurements
reported by the tool have implicit KNOWN status and do not require explicit
`[KNOWN]` labels. However, **causal explanations**, **inferred
correlations**, and **interpretations** of that data retain full labeling
requirements — these are INFERRED or ASSUMED claims even when derived
from authoritative measurements.

When the number of claims categorized as ASSUMED exceeds 30% of the total
number of categorized claims in your output, stop and request
additional context instead of proceeding.
11 changes: 11 additions & 0 deletions protocols/guardrails/operational-constraints.md
@@ -29,6 +29,12 @@ creep, non-reproducible analysis, and context window exhaustion.
exhaustive or comprehensive review, you may exceed 50 files but only
in batches of at most 50 files, with a summary after each batch
before continuing.
- **For trace, telemetry, or log analysis**: the equivalent scoping
constraint is data categories and time ranges, not file counts. Before
querying, identify which data categories (e.g., CPU sampling, disk I/O,
energy estimation, network activity) and which time ranges are relevant.
Do NOT process all available categories or the full trace duration
without first establishing which subset matters.
- Before reading code or data, establish your **search strategy**:
- What directories, files, or patterns are likely relevant?
- What naming conventions, keywords, or symbols should guide search?
@@ -64,6 +70,11 @@ Use a funnel approach:
- Summarize intermediate findings as you go.
- Prefer reading specific functions over entire files.
- Use search tools (grep, find, symbol lookup) before reading files.
- **For structured data sources** (trace queries, database results, API
responses): limit query result volume to what is needed for the current
analysis layer. Retrieve summary/aggregated data first, then drill into
detail only for top contributors. Do NOT retrieve full detail for all
items in a single query.
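
  A minimal sketch of this summary-first funnel, assuming a generic `query`
  callable as a stand-in for whatever structured query interface the analysis
  tool exposes (the table and column names are illustrative, not a real API):

  ```python
  # Minimal sketch of the summary-first funnel. `query` is a hypothetical
  # stand-in for the tool's structured query interface; table and column
  # names are illustrative only.

  def summarize_then_drill(query, top_n=5):
      # Layer 1: aggregated totals per process; cheap, small result set.
      summary = query("cpu_samples", group_by="process")
      top = sorted(summary, key=lambda row: row["sample_count"], reverse=True)[:top_n]

      # Layer 2: full detail only for the top contributors, never for everything.
      details = {
          row["process"]: query(
              "cpu_samples",
              group_by="call_stack",
              where={"process": row["process"]},
          )
          for row in top
      }
      return top, details
  ```

  The expensive, detailed query runs only for the handful of rows that
  survived the cheap aggregated one.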

### 5. Tool Usage Discipline

52 changes: 52 additions & 0 deletions protocols/reasoning/root-cause-analysis.md
@@ -10,6 +10,7 @@ description: >
and elimination. Language-agnostic.
applicable_to:
- investigate-bug
- investigate-trace
- root-cause-ci-failure
---

Expand Down Expand Up @@ -62,6 +63,34 @@ For each hypothesis, starting with the most plausible:
- **ELIMINATED**: Evidence directly contradicts it.
- **INCONCLUSIVE**: Evidence is insufficient; state what is needed.

## Phase 3a: Iterative Deepening

Investigation MUST proceed in layers of increasing resolution. Each layer
informs the next — do NOT skip layers or jump directly to deep analysis.

1. **Broad survey**: Identify top contributors at the coarsest granularity
(e.g., by process, module, subsystem, or component). Rank by impact.
2. **Attribution**: For the top 5–10 contributors, break down by the next
level of detail (e.g., by module within a process, by function within
a module, by allocation site within a function).
3. **Deep analysis**: For the top contributors at the attribution level,
obtain the most detailed evidence available (e.g., call stacks, data
flow traces, lock contention chains, allocation histories). Call stacks
and execution traces reveal *why* something is happening — module-level
data only reveals *where*.
4. **Cross-component tracing**: Identify causal chains that span component
or process boundaries (see Phase 4a).

Do NOT write the final report until layer 3 is complete for the top
contributors (up to 5), using the most detailed evidence available.
If fewer than 5 contributors exist, analyze all of them. If available
evidence does not support layer-3 completion for some contributors, you
MAY proceed to the final report only if you explicitly document the
limitation, identify which contributors remain inconclusive, and state
what additional evidence would be needed. Premature reporting without
this disclosure produces surface-level findings that miss the actual
root cause.
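
A minimal sketch of the layered loop, assuming hypothetical `survey`,
`attribute`, and `deep_dive` callables that wrap trace or code queries at
increasing resolution (names and record shapes are illustrative only):

```python
# Minimal sketch of iterative deepening. `survey`, `attribute`, and
# `deep_dive` are hypothetical callables; record shapes are illustrative.

def iterative_deepening(survey, attribute, deep_dive, max_contributors=5):
    findings = []

    # Layer 1: broad survey, ranking coarse contributors by impact.
    contributors = sorted(survey(), key=lambda c: c["impact"], reverse=True)

    for contributor in contributors[:max_contributors]:
        # Layer 2: attribution, breaking the contributor down one level
        # (e.g., functions within a module). Assumes a ranked breakdown.
        breakdown = attribute(contributor)

        # Layer 3: deep analysis, the most detailed evidence available
        # (e.g., call stacks) for the dominant items in the breakdown.
        evidence = [deep_dive(item) for item in breakdown[:10]]

        findings.append({
            "contributor": contributor,
            "breakdown": breakdown,
            "evidence": evidence,
            # Record gaps explicitly so the report can disclose them.
            "inconclusive": not evidence,
        })

    return findings
```

Each layer consumes the ranking produced by the previous one, so detailed
evidence is only ever requested for items that earned it.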

## Phase 4: Root Cause Identification

1. Distinguish between the **root cause** (fundamental defect) and the
@@ -73,6 +102,29 @@ For each hypothesis, starting with the most plausible:
3. Ask: "If we fix only the proximate cause, will the root cause
produce other failures?" If yes, the fix is incomplete.

## Phase 4a: Cross-Component Causal Chains

When the investigation involves multiple components, processes, or
subsystems, trace causal chains across boundaries:

1. **Identify trigger-response pairs**: Does activity in component A
cause work in component B? For example, a file write by one process
may trigger scanning by an antivirus service, which triggers hashing
by an EDR agent, which triggers network inspection by another service.
2. **Map the amplification cascade**: A single action may fan out into
disproportionate downstream work. Document the full chain:
`Trigger → Reactor₁ → Reactor₂ → ... → Observed symptom`.
3. **Quantify amplification**: For each link in the chain, estimate the
cost ratio (e.g., "1 file write triggers 3 scan operations, each
consuming 50ms of CPU"). The amplification factor often explains why
a seemingly minor activity produces outsized impact.
4. **Identify the leverage point**: The most effective fix targets the
link in the chain with the highest amplification factor, not
necessarily the initial trigger or the final symptom.

Skip this phase when the investigation is confined to a single component
with no cross-boundary interactions.
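
A worked sketch of the quantification step; the chain, fan-out counts, and
per-operation costs below are illustrative placeholders rather than measured
values:

```python
# Worked sketch: quantifying an amplification cascade. All numbers are
# illustrative placeholders, not measurements.

chain = [
    # (link, downstream ops per upstream op, CPU ms per op)
    ("file write -> AV scan",          3, 50.0),
    ("AV scan -> EDR hashing",         1, 20.0),
    ("EDR hashing -> net inspection",  1, 10.0),
]

trigger_cost_ms = 0.5   # direct cost of the original file write
fan_out = 1.0
downstream_ms = 0.0

for link, ops_per_upstream, ms_per_op in chain:
    fan_out *= ops_per_upstream          # cumulative fan-out at this link
    link_cost = fan_out * ms_per_op
    downstream_ms += link_cost
    print(f"{link}: x{fan_out:.0f}, {link_cost:.1f} ms per trigger")

amplification = (trigger_cost_ms + downstream_ms) / trigger_cost_ms
print(f"~{amplification:.0f}x amplification per trigger")
# The leverage point is the link contributing the most downstream work
# (here the first link: 3 scans at 50 ms each).
```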

## Phase 5: Remediation

1. Propose a fix for the **root cause**, not just the symptom.
4 changes: 4 additions & 0 deletions templates/investigate-bug.md
@@ -85,6 +85,10 @@ and producing a structured investigation report.
- Identify tests that would have caught this bug
- Suggest defensive measures to prevent recurrence

8. **Use the full investigation report format**. Root cause investigation
requires the causal chain, prevention, and open questions sections — do
not use the abbreviated format.

## Non-Goals

Explicitly define what is OUT OF SCOPE for this investigation.
218 changes: 218 additions & 0 deletions templates/investigate-trace.md
@@ -0,0 +1,218 @@
<!-- SPDX-License-Identifier: MIT -->
<!-- Copyright (c) PromptKit Contributors -->

---
name: investigate-trace
description: >
Systematically investigate a performance, power, or behavioral issue
using profiling traces, ETW/ETL captures, or telemetry data. Apply
root cause analysis with iterative deepening and produce an
investigation report.
persona: systems-engineer
protocols:
- guardrails/anti-hallucination
- guardrails/self-verification
- guardrails/operational-constraints
- reasoning/root-cause-analysis
format: investigation-report
params:
problem_description: "Natural language description of the issue under investigation"
trace_context: "Trace capture method, providers/profiles used, and analysis tool capabilities"
environment: "OS, hardware, workload scenario, and capture conditions"
input_contract: null
output_contract:
type: investigation-report
description: >
A structured investigation report with findings, root cause analysis,
evidence from trace data, and remediation plan.
---

# Task: Investigate Trace

You are tasked with investigating a performance, power, or behavioral issue
using profiling trace data and producing a structured investigation report.

## Inputs

**Problem Description**:
{{problem_description}}

**Trace / Telemetry Context**:
{{trace_context}}

**Environment**:
{{environment}}

## Instructions

1. **Apply the root-cause-analysis protocol** systematically:
- Characterize the symptom precisely
- Generate 3–5 competing hypotheses before investigating any
- Evaluate evidence for each hypothesis
- Apply iterative deepening (Phase 3a): broad survey → attribution →
deep analysis → cross-component tracing
- Apply cross-component causal chain analysis (Phase 4a) when
multiple processes or components are involved
- Identify the root cause, not just the proximate trigger

2. **Apply the anti-hallucination protocol** throughout:
- Base analysis ONLY on the provided trace data and context
- Direct observations from trace queries (metrics, measurements,
counters) have implicit KNOWN status
- Causal explanations and correlations MUST be explicitly labeled
as INFERRED or [ASSUMPTION]
- If you cannot determine the root cause from the available data,
say so and describe exactly what additional traces or data
categories are needed
- Do NOT fabricate process names, PIDs, metric values, or trace
events that are not evidenced in the provided data

3. **Format the output** according to the investigation-report format
specification. **Use the full investigation report format** (all
sections). Root cause investigation requires the causal chain,
prevention, and open questions sections — do not use the abbreviated
format.

4. **Call stack analysis is primary** — not optional:
- For each top contributor identified in the broad survey, obtain
call stacks grouped by process and thread
- Identify the dominant call chains — these reveal the actual
workload (e.g., file scanning vs. idle polling vs. network
inspection vs. background sync)
- Module-level attribution only tells you *where* — call stacks
tell you *why*. Do NOT stop at module-level attribution.
- When call stacks are unavailable, state this as a limitation and
describe what the stacks would have revealed

5. **Energy-vs-metric divergence analysis**:
- Compare each process's CPU sample percentage against its energy
estimation percentage (or equivalent resource metric)
- Processes with disproportionately high energy relative to CPU
time indicate frequent wake/sleep patterns that prevent deep
idle states — these are often worse for battery life than
processes with high sustained CPU
- Flag any process where the energy-to-CPU ratio exceeds 3:1 as
a high-priority finding. When CPU% is below 1%, do not rely on
the ratio alone — only elevate to high priority when energy%
is also significant (≥ 3%); otherwise note it as a
low-confidence anomaly (see the flagging sketch after this instruction list)

6. **Cross-process amplification analysis**:
- Analyze whether background processes amplify each other's impact
- A file write by Process A may trigger scans by Process B,
hashing by Process C, and network inspection by Process D
- Trace these causal chains across process boundaries
- Document the full amplification cascade:
`Trigger → Reactor₁ → Reactor₂ → ... → Observed symptom`
- This "amplification cascade" is often the true root cause of
death-by-a-thousand-cuts performance or power drain

7. **Apply the self-verification protocol** before finalizing:
- Sample at least 3–5 specific findings and re-verify against
the trace data
- Ensure every causal claim is labeled INFERRED or [ASSUMPTION]
- Confirm coverage: state what data categories were examined and
what was not

8. **Apply the operational-constraints protocol** when working with
the trace:
- Scope by data categories and time ranges before querying
- Prefer deterministic methods (structured queries, aggregations)
- Document your query strategy for reproducibility
- Retrieve summary data first, drill into detail only for top
contributors

9. **Remediation must be specific**:
- Provide concrete fix recommendations (e.g., specific registry
keys, power settings, driver configuration, scheduled task
changes, service configuration, `powercfg` commands), not
vague advice
- Assess the risk of each proposed fix
- Identify monitoring or alerting that would have caught this
earlier
- Suggest defensive measures to prevent recurrence
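
The divergence rule in instruction 5 can be checked mechanically once
per-process CPU and energy percentages are available from the trace summary.
A minimal sketch, with illustrative values rather than measurements:

```python
# Minimal sketch of the flagging rule; thresholds mirror instruction 5.
RATIO_THRESHOLD = 3.0    # energy% / CPU% above this is disproportionate
MIN_ENERGY_PCT = 3.0     # energy% must be at least this when CPU% < 1

def classify_divergence(cpu_pct, energy_pct):
    ratio = energy_pct / cpu_pct if cpu_pct > 0 else float("inf")
    if ratio <= RATIO_THRESHOLD:
        return "normal"
    # Ratio exceeds 3:1; decide how much to trust it at very low CPU%.
    if cpu_pct < 1.0 and energy_pct < MIN_ENERGY_PCT:
        return "low-confidence anomaly"
    return "high-priority finding"

# Illustrative values, not measurements:
print(classify_divergence(12.0, 15.0))  # normal (ratio 1.25)
print(classify_divergence(0.8, 5.0))    # high-priority finding
print(classify_divergence(0.1, 1.2))    # low-confidence anomaly
```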

## Analysis Steps

Process the trace systematically using iterative deepening:

1. **Process the trace** with relevant data categories (e.g., CPU
sampling, energy estimation, disk I/O, processor frequency,
interrupt handling, processor idle states, device power state,
process metadata, services)
2. **Broad survey**: Query top consumers by primary metric (CPU
samples, energy estimation, disk I/O bytes) grouped by process.
Rank by impact.
3. **Call stack analysis**: For the top 5–10 consumers, obtain call
stacks. Identify dominant call chains to understand *what* each
process was actually doing.
4. **Divergence check**: Compare CPU percentage vs. energy percentage
for each top consumer. Flag disproportionate energy consumers.
5. **Cross-process tracing**: Identify amplification cascades where
one process's activity triggers work in others.
6. **Supplementary analysis**: Check for:
- Timer resolution requests preventing deep idle states
- Interrupt/DPC activity and wake sources
- Disk I/O patterns during expected-idle periods
- Power state transitions and frequency scaling
- Background service and scheduled task activity
- Network-related wake events
7. **Synthesize**: Combine all layers into a coherent root cause
analysis with causal chains.

## Non-Goals

Explicitly define what is OUT OF SCOPE for this investigation.
State each non-goal clearly so the investigation does not expand
beyond its intended boundaries. Examples:

- Do NOT investigate application-level bugs in the processes found —
only identify them as contributors and recommend actions.
- Do NOT attempt to modify system configuration directly — only
recommend changes.
- Do NOT investigate hardware defects (e.g., battery health,
component failures).

Adjust these non-goals based on the specific investigation context
provided in {{problem_description}}.

## Investigation Plan

Before beginning analysis, produce a concrete step-by-step plan
tailored to this specific investigation. The plan should:

1. **Identify data categories**: Which trace data categories are
relevant to this investigation?
2. **Define time ranges**: What time periods are relevant (idle
periods, workload periods, transitions)?
3. **Enumerate metrics**: What metrics will be queried at each
iterative deepening layer?
4. **Plan cross-process analysis**: Which processes are likely
to interact, and what causal chains should be checked?
5. **Report**: Produce the output according to the specified format.

This plan replaces ad-hoc exploration with systematic analysis.

## Quality Checklist

Before finalizing, verify:

- [ ] Every finding cites specific evidence from the trace (process
name, PID, metric values, timestamps, and call stacks when
available; if stack data is unavailable, document what would
be needed to obtain it)
- [ ] Every finding has a severity rating with justification
- [ ] Root cause is identified, not just the proximate trigger
- [ ] Iterative deepening completed: broad survey → module → stack →
cross-process for the top contributors (up to 5), limited by
available meaningful contributors and stack data
- [ ] Energy-vs-CPU divergence checked for top consumers where both
energy and CPU data are available
- [ ] Cross-process amplification cascades documented where present
- [ ] Remediation recommendations are specific and actionable
- [ ] At least 3 findings have been re-verified against the trace data
- [ ] Coverage statement documents what data categories were and were
not examined, including any limitation where fewer than 5
contributors were analyzable or stack/energy data was unavailable
- [ ] No fabricated process names, PIDs, or metric values — unknowns
marked with [UNKNOWN: <what is missing>]