review: anonymize author identity, calibrate severity, surface subprocess errors#232
review: anonymize author identity, calibrate severity, surface subprocess errors#232ftabba wants to merge 3 commits into
Conversation
A commit message carries author-reputation cues (the From/author header
and Signed-off-by, Reviewed-by, Acked-by and similar trailers) that name
well-known contributors. A language-model reviewer can defer to those
cues ("X wrote this, so it is probably correct") and soften its
analysis, which is the opposite of what review should do.
Add a `[review] anonymize_authors` flag, defaulting on. When enabled,
the reviewer's view of each patch is stripped of author-identity cues
before it reaches the model:
- the author field is dropped,
- identity trailer lines (From, Author, Signed-off-by, Reviewed-by,
Acked-by, Tested-by, Co-developed-by, Co-authored-by, Reported-by,
Suggested-by, Cc) are removed from the commit-message region of the
diff, the `git show` text, and the full commit message,
- e-mail addresses in that region are redacted.
The strip acts only on the commit-message region: everything from the
first `diff --git` line onward is left byte-for-byte unchanged, so the
code the model reads and quotes is identical whether the flag is on or
off. It also acts only on the copy assembled for the model. The stored
patch, the patch applied to the worktree, e-mail routing and MAINTAINERS
handling (which use the parsed author on a separate path), and the
public review report are all unaffected.
The flag defaults on, so a reputation-blind review is the out-of-the-box
behaviour and an operator opts back into showing identity by setting
`anonymize_authors = false`.
Assisted-by: Antigravity:gemini-3.1-pro
Signed-off-by: Fuad Tabba <tabba@google.com>
The reviewer collapses every finding onto a single Critical/High/Medium/
Low label in one step, with only a free-form reasoning string to justify
it. In practice this over-states severity: a finding gets Critical for a
bad-sounding consequence even when the code is barely reachable or the
reviewer is only speculating that the triggering condition can occur.
"How bad if true", "how reachable", and "how sure" all collapse into the
one label, and the first tends to dominate.
Make the reasoning explicit before the label is chosen. severity.md now
asks the reviewer to state three independent axes and then collapse them:
- Reachability: unprivileged / privileged / guest /
internal / unreachable.
- Consequence: crash / data-corruption / info-leak / perf /
commit-msg-only / style.
- Confidence: low / medium / high (high requires a concrete
reproducible path with every precondition named).
with collapse rules that cap the inflation directly: an unreachable or
commit-msg-only/style finding is at most Low however bad the consequence
would be, low confidence caps the label at Medium, and Critical/High
needs a damaging consequence AND real reachability AND medium-or-better
confidence together. The Stage 10 prompt requires the three axis values
and the collapse to be recorded at the start of the existing
severity_explanation field.
This is a pure prompt/rubric change. The four labels, the output schema,
and the public report are unchanged: the axes live as reasoning inside
the existing explanation field, so there is no migration and no new
output. The goal is a more honest single label, not a new label set.
Assisted-by: Antigravity:gemini-3.1-pro
Signed-off-by: Fuad Tabba <tabba@google.com>
`sashiko-cli local` runs the review in a subprocess and streams its stderr only to drive the four-phase progress display, discarding every other line. When the subprocess failed before producing a result, the user saw only "Review subprocess produced no output (exit code: N)" with the actual cause (a stage error, an auth or quota failure, a panic) silently dropped. Keep a bounded tail (the last 100 lines) of the subprocess's stderr while rendering progress, and replay it on the no-output failure path. Successful runs are unchanged (clean progress), while failures now explain themselves. Assisted-by: Antigravity:gemini-3.1-pro Signed-off-by: Fuad Tabba <tabba@google.com>
8eae9e3 to
09861b7
Compare
|
Have you considered that LLMs may be able to easily circumvent identity anonymization? They're pretty good at guessing. They can now correctly answer multiple-choice questions without even being given the question. My anecdata: I randomly took one of Christoph Hellwig's patches, stripped all author references, and asked Gemini Flash 3.5 to guess the author without searching the web. It guessed correctly.Below is a patch from the Linux kernel mailing list. determine who is the most likely author. DO NOT SEARCH THE WEB TO FIND THE ANSWERBased on the code, architectural goals, and specific phrasing in the commit message, the most likely author of this patch is Christoph Hellwig (often known in the kernel community as hch). Here is a breakdown of the clues that point to him: Subsystem Boundary Enforcement: The primary goal of the patch is architectural cleanup—hiding internal Memory Management (mm/) structures (swap_iocb) from the Direct Rendering Manager (drm/) drivers. Christoph Hellwig is highly prolific and well-known for enforcing strict subsystem boundaries and cleaning up cross-subsystem code. The Phrase "shmem abuses in drm": Calling out driver or subsystem "abuses" (especially when graphics drivers bend core MM or block-layer rules) is a trademark style of his direct, no-nonsense commit messages. Swap I/O Refactoring: He has been heavily involved in rewriting and refactoring the Linux kernel's swap I/O path, block layer integrations, and removing legacy or leaking interfaces. The Typos: The short, blunt description containing minor typos ("preparare", "keepin") is characteristic of his fast-paced, high-volume patch submissions. |
I did, but still I think this adds value. Shorter commits are more difficult to distinguish, and it would require some active work on part of the LLM (which might not be a good thing :) ). |
|
Thank you, Fuad, for the pull request! First, please, don't mix orthogonal changes into a single PR.
Thanks! |
Sure.
Sure, I'll drop it.
Confidence axis: Yes, LLM self-assessed confidence is shaky. Here it's just a cap rule (low confidence can't produce Critical/High), not a primary signal. If it turns out models just default to "medium" for everything, the reachability axis alone still does most of the work. FP rate: The rubric doesn't dismiss findings after the fact. The mechanism is upstream: forcing the model to reason about reachability and confidence before committing to a finding causes it to self-filter speculative ones earlier. That said, the improvement (0.54 → 0.44) is across 2 runs on one 40-commit series, so I can't cleanly separate the rubric effect from variance. Distribution: I phrased that badly. Not aiming for even distribution. The baseline just labeled nearly everything Critical/High, including dead-code observations and commit-message nitpicks. The rubric makes the labels more accurate, not more uniform. Medium/Low FP: Agree, that's where the noise is. The reachability axis targets exactly this: making the model state "who can actually reach this" before picking a label tends to deflate the speculative stuff. If the confidence axis is a concern, I can drop it and keep just reachability + consequence. Reachability is the axis with the clearest signal, and a simpler two-axis rubric is easier to validate.
Will do.
|
Three improvements to the review pipeline: author anonymization, severity
calibration, and better error visibility on subprocess failures.
Changes
1. Anonymize author identity in the model's view (default on)
Strip author-reputation cues (Signed-off-by, Reviewed-by, the author
header, e-mail addresses) from the commit-message region before it
reaches the model. The diff body, e-mail routing, stored patches, and
the public report are all unaffected. Fixes: trailers are deliberately
preserved as they carry technical provenance (SHA + description) the
model needs to trace code history.
Motivation: in separate testing outside this eval set, models
consistently deferred to well-known maintainer names and softened their
analysis compared to the same patch attributed to lesser-known
contributors. The stripping removes only identity cues; nothing
structurally important to the review is lost.
Gated by
[review] anonymize_authors(default true). Set to false torestore the previous behaviour.
2. Calibrate severity through three explicit axes
The reviewer previously collapsed every finding onto a single
Critical/High/Medium/Low label in one step, which in practice
over-stated severity: "how bad if true" dominated over "how reachable"
and "how confident". The new severity.md rubric requires the model to
reason through three independent axes (reachability, consequence,
confidence) before collapsing to a label, with explicit cap rules
(e.g., unreachable findings cap at Low, low-confidence caps at Medium).
Eval results on a 40-commit KVM/arm64 pKVM series with 5 confirmed
ground-truth bugs:
near-uniform Critical/High; calibrated runs use the full range)
Pure prompt/rubric change. The four labels, output schema, and report
format are unchanged.
3. Surface subprocess stderr on failure
sashiko-cli localpreviously swallowed the subprocess's stderr onfailure, showing only "no output (exit code: N)". Now keeps a bounded
100-line tail and replays it on failure so the actual cause (stage
error, auth failure, panic) is visible. Successful runs are unchanged.
Assisted-by: Antigravity:gemini-3.1-pro
Signed-off-by: Fuad Tabba tabba@google.com