Fix grader compatibility with OpenClaw transcripts#86

Open
jijivski wants to merge 1 commit into pinchbench:main from jijivski:fix/openclaw-transcript-compat-v2
Conversation

@jijivski

@jijivski jijivski commented Apr 1, 2026

Improve grader compatibility with current OpenClaw transcripts

The grader currently assumes a narrower transcript format than the one produced by the current OpenClaw runtime, which can lead to false negatives.

Changes:

  • read tool inputs from toolCall.arguments
  • support file alongside path / file_path
  • improve judge score parsing robustness

These changes do not alter task requirements; they only make grading align with real transcript output.
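The first two changes above can be sketched as a single lookup helper. This is a minimal illustration, not the actual patch: the function name `extract_file_arg` and the `input` fallback key are assumptions, but the key-preference order (`file_path`, `path`, then `file`) matches the behavior described.

```python
def extract_file_arg(tool_call: dict):
    """Read tool inputs from toolCall.arguments, accepting 'file'
    alongside 'path'/'file_path' (hypothetical sketch, not the PR code)."""
    # OpenClaw emits inputs under "arguments"; "input" is an assumed fallback
    args = tool_call.get("arguments") or tool_call.get("input") or {}
    for key in ("file_path", "path", "file"):
        if key in args:
            return args[key]
    return None
```

A transcript event like `{"arguments": {"file": "src/app.py"}}` would then grade the same as one using `file_path`.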


@ScuttleBot ScuttleBot left a comment


ScuttleBot review 🦀

Solid defensive fix. The grader was too rigid about transcript formats, causing false negatives on valid runs.

What's good:

  • _coerce_score_value() handles the full zoo of judge response formats (nested dicts, string numbers, boolean rejection)
  • Supporting file alongside path/file_path aligns with how OpenClaw actually emits tool calls
  • The refactor into _extract_named_scores() and _extract_total_score() is cleaner than the previous inline conditionals
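The coercion behavior praised above could look roughly like the following. This is a hedged sketch of what "nested dicts, string numbers, boolean rejection" implies, not the PR's actual `_coerce_score_value()`; the recursion key `"score"` is an assumption.

```python
def coerce_score_value(value):
    """Sketch: coerce a judge-response value to a float score, or None.
    Handles nested dicts, numeric strings, and rejects booleans."""
    # Check bool first: in Python, bool is a subclass of int, so True
    # would otherwise slip through the numeric branch as 1.0.
    if isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        try:
            return float(value.strip())
        except ValueError:
            return None
    if isinstance(value, dict):
        # Assumed convention: recurse into a nested "score" key
        return coerce_score_value(value.get("score"))
    return None
```

The bool-before-int ordering is the subtle part: without it, a judge replying `true` would be scored as `1.0` instead of being rejected.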

One question:

  • Task file changes (task_08, task_10, task_18) — are these tested against transcripts from multiple agents? The file param support looks correct but I want to confirm this doesn't break Cursor/Windsurf/Claude Code grading.

Otherwise LGTM. This will reduce the "score 0 but the agent clearly did the work" cases.

@ScuttleBot

Merge conflict resolution available

I've rebased this PR onto main and resolved the conflict in lib_grading.py. The conflict was between the new _parse_judge_text() function (added in main via #87) and the helper functions in this PR (_coerce_score_value, _extract_named_scores, _extract_total_score).

Resolution: Keep both — _parse_judge_text() first, then the helper functions. Both are needed.

@jijivski — could you rebase your branch onto main? The resolution is straightforward:

git fetch upstream
git rebase upstream/main
# Resolve lib_grading.py by keeping both function sets
git add scripts/lib_grading.py
git rebase --continue
git push --force-with-lease

Alternatively, @olearycrew has admin access and can use GitHub's "Update branch" button if the repo allows maintainer edits on this PR.

@olearycrew
Member

@jijivski can you take a look at the conflicts here?
