Add experiment progress and output logging #25

Draft
Copilot wants to merge 2 commits into main from copilot/update-agent-eval-experiment-output

Copilot AI commented May 22, 2026

Experiment runs now report queue progress, capture command output as artifacts, and no longer write run output directly to stdout unless it is requested with --show-output. Error-level run logs are always surfaced to aid debugging.

  • Progress events

    • Report completed, running, remaining, and total treatment counts while experiments run.
  • Output hooks

    • Add sandbox command output hooks so callers can observe stdout/stderr without implicit process writes.
    • Route treatment logs through runner events with info, error, and raw output event types.
  • Log artifacts

    • Persist per-treatment command output to:
      • stdout.log
      • stderr.log
  • CLI output control

    • Add --show-output to mirror command output to stdout.
    • Prefix mirrored output with experiment, treatment, eval, model, and stream context.
    • Always emit error logs regardless of --show-output.

agent-eval --experiment mcp --concurrency 2 --show-output
# [MCP | Control | 001-agent-uses-button-from-primer | gpt-5.5 | stdout] ...
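
The behavior described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the type names, field names, and the functions `formatPrefix`, `renderEvent`, and `summarizeProgress` are hypothetical. The treatment of info events, the "error" label in the prefix of error logs, and the definition of "remaining" as treatments not yet started are likewise assumptions.

```typescript
// Hypothetical shapes: the PR names the event kinds (info, error, raw output)
// but does not show their fields, so everything below is an assumption.
type Stream = "stdout" | "stderr";

interface RunContext {
  experiment: string;
  treatment: string;
  evalName: string;
  model: string;
}

type RunnerEvent =
  | { kind: "info"; context: RunContext; message: string }
  | { kind: "error"; context: RunContext; message: string }
  | { kind: "output"; context: RunContext; stream: Stream; chunk: string };

interface Progress {
  completed: number;
  running: number;
  remaining: number;
  total: number;
}

// Builds the prefix in the order the PR lists:
// experiment, treatment, eval, model, stream.
function formatPrefix(ctx: RunContext, label: string): string {
  const parts = [ctx.experiment, ctx.treatment, ctx.evalName, ctx.model, label];
  return `[${parts.join(" | ")}]`;
}

// Returns the lines the CLI would print for a single event.
// Errors are always returned; raw output is returned only when showOutput is set.
function renderEvent(event: RunnerEvent, showOutput: boolean): string[] {
  switch (event.kind) {
    case "error":
      // Assumption: error logs reuse the prefix with an "error" label.
      return [`${formatPrefix(event.context, "error")} ${event.message}`];
    case "output":
      if (!showOutput) return [];
      return event.chunk
        .split("\n")
        .filter((line) => line.length > 0)
        .map((line) => `${formatPrefix(event.context, event.stream)} ${line}`);
    case "info":
      // Assumption: info events are not mirrored to stdout in this sketch.
      return [];
  }
}

// Assumption: "remaining" counts treatments that have not started yet.
function summarizeProgress(total: number, completed: number, running: number): Progress {
  return { completed, running, remaining: total - completed - running, total };
}
```

With a context of `MCP`, `Control`, `001-agent-uses-button-from-primer`, and `gpt-5.5`, `formatPrefix(ctx, "stdout")` yields the same prefix shape as the sample line above. Keeping the rendering logic a pure function that returns lines, rather than writing to the process streams itself, matches the PR's stated goal of avoiding implicit process writes.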
