Add GWS-powered tasks using fws mock server #172

Open
juppytt wants to merge 1 commit into pinchbench:main from juppytt:fws-gws-tasks
juppytt commented Apr 8, 2026

Summary

Add 4 tasks that run the gws and gh CLIs against fws, a local mock server. This enables realistic Google Workspace and GitHub testing without OAuth.

GWS tasks:

  • task_26_gws_email_triage: List unread emails, read them, draft a reply, write a triage report
  • task_27_gws_cross_service: Read an email, create a calendar event, share a Drive document
  • task_28_gws_task_management: Review task list, read emails for action items, create new tasks

GitHub tasks:

  • task_29_gh_issue_triage: List issues/PRs, read them, comment on the most critical, write a triage report

Runner integration auto-starts/stops fws for tasks with category: gws or category: github.
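
For reviewers who want a picture of the start/stop behaviour described above, here is a minimal sketch of how a runner could wrap a task in a server lifecycle. It is not the code in scripts/lib_fws.py. The names `needs_fws`, `mock_server`, and `STAND_IN_CMD` are hypothetical, and the actual fws start command is deliberately left as a parameter because this description does not state it.

```python
import contextlib
import subprocess
import sys

# Categories that need the mock server, as stated in the PR description.
FWS_CATEGORIES = {"gws", "github"}

# Placeholder process for demonstration only; this is NOT the fws command.
STAND_IN_CMD = [sys.executable, "-c", "import time; time.sleep(30)"]


def needs_fws(task_category):
    """Return True when a task's category calls for the mock server."""
    return task_category in FWS_CATEGORIES


@contextlib.contextmanager
def mock_server(task_category, start_cmd):
    """Keep start_cmd running for the duration of a task, if needed.

    start_cmd is a hypothetical argument list supplied by the caller.
    Yields the running process, or None when no server is required.
    """
    if not needs_fws(task_category):
        yield None
        return
    proc = subprocess.Popen(start_cmd)
    try:
        yield proc
    finally:
        # Stop the server even if the task body raised an exception.
        proc.terminate()
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
```

A context manager is used so the server is stopped on every exit path, including task failures. A call such as `with mock_server("gws", STAND_IN_CMD): ...` would hold the placeholder process open only while the task body runs.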

Setup

npm install -g @googleworkspace/cli   # gws CLI
npm install -g @juppytt/fws           # fws mock server
# gh CLI assumed already installed

Changes

  • tasks/task_26_gws_email_triage.md - new task
  • tasks/task_27_gws_cross_service.md - new task
  • tasks/task_28_gws_task_management.md - new task
  • tasks/task_29_gh_issue_triage.md - new task
  • scripts/lib_fws.py - fws lifecycle management (start/stop/env)
  • scripts/lib_agent.py - auto-start fws for gws/github tasks, fix transcript parsing for toolCall/exec format, fix max_completion_tokens for OpenAI judge

Test plan

  • Ran task_26 with openai/gpt-5.4-mini, agent used gws CLI correctly
  • fws server starts/stops correctly around task execution
  • Grading correctly detects gws/gh commands in transcript
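
As an illustration of the last check, the sketch below shows one way grading could detect gws/gh usage in a transcript. The transcript shape is an assumption, not the format lib_agent.py actually parses: each entry is taken to be a dict, with tool calls carrying `type: "toolCall"`, `name: "exec"`, and the shell command under `arguments.command`. The function name `commands_used` is hypothetical.

```python
import shlex


def commands_used(transcript, clis=("gws", "gh")):
    """Return the set of CLIs from `clis` invoked by exec tool calls.

    The entry layout (type/name/arguments.command) is assumed for this
    sketch and may differ from the real transcript format.
    """
    found = set()
    for entry in transcript:
        if entry.get("type") != "toolCall" or entry.get("name") != "exec":
            continue
        command = entry.get("arguments", {}).get("command", "")
        try:
            tokens = shlex.split(command)
        except ValueError:
            # Unbalanced quotes: fall back to plain whitespace splitting.
            tokens = command.split()
        # Only the first token counts, so "echo gh" is not a gh invocation.
        if tokens and tokens[0] in clis:
            found.add(tokens[0])
    return found
```

Matching on the first token keeps false positives low, but it misses commands that start with an environment assignment or appear later in a pipeline. A real grader might need to handle those cases.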

Ref: #119

GWS tasks (using gws CLI):
- task_26: Email Triage (list, read, draft reply, write report)
- task_27: Cross-Service Workflow (email -> calendar event -> drive share)
- task_28: Task Management (read emails, extract action items, create tasks)

GitHub tasks (using gh CLI):
- task_29: GitHub Issue Triage (list issues/PRs, comment, write report)

Runner integration:
- lib_fws.py: start/stop fws server for category=gws and category=github
- lib_agent.py: auto-start fws, fix transcript parsing, fix max_completion_tokens

Install: npm install -g @juppytt/fws (also requires gws and gh CLIs)

Ref: pinchbench#119