Mini Agent Harness for Learning

This repository is a small, working agent harness in Python, built for learning. It keeps dependencies near zero so the moving parts stay visible:

Agent = Model + Tools + Skills + Memory + Permissions + Trace + Context + Eval

It is not trying to replace LangGraph, OpenAI Agents SDK, CrewAI, or Claude Code. It is a small framework for learning how those systems are built.

Features

OpenAI-compatible /v1/chat/completions client with retries, timeouts, redacted errors, provider-compatible tool-call parsing, and mock clients.
Tool registry with calculator, workspace file tools, safer patch-style editing, grep, git status/diff, and permission-gated shell execution.
File-based skills in skills/*.md with metadata, trigger matching, priority, and advisory allowed/risky tool lists.
Context compression that truncates large tool outputs and keeps a simple state summary.
JSONL traces, a trace analyzer, and an eval runner with isolated workspaces.
MCP adapter and multi-agent orchestration skeletons.
Offline lab-report and paper-reading examples for research workflows.
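As an illustration of the context-compression idea in the feature list, here is a minimal sketch of truncating a large tool output. The function name truncate_tool_output, the max_chars parameter, and the head/tail strategy are assumptions made for this example; the actual logic in context.py may differ.

```python
def truncate_tool_output(text: str, max_chars: int = 2000) -> str:
    """Keep the head and tail of an oversized output and note what was dropped.

    Hypothetical helper for illustration; not the harness's real API.
    """
    if max_chars < 2:
        raise ValueError("max_chars must be at least 2")
    if len(text) <= max_chars:
        return text  # Small outputs pass through unchanged.

    head = max_chars // 2          # characters kept from the start
    tail = max_chars - head        # characters kept from the end
    dropped = len(text) - max_chars
    marker = f"\n... [{dropped} characters truncated] ...\n"
    return text[:head] + marker + text[-tail:]


if __name__ == "__main__":
    big_output = "line\n" * 5000
    print(truncate_tool_output(big_output, max_chars=200))
```

Keeping both ends is one reasonable choice because command output often puts the error summary at the bottom and the invocation context at the top.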

Install

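Python 3.10+ is recommended. The commands below use Windows syntax; on macOS or Linux, activate the environment with source .venv/bin/activate instead.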

python -m venv .venv
.venv\Scripts\activate
pip install -e .
pip install pytest

The package itself has no required third-party dependencies.

Configure A Model

Copy the values from .env.example into your shell or environment manager. The examples use Windows cmd syntax; in a POSIX shell, use export NAME=value instead:

set MODEL_API_KEY=your_api_key_here
set MODEL_BASE_URL=https://api.openai.com/v1
set MODEL_NAME=gpt-4.1-mini

Optional tuning:

set MODEL_TIMEOUT_SECONDS=120
set MODEL_MAX_TOKENS=2048
set MODEL_TEMPERATURE=0.2
set MODEL_MAX_RETRIES=2
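The variables above could be read into a typed config object along these lines. This is a sketch under assumptions: ModelConfig and load_model_config are hypothetical names, and the fallback defaults simply reuse the example values shown above rather than the harness's confirmed defaults.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container for the MODEL_* settings."""
    api_key: str
    base_url: str
    model: str
    timeout_seconds: float
    max_tokens: int
    temperature: float
    max_retries: int


def load_model_config(env=None) -> ModelConfig:
    """Build a ModelConfig from a mapping (os.environ when env is None)."""
    env = os.environ if env is None else env
    api_key = env.get("MODEL_API_KEY", "")
    if not api_key:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError("MODEL_API_KEY is not set")
    return ModelConfig(
        api_key=api_key,
        # Defaults below are assumptions taken from the README examples.
        base_url=env.get("MODEL_BASE_URL", "https://api.openai.com/v1").rstrip("/"),
        model=env.get("MODEL_NAME", "gpt-4.1-mini"),
        timeout_seconds=float(env.get("MODEL_TIMEOUT_SECONDS", "120")),
        max_tokens=int(env.get("MODEL_MAX_TOKENS", "2048")),
        temperature=float(env.get("MODEL_TEMPERATURE", "0.2")),
        max_retries=int(env.get("MODEL_MAX_RETRIES", "2")),
    )


if __name__ == "__main__":
    config = load_model_config({"MODEL_API_KEY": "demo-key"})
    print(config.model, config.base_url)
```

Accepting an explicit mapping makes the loader easy to test without touching the real environment.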

Never commit real API keys. run_shell is dangerous, so it is not auto-approved by default. Leave AGENT_AUTO_APPROVE_DANGEROUS=0 unless you are working in a trusted local demo workspace.
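A permission gate for dangerous tools might look roughly like the sketch below. Only run_shell and AGENT_AUTO_APPROVE_DANGEROUS come from this README; the DANGEROUS_TOOLS set, the check_permission function, and the ask callback are hypothetical names used for illustration.

```python
import os

# Assumption: run_shell is the only tool treated as dangerous in this sketch.
DANGEROUS_TOOLS = {"run_shell"}


def is_auto_approved(env=None) -> bool:
    """Return True only when the flag is explicitly set to "1"."""
    env = os.environ if env is None else env
    return env.get("AGENT_AUTO_APPROVE_DANGEROUS", "0") == "1"


def check_permission(tool_name: str, env=None, ask=None) -> bool:
    """Decide whether a tool call may proceed.

    Safe tools always pass. Dangerous tools pass when auto-approval is on,
    or when the optional ask(tool_name) callback returns a truthy value.
    With no callback, dangerous tools are denied.
    """
    if tool_name not in DANGEROUS_TOOLS:
        return True
    if is_auto_approved(env):
        return True
    if ask is None:
        return False
    return bool(ask(tool_name))


if __name__ == "__main__":
    print(check_permission("read_text", env={}))
    print(check_permission("run_shell", env={}))
```

Denying by default when no approval callback is supplied keeps non-interactive runs, such as evals, from executing shell commands silently.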

Run Demos

python examples/quickstart.py
python examples/research_note_agent.py
python examples/coding_agent_demo.py
python examples/lab_report_agent.py
python examples/paper_reading_agent.py

Model-backed examples need MODEL_API_KEY. The lab-report and paper-reading examples are deterministic offline workflows.

Tools

Core built-in tools from build_builtin_tools(workspace):

calculate(expression)
read_text(path)
read_text_range(path, start_line, end_line)
write_text(path, content)
replace_text(path, old, new, expected_replacements=1)
append_text(path, content)
list_dir(path=".")
grep(pattern, path=".")
git_status()
git_diff(path=".")
run_shell(command, timeout_seconds=20)

Patch-style editing is safer than whole-file rewriting because replace_text requires an exact match count. It refuses zero matches and unexpected duplicate matches, which prevents many accidental broad rewrites.
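The match-count check described above can be sketched as follows. The signature mirrors the replace_text entry in the tool list, but the body is an assumption written for illustration, and it omits the workspace path confinement a real tool would need.

```python
from pathlib import Path


def replace_text(path, old, new, expected_replacements=1):
    """Replace old with new only if it occurs exactly the expected number of times.

    Illustrative sketch; the real tool in builtin_tools.py may behave differently.
    """
    if not old:
        raise ValueError("old must be a non-empty string")

    file_path = Path(path)
    text = file_path.read_text(encoding="utf-8")

    # Count non-overlapping occurrences, matching what str.replace will change.
    found = text.count(old)
    if found != expected_replacements:
        # Covers both zero matches and unexpected duplicates.
        raise ValueError(
            f"expected {expected_replacements} match(es) of the target text, found {found}"
        )

    file_path.write_text(text.replace(old, new), encoding="utf-8")
    return found
```

Because the file is written only after the count check passes, a failed edit leaves the original content untouched.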

Skills

Skills are reusable task procedures, not executable tools. The default skill files live in skills/ and are loaded by build_default_skills().

---
name: "Research Note"
description: "Read or synthesize material into a structured research note."
triggers: ["paper", "research"]
priority: 40
allowed_tools: ["read_text", "grep"]
risky_tools: []
---
Instructions go here.

See docs/skills.md.
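To make the skill file format concrete, here is a minimal sketch of parsing the metadata block and picking a skill by trigger. It assumes every metadata value is JSON-compatible, as in the example above, that triggers match as case-insensitive substrings, and that a higher priority wins. Skill, parse_skill, and select_skill are hypothetical names; the loader in skills.py may work differently.

```python
import json
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str
    triggers: list
    priority: int
    instructions: str


def parse_skill(text: str) -> Skill:
    """Split a skill file into a metadata block and an instruction body."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with ---")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        raise ValueError("metadata block is not closed with ---") from None

    meta = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, _, raw = line.partition(":")
        # Assumption: values are JSON literals (quoted strings, lists, ints).
        meta[key.strip()] = json.loads(raw.strip())

    return Skill(
        name=meta["name"],
        description=meta.get("description", ""),
        triggers=meta.get("triggers", []),
        priority=meta.get("priority", 0),
        instructions="\n".join(lines[end + 1:]).strip(),
    )


def select_skill(skills, task: str):
    """Return the highest-priority skill whose trigger appears in the task, or None."""
    lowered = task.lower()
    matches = [s for s in skills if any(t.lower() in lowered for t in s.triggers)]
    return max(matches, key=lambda s: s.priority, default=None)
```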

Eval And Trace

Run tests:

python -m pytest

Run evals with a configured model:

python -m agent_harness.eval_runner evals/sample_tasks.jsonl

Analyze a trace:

python -m agent_harness.trace_analyzer runs/quickstart_trace.jsonl

Trace and eval outputs are generated under runs/ and are ignored by Git.
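For readers new to JSONL traces, the sketch below shows the general idea: one JSON object per line, appended as events happen, then aggregated afterwards. The event field names and the functions append_event and count_event_types are assumptions for illustration, not the schema used by trace.py or trace_analyzer.py.

```python
import json
from collections import Counter
from pathlib import Path


def append_event(trace_path, event_type, **fields):
    """Append one event as a single JSON line, creating the directory if needed."""
    record = {"type": event_type, **fields}
    path = Path(trace_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def count_event_types(trace_path):
    """Count events by their "type" field, skipping blank lines."""
    counts = Counter()
    with Path(trace_path).open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                counts[json.loads(line)["type"]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical event names, written to a scratch path under runs/.
    append_event("runs/example_trace.jsonl", "tool_call", tool="grep")
    append_event("runs/example_trace.jsonl", "model_response", tokens=12)
    print(count_event_types("runs/example_trace.jsonl"))
```

Appending line by line means a crashed run still leaves a readable partial trace.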

Project Structure

agent_harness/
  agent.py             # Agent loop
  builtin_tools.py     # Built-in workspace, edit, git, shell tools
  context.py           # Context compression and state summaries
  eval_runner.py       # JSONL eval runner
  mcp_adapter.py       # MCP-to-Tool adapter boundary
  model_client.py      # OpenAI-compatible client and mock clients
  multi_agent.py       # Manager/worker orchestration skeleton
  skills.py            # Skill model and file loader
  trace.py             # JSONL trace writer
examples/              # Demo workflows
docs/                  # Design and extension notes
skills/                # File-based default skills
tests/                 # Focused pytest coverage
