Workflow Runner

A config-driven workflow engine for automating startup operations. Define multi-step pipelines in JSON and swap prompts, tools, guardrails, and models without code changes.

Quick Start

# 1. Install dependencies
uv sync

# 2. Set up environment
cp .env.example .env
# Add your OPENAI_API_KEY to .env

# 3. Run the server
uv run python main.py

Server starts at http://localhost:8000.

Docker

cp .env.example .env
# Add your OPENAI_API_KEY to .env

docker compose up --build

API

POST /submit

Submit a workflow job. Returns immediately with a job ID.

curl -X POST http://localhost:8000/submit \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Where is my postcard? Order ORD-123",
    "workflow": "support-routing",
    "metadata": {"email": "customer@example.com"}
  }'

Response:

{"job_id": "abc-123", "status": "pending"}

GET /status/{job_id}

Poll for results. Returns the classification, the agent result, and the full audit log.

curl http://localhost:8000/status/abc-123
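
Because /submit returns immediately, a client usually polls /status/{job_id} until the job leaves `pending`. The following is a minimal sketch of such a loop using only the standard library. Only the `pending` status is documented above; the `running` status, the `fetch_status` parameter, and the timeout behaviour are assumptions made for illustration.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "http://localhost:8000"

# "pending" comes from the /submit response above; "running" is a hypothetical
# intermediate status included only to show how the set would be extended.
NON_FINAL = {"pending", "running"}


def http_get_status(job_id: str) -> dict:
    """Call GET /status/{job_id} on the running server and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE_URL}/status/{job_id}") as resp:
        return json.load(resp)


def poll_until_done(
    job_id: str,
    fetch_status: Callable[[str], dict] = http_get_status,
    interval: float = 1.0,
    timeout: float = 60.0,
) -> dict:
    """Poll until the job status is final, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        body = fetch_status(job_id)
        if body.get("status") not in NON_FINAL:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout}s")
        time.sleep(interval)
```

With the server running, `poll_until_done("abc-123")` would block until the job finishes. Passing a different `fetch_status` callable makes the loop testable without a server.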

GET /health

curl http://localhost:8000/health

Architecture

POST /submit
    |
    v
Load config (JSON)
    |
    v
Background thread runs steps sequentially:
    |
    +-- [validation]     -- deterministic input checks
    +-- [pii_detection]  -- regex-based PII redaction
    +-- [llm]            -- classifier (structured JSON output)
    +-- [guardrail]      -- rule-based gate (can escalate/fail early)
    +-- [agent]          -- LLM agent with tool calling
    +-- [action]         -- direct tool execution
    |
    v
Result saved to SQLite + audit log
    |
    v
GET /status/{job_id}  -->  client polls for result
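
The sequential flow in the diagram can be sketched as a registry of handlers keyed by step type, with the runner looping over the configured steps and stopping early when one does not succeed. This is an illustrative sketch, not the code in runner.py: the names `HANDLERS`, `register`, and `run_workflow`, the handler signature, and the `"ok"` status value are all assumptions.

```python
import time
from typing import Any, Callable

# Hypothetical registry mapping a step "type" to its handler function.
HANDLERS: dict[str, Callable[[dict, dict], dict]] = {}


def register(step_type: str):
    """Decorator that stores a handler in the registry under step_type."""
    def wrap(fn):
        HANDLERS[step_type] = fn
        return fn
    return wrap


@register("validation")
def validate(config: dict, state: dict) -> dict:
    """Deterministic empty/length checks on the raw input."""
    text = state["input"]
    if not text.strip():
        return {"status": "failed", "reason": "empty input"}
    if len(text) > config.get("max_input_length", 5000):
        return {"status": "failed", "reason": "input too long"}
    return {"status": "ok"}


def run_workflow(workflow: dict, user_input: str) -> dict:
    """Run steps in order; stop at the first step whose status is not "ok"."""
    state: dict[str, Any] = {"input": user_input}
    audit: list[dict] = []
    status = "completed"
    for step in workflow["steps"]:
        handler = HANDLERS[step["type"]]
        started = time.perf_counter()
        output = handler(step.get("config", {}), state)
        audit.append({
            "step": step["id"],
            "status": output["status"],
            "latency_ms": round((time.perf_counter() - started) * 1000, 2),
        })
        state[step["id"]] = output  # later steps can read prior outputs
        if output["status"] != "ok":
            status = output["status"]
            break
    return {"status": status, "state": state, "audit": audit}
```

Keeping handlers in a dictionary means a new step type only needs a new registered function; the loop itself does not change.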

Step Types

| Type | Purpose | Uses LLM? |
|------|---------|-----------|
| `validation` | Input length/empty checks | No |
| `pii_detection` | Regex-based PII redaction (email, phone, SSN) | No |
| `llm` | Classification, extraction, scoring | Yes |
| `guardrail` | Rule-based gate on prior step output | No |
| `agent` | Tool-calling agent (retries + fallback) | Yes |
| `action` | Direct tool call with state-driven args | No |
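
As an example of a non-LLM step, regex-based redaction of the three PII kinds listed for `pii_detection` might look like the sketch below. The patterns, placeholder labels, and the `redact_pii` name are assumptions; the project's actual expressions are not shown here and may differ.

```python
import re

# Illustrative patterns only: deliberately simple and US-centric.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact_pii(text: str) -> tuple[str, list[str]]:
    """Replace each match with a [LABEL] placeholder and list the labels that fired."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found


redacted, kinds = redact_pii("Email jane@example.com or call 555-123-4567")
print(redacted)  # Email [EMAIL] or call [PHONE]
```

Running redaction before the `llm` step means the model only ever sees the placeholders.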

Config Format

Workflows are defined in pipeline/configs/. Each config specifies an ordered list of steps:

{
  "id": "support-routing",
  "name": "Customer Support Router",
  "steps": [
    {"id": "validate", "type": "validation", "config": {"max_input_length": 5000}},
    {"id": "classify", "type": "llm", "config": {"prompt": "...", "model": "gpt-4o-mini"}},
    {"id": "check_critical", "type": "guardrail", "config": {"field": "priority", "operator": "eq", "value": "critical", "action": "escalate"}},
    {"id": "handle", "type": "agent", "config": {"system_prompt": "...", "model": "gpt-4o", "tools": ["search_faq", "save_ticket"]}}
  ],
  "guardrails": {"max_retries": 3, "fallback_model": "gpt-4o-mini"}
}
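
The `check_critical` step above compares one field of the prior step's output against a value and names an action to take on a match. A minimal sketch of that evaluation follows. Only the `eq` operator appears in the example config; the other operators, the `"continue"` action, and the `evaluate_guardrail` name are hypothetical.

```python
import operator

# Hypothetical operator table; only "eq" is confirmed by the example config.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
}


def evaluate_guardrail(config: dict, prior_output: dict) -> dict:
    """Return the configured action if the rule matches, otherwise continue."""
    compare = OPERATORS[config["operator"]]
    actual = prior_output.get(config["field"])
    # A missing field never triggers the rule in this sketch.
    if actual is not None and compare(actual, config["value"]):
        return {"triggered": True, "action": config["action"]}
    return {"triggered": False, "action": "continue"}


rule = {"field": "priority", "operator": "eq", "value": "critical", "action": "escalate"}
print(evaluate_guardrail(rule, {"priority": "critical"}))
# {'triggered': True, 'action': 'escalate'}
```

Because the rule is plain data, changing the threshold or the action is a config edit rather than a code change.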

Two configs included:

support-routing.json — customer support ticket routing (Operations pillar)
lead-qualify.json — inbound lead scoring and routing

Running Evals

uv run python -m pipeline.eval

Runs 5 golden tests across both workflows. Checks keyword presence, expected tool usage, and job completion.
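
The three checks named above (keyword presence, expected tool usage, job completion) could be expressed roughly as follows. This is a sketch under assumed data shapes: the field names `expected_keywords`, `expected_tools`, `result`, and `audit_log`, and the `check_golden` function itself, are hypothetical and may not match the fixtures in evals/.

```python
def check_golden(case: dict, result: dict) -> list[str]:
    """Return a list of failure messages; an empty list means the case passed."""
    failures = []

    # 1. Job completion
    if result.get("status") != case.get("expected_status", "completed"):
        failures.append(f"status was {result.get('status')!r}")

    # 2. Keyword presence (case-insensitive) in the final result text
    text = str(result.get("result", "")).lower()
    for keyword in case.get("expected_keywords", []):
        if keyword.lower() not in text:
            failures.append(f"missing keyword {keyword!r}")

    # 3. Expected tool usage, read from the audit log entries
    used = {entry.get("tool") for entry in result.get("audit_log", [])}
    for tool in case.get("expected_tools", []):
        if tool not in used:
            failures.append(f"tool {tool!r} was not called")

    return failures
```

Returning messages instead of a boolean makes it easier to see why a golden case failed.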

Project Structure

pipeline/
  main.py            # FastAPI server (POST /submit, GET /status, GET /health)
  runner.py          # Step-based workflow engine with handler registry
  tools.py           # Tool implementations + TOOL_REGISTRY
  db.py              # SQLite job storage with JSON audit log
  schemas.py         # Pydantic models
  eval.py            # Golden test runner
  configs/           # Workflow JSON configs
  evals/             # Test fixtures
  tests/             # Unit tests (31 tests, no LLM calls)

Demo

With the server running, execute the demo script to see all scenarios:

uv run python demo.py

Runs 4 scenarios: order tracking, PII redaction + refund, validation failure, and lead qualification. Shows classifications, results, and audit logs with color-coded output.

Running Tests

uv run pytest pipeline/tests/test_core.py -v

Covers DB operations, tools, validation, PII detection, guardrails, step handler registry, and config loading. All tests are deterministic (no API keys needed).

Observability

Set these in .env for LangSmith tracing (optional):

LANGSMITH_TRACING=true
LANGSMITH_API_KEY=ls-...

All steps are also logged to the SQLite audit log with input, output, status, and latency.
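
To illustrate what one audit row per step with input, output, status, and latency could look like, here is a small sketch using the standard `sqlite3` module. The table name, column names, and the helpers `open_audit_db` and `log_step` are assumptions; the real schema lives in db.py and may differ.

```python
import json
import sqlite3
import time


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a database with a hypothetical audit_log table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS audit_log (
               job_id TEXT, step_id TEXT, input TEXT, output TEXT,
               status TEXT, latency_ms REAL)"""
    )
    return conn


def log_step(conn, job_id, step_id, step_input, run):
    """Run one step callable, time it, and append a row to audit_log."""
    started = time.perf_counter()
    try:
        output, status = run(step_input), "ok"
    except Exception as exc:  # record the failure instead of losing it
        output, status = {"error": str(exc)}, "failed"
    latency_ms = (time.perf_counter() - started) * 1000
    conn.execute(
        "INSERT INTO audit_log VALUES (?, ?, ?, ?, ?, ?)",
        (job_id, step_id, json.dumps(step_input), json.dumps(output), status, latency_ms),
    )
    conn.commit()
    return output
```

Storing input and output as JSON text keeps the table schema stable even when step payloads differ between workflows.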
