benchmodel

Test, version, and ship prompts for any open source LLM. Runs locally.

Why Benchmodel

Local first. Connects to anything that speaks the OpenAI chat API or the native Ollama API. Models, prompts, and runs stay on your machine.
Versionable prompts. Collections live as YAML or JSON files. Commit them to Git, review them in pull requests, diff them like code.
Prompts as HTTP endpoints. Pin a default provider and model on a prompt, and call it from anywhere with POST /api/prompts/<id>/invoke. Body in, { "output": "..." } out.

Quick start

Docker

docker compose up
# open http://localhost:3737

npm

npx benchmodel
# data is stored in ~/.benchmodel/data.db
# open http://localhost:3737

Local development

pnpm install
pnpm dev
# open http://localhost:3737

Collection schema

Collections are plain YAML or JSON files. Both formats share the same shape and are validated with Zod on import.

name: "Customer support classifier"           # required, free text
description: "Tests for routing tickets"      # optional
prompts:                                      # required, at least one entry
  - name: "classify_ticket"                   # required, used in the UI
    system: "You are a classifier..."         # optional, system message
    user: "Classify this ticket: {{ticket}}"  # required, supports {{variables}}
    variables:                                # optional, default values
      ticket: "My order is late"
    assertions:                               # optional, run on every output
      - type: contains
        value: "shipping"
      - type: regex
        pattern: "ABC-\\d+"
        flags: "i"
      - type: json_schema
        schema:
          type: object
          properties:
            category: { type: string }
            priority: { type: string }
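
The {{variables}} placeholders in the user field can be pictured as simple string substitution. The following is a minimal Python sketch, not Benchmodel's actual implementation: render_prompt is a hypothetical helper, and the handling of whitespace inside the braces and of unknown placeholders is an assumption.

```python
import re

# Matches {{name}} and, as an assumption, tolerates spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template, defaults=None, overrides=None):
    """Fill {{variables}} in a template; overrides win over defaults."""
    values = {**(defaults or {}), **(overrides or {})}

    def replace(match):
        name = match.group(1)
        # Assumption: unknown placeholders are left untouched instead of raising.
        return str(values[name]) if name in values else match.group(0)

    return PLACEHOLDER.sub(replace, template)


template = "Classify this ticket: {{ticket}}"
defaults = {"ticket": "My order is late"}

print(render_prompt(template, defaults))
print(render_prompt(template, defaults, {"ticket": "Refund not received"}))
```

The second call shows a value supplied at run time taking precedence over the default stored in the collection.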

Three assertion types are supported in the MVP: contains, regex, and json_schema (validated with Ajv).
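
To make the first two assertion types concrete, here is a rough Python sketch of how contains and regex checks could be evaluated against a model output. It is an illustration under assumptions, not the project's code: check_assertion and FLAG_MAP are hypothetical names, contains is assumed to be case sensitive, and json_schema is left out because it needs a full JSON Schema validator.

```python
import re

# Assumed mapping from single-letter flags in the collection file to Python re flags.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def check_assertion(assertion, output):
    """Return True when the output satisfies one assertion dictionary."""
    kind = assertion["type"]
    if kind == "contains":
        # Assumption: plain, case-sensitive substring match.
        return assertion["value"] in output
    if kind == "regex":
        flags = 0
        for letter in assertion.get("flags", ""):
            flags |= FLAG_MAP[letter]  # unknown flag letters raise KeyError
        # Unanchored search: the pattern may match anywhere in the output.
        return re.search(assertion["pattern"], output, flags) is not None
    raise ValueError(f"unsupported assertion type: {kind}")


output = "Category: shipping, reference abc-1042"
assertions = [
    {"type": "contains", "value": "shipping"},
    {"type": "regex", "pattern": r"ABC-\d+", "flags": "i"},
]
results = [check_assertion(a, output) for a in assertions]
print(results)
```

With the "i" flag the pattern matches the lowercase reference in the sample output. Without it, the regex assertion would fail.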

See full examples in examples/collection.example.yaml and examples/collection.example.json.
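
The required fields noted in the schema comments can be checked with a few lines of code. The sketch below is only an approximation of what the import step verifies. Benchmodel itself uses Zod, whereas validate_collection here is a hypothetical stand-in, and treating empty strings as missing is an assumption.

```python
import json


def validate_collection(data):
    """Return a list of problems; an empty list means the shape looks valid."""
    problems = []
    if not isinstance(data.get("name"), str) or not data["name"]:
        problems.append("name is required")
    prompts = data.get("prompts")
    if not isinstance(prompts, list) or not prompts:
        problems.append("prompts must contain at least one entry")
        return problems
    for index, prompt in enumerate(prompts):
        if not isinstance(prompt, dict):
            problems.append(f"prompts[{index}] must be an object")
            continue
        for field in ("name", "user"):
            if not isinstance(prompt.get(field), str) or not prompt[field]:
                problems.append(f"prompts[{index}].{field} is required")
    return problems


# A JSON collection whose second prompt lacks the required name and user fields.
collection = json.loads("""
{
  "name": "Customer support classifier",
  "prompts": [
    {"name": "classify_ticket", "user": "Classify this ticket: {{ticket}}"},
    {"system": "You are a classifier..."}
  ]
}
""")
problems = validate_collection(collection)
print(problems)
```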

Calling a prompt as an API

Set the default provider and model on the prompt in the Binding section of the editor, save it, then call the endpoint.

curl -X POST http://localhost:3737/api/prompts/<id>/invoke \
  -H 'content-type: application/json' \
  -d '{"variables": {"ticket": "My order is late"}}'

Response:

{ "output": "{\"category\":\"shipping\",\"priority\":\"high\"}" }

The endpoint is open (no authentication) by design, in keeping with the local first approach. Variables in the request body override the defaults stored on the prompt.
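
The same call can be made from a script. This is a minimal sketch using only the Python standard library. build_invoke_request and invoke_prompt are hypothetical helpers, the base URL assumes the default port shown above, and sending an empty variables object when none are given is an assumption about what the endpoint accepts.

```python
import json
import urllib.request


def build_invoke_request(prompt_id, variables=None, base_url="http://localhost:3737"):
    """Build the POST request for /api/prompts/<id>/invoke without sending it."""
    body = json.dumps({"variables": variables or {}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/prompts/{prompt_id}/invoke",
        data=body,
        headers={"content-type": "application/json"},
        method="POST",
    )


def invoke_prompt(prompt_id, variables=None, base_url="http://localhost:3737"):
    """Send the request and return the "output" string from the response."""
    request = build_invoke_request(prompt_id, variables, base_url)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["output"]


# Building the request needs no running server; "my-prompt-id" is a placeholder.
request = build_invoke_request("my-prompt-id", {"ticket": "My order is late"})
print(request.get_method(), request.full_url)
```

With a Benchmodel instance running, invoke_prompt("<id>", {"ticket": "My order is late"}) would return the output string. When a prompt returns JSON, as in the response above, that string can be decoded with json.loads.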

How it compares

| Tool | Local first | Versionable prompts | Streaming Playground | Open source |
| --- | --- | --- | --- | --- |
| Benchmodel | yes | yes (YAML or JSON in Git) | yes | yes (MIT) |
| Open WebUI | yes | partial | yes | yes |
| Promptfoo | yes | yes (CLI focused) | no | yes |
| OpenRouter | no (cloud) | no | yes | no |
| LM Studio | yes | no | yes | no (proprietary) |

Roadmap

MVP (current)

Native and OpenAI compatible providers (Ollama, vLLM, llama.cpp, LM Studio, Together, Groq, and more)
Collections in YAML and JSON
Variables and three assertion types
Streaming Playground with stop and save as prompt
Per prompt HTTP invoke endpoint
Run history with model, status, and time range filters
Hover quick run on prompt cards
Dashboard with real CPU and memory info

v2

LLM as judge assertions
Function calling and tool use
Multi user with roles
OpenTelemetry tracing for runs
Embedded Git sync for collections
Eval suites (run a whole collection at once)

Contributing

We love contributions. See CONTRIBUTING.md for setup, conventions, and the PR checklist.

Look for issues labeled good first issue or help wanted to get started.

License

MIT
