A multi-provider LLM tool and SDK for Go.
Interact with large language models from Google Gemini, Anthropic Claude, and Mistral through a
unified CLI, HTTP API server, or Go SDK. Supports stateless completion (ask), stateful
multi-turn conversations (chat), embeddings, tool use, MCP integration, and a Telegram bot.
- Multi-Provider: Unified interface across Google Gemini, Anthropic Claude, and Mistral
- CLI: Interactive terminal UI chat with markdown rendering, single-shot ask, and embedding generation
- HTTP API Server: REST endpoints for models, sessions, tools, chat, ask, and embeddings with SSE streaming
- Session Management: Stateful multi-turn conversations with persistent storage (file-backed JSON)
- Tool Use: Extensible tool framework with built-in integrations for Home Assistant, NewsAPI, and WeatherAPI
- MCP Server: Model Context Protocol (stdio JSON-RPC 2.0) server exposing registered tools and prompts
- Telegram Bot: Full-featured bot with per-conversation sessions, file/image attachments, and slash commands
- Attachments: Send files, images, and URLs as context in chat and ask
- Structured Output: JSON schema-based structured output for ask
- Thinking/Reasoning: Extended thinking support with configurable budgets (Anthropic, Gemini)
- OpenTelemetry: Distributed tracing support for the HTTP API
The server provides the REST API for all providers and tools. Run it with Docker:
```sh
docker volume create go-llm
docker run -d --name go-llm \
  -v go-llm:/data -p 8085:8085 \
  -e GEMINI_API_KEY="your-key" \
  ghcr.io/mutablelogic/llm run
```

The client-only CLI can be downloaded from the releases page. It does not include the server or the Telegram bot. Point it at a running server:
```sh
export LLM_ADDR="localhost:8085"

# List providers and models
llm providers
llm models

# Single-shot completion
llm ask --model gemini-2.0-flash "Explain KV cache in two sentences"

# Interactive chat (launches terminal UI)
llm chat --model gemini-2.0-flash

# Single-shot chat with session persistence
llm chat --model gemini-2.0-flash "What is the capital of France?"

# Generate embeddings
llm embedding --model text-embedding-004 "Hello, world"
```

The Telegram bot runs as a sidecar to the server, providing chat access via Telegram. It requires a bot token from @BotFather:
```sh
docker run -d --name go-llm-telegram \
  -e LLM_ADDR="your-server:8085" \
  -e TELEGRAM_TOKEN="your-bot-token" \
  ghcr.io/mutablelogic/llm telegram --model gemini-2.0-flash
```

API keys are configured on the server via flags or environment variables:
| Provider | Flag | Env Variable | Models |
|---|---|---|---|
| Google Gemini | `--gemini-api-key` | `GEMINI_API_KEY` | Gemini 2.0 Flash, Flash Lite, embedding models, etc. |
| Anthropic Claude | `--anthropic-api-key` | `ANTHROPIC_API_KEY` | Claude Sonnet, Haiku, Opus, etc. |
| Mistral | `--mistral-api-key` | `MISTRAL_API_KEY` | Mistral Large, Small, embedding models, etc. |
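
To enable all three providers at once, pass each key through the environment when starting the server (key values below are placeholders):

```sh
GEMINI_API_KEY="..." \
ANTHROPIC_API_KEY="..." \
MISTRAL_API_KEY="..." \
llm run
```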
The following tools are included as examples of how to build tool integrations with the SDK:
| Tool | Flag | Env Variable | Description |
|---|---|---|---|
| Home Assistant | `--ha-endpoint`, `--ha-token` | `HA_ENDPOINT`, `HA_TOKEN` | Query and control smart home devices |
| NewsAPI | `--news-api-key` | `NEWS_API_KEY` | Search news articles and sources |
| WeatherAPI | `--weather-api-key` | `WEATHER_API_KEY` | Current weather and forecasts |
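
A tool becomes available once its credentials are supplied at server start. A minimal sketch for WeatherAPI (the key value is a placeholder, and the flag form assumes the usual flag-plus-value syntax):

```sh
WEATHER_API_KEY="your-weatherapi-key" llm run

# or, equivalently, via the flag
llm run --weather-api-key "your-weatherapi-key"
```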
| Flag | Env Variable | Default | Description |
|---|---|---|---|
| `--http.addr` | `LLM_ADDR` | `localhost:8085` | HTTP listen address |
| `--http.prefix` | — | `/api` | HTTP path prefix |
| `--http.timeout` | — | `15m` | HTTP read/write timeout |
| `--http.origin` | — | (same-origin) | CORS origin (`*` to allow all) |
| `--tls.name` | — | — | TLS server name |
| `--tls.cert` | — | — | TLS certificate file |
| `--tls.key` | — | — | TLS key file |
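
As a sketch, serving TLS on all interfaces with permissive CORS might look like this (certificate paths are illustrative):

```sh
llm run \
  --http.addr 0.0.0.0:8443 \
  --tls.cert /etc/ssl/certs/server.crt \
  --tls.key /etc/ssl/private/server.key \
  --http.origin "*"
```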
| Flag | Env Variable | Description |
|---|---|---|
| `--otel.endpoint` | `OTEL_EXPORTER_OTLP_ENDPOINT` | OpenTelemetry collector endpoint |
| `--otel.header` | `OTEL_EXPORTER_OTLP_HEADERS` | OpenTelemetry collector headers |
| `--otel.name` | `OTEL_SERVICE_NAME` | OpenTelemetry service name |
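
A minimal tracing setup exports to a local OTLP collector; the endpoint below is illustrative:

```sh
OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" \
OTEL_SERVICE_NAME="go-llm" \
llm run
```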
Sessions and defaults are stored in `$XDG_CACHE_HOME` (or the system temp directory if unset).
The `llm` command-line tool can connect to any running go-llm server (set `LLM_ADDR` or `--http.addr` to point at it). It provides commands for generating text, managing models and sessions, inspecting tools, and running the server itself.
| Command | Description | Example |
|---|---|---|
| `ask` | Stateless single-shot completion | `llm ask --model gemini-2.0-flash --file "*.go" "Summarize this source code"` |
| `chat` | Stateful chat (terminal UI or single-shot) | `llm chat --model gemini-2.0-flash` |
| `embedding` | Generate embedding vectors | `llm embedding --model text-embedding-004 "text"` |
| Command | Description | Example |
|---|---|---|
| `providers` | List available providers | `llm providers` |
| `models` | List models | `llm models` |
| `model` | Get model details | `llm model gemini-2.0-flash` |
| Command | Description | Example |
|---|---|---|
| `sessions` | List sessions | `llm sessions` |
| `session` | Get session details | `llm session <id>` |
| `create-session` | Create a new session | `llm create-session gemini-2.0-flash` |
| `update-session` | Update session metadata | `llm update-session <id>` |
| `delete-session` | Delete a session | `llm delete-session <id>` |
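
A typical session lifecycle using the commands above (`<id>` is whatever `create-session` or `sessions` reports):

```sh
llm create-session gemini-2.0-flash   # create a session bound to a model
llm sessions                          # list sessions to find its id
llm session <id>                      # inspect its details
llm delete-session <id>               # remove it when done
```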
| Command | Description | Example |
|---|---|---|
| `tools` | List registered tools | `llm tools` |
| `tool` | Get tool details | `llm tool <name>` |
| Command | Description | Example |
|---|---|---|
| `run` | Start the HTTP API server | `llm run` |
| `telegram` | Run as a Telegram bot | `llm telegram --model gemini-2.0-flash` |
Use `llm --help` or `llm <command> --help` for full options.
The `chat` command launches a terminal UI with markdown rendering, streaming responses, and slash commands:
| Command | Description |
|---|---|
| `/model` | Switch model |
| `/models` | List available models |
| `/providers` | List providers |
| `/session` | Show current session |
| `/sessions` | List sessions |
| `/name` | Rename current session |
| `/label` | Set session label |
| `/system` | Set system prompt |
| `/thinking` | Toggle thinking/reasoning |
| `/tools` | Toggle tool use |
| `/file` | Attach a file |
| `/url` | Attach a URL |
| `/reset` | Reset conversation |
| `/delete` | Delete current session |
| `/help` | Show help |
The overall architecture:

```mermaid
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart LR
    subgraph Clients["Clients"]
        direction TB
        CLI["`**cmd/llm**
        CLI Tool`"]
        TUI["`**pkg/ui/bubbletea**
        Terminal UI`"]
        SDK["`**pkg/httpclient**
        SDK Client`"]
        Telegram["`**pkg/ui/telegram**
        Telegram Bot`"]
    end
    subgraph Server["go-llm Server"]
        direction TB
        API["`**pkg/httphandler**
        REST API`"]
        Agent["`**pkg/agent**
        Manager`"]
        Sessions["`**pkg/session**
        Memory, File`"]
        Tools["`**pkg/tool**
        Tool Registry`"]
        MCP["`**pkg/mcp**
        MCP Server`"]
    end
    subgraph Providers["LLM Providers"]
        direction TB
        Gemini["`**pkg/provider/google**
        Google Gemini`"]
        Claude["`**pkg/provider/anthropic**
        Anthropic Claude`"]
        Mistral["`**pkg/provider/mistral**
        Mistral`"]
    end
    subgraph ToolImpls["Example Tools"]
        direction TB
        HA["`**pkg/homeassistant**
        Home Assistant`"]
        News["`**pkg/newsapi**
        NewsAPI`"]
        Weather["`**pkg/weatherapi**
        WeatherAPI`"]
    end
    CLI --> API
    TUI --> API
    SDK --> API
    Telegram --> API
    API --> Agent
    Agent --> Sessions
    Agent --> Tools
    Agent --> Gemini
    Agent --> Claude
    Agent --> Mistral
    Tools --> HA
    Tools --> News
    Tools --> Weather
    MCP --> Tools
```
The Go SDK provides interfaces for building LLM-powered applications:
```go
package main

import (
	"context"
	"fmt"
	"log"

	llm "github.com/mutablelogic/go-llm"
	"github.com/mutablelogic/go-llm/pkg/provider/google"
)

func main() {
	ctx := context.Background()

	// Create a provider
	client, err := google.New(ctx, "your-api-key")
	if err != nil {
		log.Fatal(err)
	}

	// Get a model
	model, err := client.GetModel(ctx, "gemini-2.0-flash")
	if err != nil {
		log.Fatal(err)
	}

	// Stateless completion
	gen := model.(llm.Generator)
	response, err := gen.WithoutSession(ctx, "Explain KV cache in two sentences")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(response)
}
```

The main packages:

| Package | Description |
|---|---|
| `pkg/agent` | Central manager orchestrating providers, sessions, and tools |
| `pkg/provider/{google,anthropic,mistral}` | Provider implementations |
| `pkg/session` | Session storage backends (in-memory, file-backed JSON) |
| `pkg/tool` | Tool interface and toolkit registry |
| `pkg/schema` | Core types (Model, Message, ContentBlock, Attachment, Session, etc.) |
| `pkg/httpclient` | HTTP client for the go-llm API |
| `pkg/httphandler` | HTTP handler layer for the REST API server |
| `pkg/mcp` | Model Context Protocol server (stdio JSON-RPC 2.0) |
| `pkg/ui` | Chat UI abstraction (bubbletea terminal UI, Telegram bot, shared command handler) |
The core SDK interfaces:

| Interface | Description |
|---|---|
| `Client` | Provider connection — `Name()`, `ListModels()`, `GetModel()` |
| `Generator` | Text generation — `WithoutSession()` (stateless), `WithSession()` (stateful) |
| `Embedder` | Embedding generation — `Embedding()`, `BatchEmbedding()` |
| `Downloader` | Model management — `DownloadModel()`, `DeleteModel()` |
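
Embedding generation follows the same pattern as the completion example above: assert the model to `llm.Embedder` and call `Embedding()`. The exact signature is not shown here, so this sketch (continuing inside the `main` function above) assumes it takes a context and a string and returns a vector:

```go
	// Sketch only: the Embedding() signature is an assumption, not confirmed.
	emb, ok := model.(llm.Embedder)
	if !ok {
		log.Fatal("model does not support embeddings")
	}
	vector, err := emb.Embedding(ctx, "Hello, world")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(vector)) // dimensionality of the embedding
```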
To build everything from source:

```sh
git clone https://github.com/mutablelogic/go-llm.git
cd go-llm
make
```

Build the client-only CLI (no server or Telegram support):
```sh
make llm-client
```

| Target | Description |
|---|---|
| `make all` | Build all binaries |
| `make llm-client` | Build client-only CLI (`-tags client`) |
| `make docker` | Build Docker image |
| `make docker-push` | Push Docker image to GHCR |
| `make docker-version` | Print version tag |
| `make test` | Run all tests |
| `make unit-test` | Run unit tests |
| `make coverage-test` | Run tests with coverage |
| `make tidy` | Run `go mod tidy` |
| `make clean` | Remove build artifacts |
Cross-compilation is supported via the `OS` and `ARCH` variables:
```sh
OS=linux ARCH=arm64 make
```

Please file issues and feature requests in GitHub Issues. Licensed under Apache 2.0.
