A multi-provider LLM tool and SDK for Go.
Interact with large language models from Google Gemini, Anthropic Claude, and Mistral through a
unified CLI, HTTP API server, or Go SDK. Supports stateless completion (ask), stateful
multi-turn conversations (chat), embeddings, tool use, MCP integration, and a Telegram bot.
- Multi-Provider: Unified interface across Google Gemini, Anthropic Claude, and Mistral
- CLI: Interactive terminal UI chat with markdown rendering, single-shot ask, and embedding generation
- HTTP API Server: REST endpoints for models, sessions, tools, chat, ask, and embeddings with SSE streaming
- Session Management: Stateful multi-turn conversations with persistent storage (file-backed JSON)
- Tool Use: Extensible tool framework with built-in integrations for Home Assistant, NewsAPI, and WeatherAPI
- MCP Server: Model Context Protocol (stdio JSON-RPC 2.0) server exposing registered tools and prompts
- Telegram Bot: Full-featured bot with per-conversation sessions, file/image attachments, and slash commands
- Attachments: Send files, images, and URLs as context in chat and ask
- Structured Output: JSON schema-based structured output for ask
- Thinking/Reasoning: Extended thinking support with configurable budgets (Anthropic, Gemini)
- OpenTelemetry: Distributed tracing support for the HTTP API
The server provides the REST API for all providers and tools. Run it with Docker:
```sh
docker volume create go-llm
docker run -d --name go-llm \
  -v go-llm:/data -p 8085:8085 \
  -e GEMINI_API_KEY="your-key" \
  ghcr.io/mutablelogic/llm run
```

The client-only CLI can be downloaded from the releases page. It does not include the server or the Telegram bot. Point it at a running server:
```sh
export LLM_ADDR="localhost:8085"

# List providers and models
llm providers
llm models

# Single-shot completion
llm ask --model gemini-2.0-flash "Explain KV cache in two sentences"

# Interactive chat (launches terminal UI)
llm chat --model gemini-2.0-flash

# Single-shot chat with session persistence
llm chat --model gemini-2.0-flash "What is the capital of France?"

# Generate embeddings
llm embedding --model text-embedding-004 "Hello, world"
```

The Telegram bot runs as a sidecar to the server, providing chat access via Telegram. It requires a bot token from @BotFather:
```sh
docker run -d --name go-llm-telegram \
  -e LLM_ADDR="your-server:8085" \
  -e TELEGRAM_TOKEN="your-bot-token" \
  ghcr.io/mutablelogic/llm telegram --model gemini-2.0-flash
```

API keys are configured on the server via flags or environment variables:
| Provider | Flag | Env Variable | Models |
|---|---|---|---|
| Google Gemini | `--gemini-api-key` | `GEMINI_API_KEY` | Gemini 2.0 Flash, Flash Lite, embedding models, etc. |
| Anthropic Claude | `--anthropic-api-key` | `ANTHROPIC_API_KEY` | Claude Sonnet, Haiku, Opus, etc. |
| Mistral | `--mistral-api-key` | `MISTRAL_API_KEY` | Mistral Large, Small, embedding models, etc. |
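
To enable all three providers at once, pass each key through the environment when starting the server (key values below are placeholders):

```sh
GEMINI_API_KEY="..." \
ANTHROPIC_API_KEY="..." \
MISTRAL_API_KEY="..." \
llm run
```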
The following tools are included as examples of how to build tool integrations with the SDK:
| Tool | Flag | Env Variable | Description |
|---|---|---|---|
| Home Assistant | `--ha-endpoint`, `--ha-token` | `HA_ENDPOINT`, `HA_TOKEN` | Query and control smart home devices |
| NewsAPI | `--news-api-key` | `NEWS_API_KEY` | Search news articles and sources |
| WeatherAPI | `--weather-api-key` | `WEATHER_API_KEY` | Current weather and forecasts |
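
A tool becomes available once its credentials are supplied at server start. A minimal sketch for WeatherAPI (the key value is a placeholder, and the flag form assumes the usual flag-plus-value syntax):

```sh
WEATHER_API_KEY="your-weatherapi-key" llm run

# or, equivalently, via the flag
llm run --weather-api-key "your-weatherapi-key"
```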
| Flag | Env Variable | Default | Description |
|---|---|---|---|
| `--http.addr` | `LLM_ADDR` | `localhost:8085` | HTTP listen address |
| `--http.prefix` | — | `/api` | HTTP path prefix |
| `--http.timeout` | — | `15m` | HTTP read/write timeout |
| `--http.origin` | — | (same-origin) | CORS origin (`*` to allow all) |
| `--tls.name` | — | — | TLS server name |
| `--tls.cert` | — | — | TLS certificate file |
| `--tls.key` | — | — | TLS key file |
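
As a sketch, serving TLS on all interfaces with permissive CORS might look like this (certificate paths are illustrative):

```sh
llm run \
  --http.addr 0.0.0.0:8443 \
  --tls.cert /etc/ssl/certs/server.crt \
  --tls.key /etc/ssl/private/server.key \
  --http.origin "*"
```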
| Flag | Env Variable | Description |
|---|---|---|
| `--otel.endpoint` | `OTEL_EXPORTER_OTLP_ENDPOINT` | OpenTelemetry collector endpoint |
| `--otel.header` | `OTEL_EXPORTER_OTLP_HEADERS` | OpenTelemetry collector headers |
| `--otel.name` | `OTEL_SERVICE_NAME` | OpenTelemetry service name |
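
A minimal tracing setup exports to a local OTLP collector; the endpoint below is illustrative:

```sh
OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" \
OTEL_SERVICE_NAME="go-llm" \
llm run
```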
Sessions and defaults are stored in `$XDG_CACHE_HOME` (or the system temp directory if unset).
The `llm` command-line tool can connect to any running go-llm server (set `LLM_ADDR` or `--http.addr` to point at it). It provides commands for generating text, managing models and sessions, inspecting tools, and running the server itself.
| Command | Description | Example |
|---|---|---|
| `ask` | Stateless single-shot completion | `llm ask --model gemini-2.0-flash --file "*.go" "Summarize this source code"` |
| `chat` | Stateful chat (terminal UI or single-shot) | `llm chat --model gemini-2.0-flash` |
| `embedding` | Generate embedding vectors | `llm embedding --model text-embedding-004 "text"` |
| Command | Description | Example |
|---|---|---|
| `providers` | List available providers | `llm providers` |
| `models` | List models | `llm models` |
| `model` | Get model details | `llm model gemini-2.0-flash` |
| Command | Description | Example |
|---|---|---|
| `sessions` | List sessions | `llm sessions` |
| `session` | Get session details | `llm session <id>` |
| `create-session` | Create a new session | `llm create-session gemini-2.0-flash` |
| `update-session` | Update session metadata | `llm update-session <id>` |
| `delete-session` | Delete a session | `llm delete-session <id>` |
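
A typical session lifecycle using the commands above (`<id>` is whatever `create-session` or `sessions` reports):

```sh
llm create-session gemini-2.0-flash   # create a session bound to a model
llm sessions                          # list sessions to find its id
llm session <id>                      # inspect its details
llm delete-session <id>               # remove it when done
```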
| Command | Description | Example |
|---|---|---|
| `tools` | List registered tools | `llm tools` |
| `tool` | Get tool details | `llm tool <name>` |
| Command | Description | Example |
|---|---|---|
| `run` | Start the HTTP API server | `llm run` |
| `telegram` | Run as a Telegram bot | `llm telegram --model gemini-2.0-flash` |
Use `llm --help` or `llm <command> --help` for full options.
The `chat` command launches a terminal UI with markdown rendering, streaming responses, and slash commands:
| Command | Description |
|---|---|
| `/model` | Switch model |
| `/models` | List available models |
| `/providers` | List providers |
| `/session` | Show current session |
| `/sessions` | List sessions |
| `/name` | Rename current session |
| `/label` | Set session label |
| `/system` | Set system prompt |
| `/thinking` | Toggle thinking/reasoning |
| `/tools` | Toggle tool use |
| `/file` | Attach a file |
| `/url` | Attach a URL |
| `/reset` | Reset conversation |
| `/delete` | Delete current session |
| `/help` | Show help |
The overall architecture:

```mermaid
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart LR
    subgraph Clients["Clients"]
        direction TB
        CLI["`**cmd/llm**
        CLI Tool`"]
        TUI["`**pkg/ui/bubbletea**
        Terminal UI`"]
        SDK["`**pkg/httpclient**
        SDK Client`"]
        Telegram["`**pkg/ui/telegram**
        Telegram Bot`"]
    end
    subgraph Server["go-llm Server"]
        direction TB
        API["`**pkg/httphandler**
        REST API`"]
        Agent["`**pkg/agent**
        Manager`"]
        Sessions["`**pkg/session**
        Memory, File`"]
        Tools["`**pkg/tool**
        Tool Registry`"]
        MCP["`**pkg/mcp**
        MCP Server`"]
    end
    subgraph Providers["LLM Providers"]
        direction TB
        Gemini["`**pkg/provider/google**
        Google Gemini`"]
        Claude["`**pkg/provider/anthropic**
        Anthropic Claude`"]
        Mistral["`**pkg/provider/mistral**
        Mistral`"]
    end
    subgraph ToolImpls["Example Tools"]
        direction TB
        HA["`**pkg/homeassistant**
        Home Assistant`"]
        News["`**pkg/newsapi**
        NewsAPI`"]
        Weather["`**pkg/weatherapi**
        WeatherAPI`"]
    end
    CLI --> API
    TUI --> API
    SDK --> API
    Telegram --> API
    API --> Agent
    Agent --> Sessions
    Agent --> Tools
    Agent --> Gemini
    Agent --> Claude
    Agent --> Mistral
    Tools --> HA
    Tools --> News
    Tools --> Weather
    MCP --> Tools
```
The Go SDK provides interfaces for building LLM-powered applications:
```go
package main

import (
	"context"
	"fmt"
	"log"

	llm "github.com/mutablelogic/go-llm"
	"github.com/mutablelogic/go-llm/pkg/provider/google"
)

func main() {
	ctx := context.Background()

	// Create a provider
	client, err := google.New(ctx, "your-api-key")
	if err != nil {
		log.Fatal(err)
	}

	// Get a model
	model, err := client.GetModel(ctx, "gemini-2.0-flash")
	if err != nil {
		log.Fatal(err)
	}

	// Stateless completion
	gen := model.(llm.Generator)
	response, err := gen.WithoutSession(ctx, "Explain KV cache in two sentences")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(response)
}
```

The main packages:

| Package | Description |
|---|---|
| `pkg/agent` | Central manager orchestrating providers, sessions, and tools |
| `pkg/provider/{google,anthropic,mistral}` | Provider implementations |
| `pkg/session` | Session storage backends (in-memory, file-backed JSON) |
| `pkg/tool` | Tool interface and toolkit registry |
| `pkg/schema` | Core types (Model, Message, ContentBlock, Attachment, Session, etc.) |
| `pkg/httpclient` | HTTP client for the go-llm API |
| `pkg/httphandler` | HTTP handler layer for the REST API server |
| `pkg/mcp` | Model Context Protocol server (stdio JSON-RPC 2.0) |
| `pkg/ui` | Chat UI abstraction (bubbletea terminal UI, Telegram bot, shared command handler) |
The core SDK interfaces:

| Interface | Description |
|---|---|
| `Client` | Provider connection — `Name()`, `ListModels()`, `GetModel()` |
| `Generator` | Text generation — `WithoutSession()` (stateless), `WithSession()` (stateful) |
| `Embedder` | Embedding generation — `Embedding()`, `BatchEmbedding()` |
| `Downloader` | Model management — `DownloadModel()`, `DeleteModel()` |
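
Embedding generation follows the same pattern as the completion example above: assert the model to `llm.Embedder` and call `Embedding()`. The exact signature is not shown here, so this sketch (continuing inside the `main` function above) assumes it takes a context and a string and returns a vector:

```go
	// Sketch only: the Embedding() signature is an assumption, not confirmed.
	emb, ok := model.(llm.Embedder)
	if !ok {
		log.Fatal("model does not support embeddings")
	}
	vector, err := emb.Embedding(ctx, "Hello, world")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(vector)) // dimensionality of the embedding
```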
To build everything from source:

```sh
git clone https://github.com/mutablelogic/go-llm.git
cd go-llm
make
```

Build the client-only CLI (no server or Telegram support):
```sh
make llm-client
```

| Target | Description |
|---|---|
| `make all` | Build all binaries |
| `make llm-client` | Build client-only CLI (`-tags client`) |
| `make docker` | Build Docker image |
| `make docker-push` | Push Docker image to GHCR |
| `make docker-version` | Print version tag |
| `make test` | Run all tests |
| `make unit-test` | Run unit tests |
| `make coverage-test` | Run tests with coverage |
| `make tidy` | Run `go mod tidy` |
| `make clean` | Remove build artifacts |
Cross-compilation is supported via the `OS` and `ARCH` variables:
```sh
OS=linux ARCH=arm64 make
```

Please file issues and feature requests in GitHub Issues. Licensed under Apache 2.0.
