go-llm

A multi-provider LLM tool and SDK for Go.

Interact with large language models from Google Gemini, Anthropic Claude, and Mistral through a unified CLI, HTTP API server, or Go SDK. Supports stateless completion (ask), stateful multi-turn conversations (chat), embeddings, tool use, MCP integration, and a Telegram bot.

Features

  • Multi-Provider: Unified interface across Google Gemini, Anthropic Claude, and Mistral
  • CLI: Interactive terminal UI chat with markdown rendering, single-shot ask, and embedding generation
  • HTTP API Server: REST endpoints for models, sessions, tools, chat, ask, and embeddings with SSE streaming (see the client sketch after this list)
  • Session Management: Stateful multi-turn conversations with persistent storage (file-backed JSON)
  • Tool Use: Extensible tool framework with built-in integrations for Home Assistant, NewsAPI, and WeatherAPI
  • MCP Server: Model Context Protocol (stdio JSON-RPC 2.0) server exposing registered tools and prompts
  • Telegram Bot: Full-featured bot with per-conversation sessions, file/image attachments, and slash commands
  • Attachments: Send files, images, and URLs as context in chat and ask
  • Structured Output: JSON schema-based structured output for ask
  • Thinking/Reasoning: Extended thinking support with configurable budgets (Anthropic, Gemini)
  • OpenTelemetry: Distributed tracing support for the HTTP API
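
As a quick illustration of the HTTP API, here is a minimal Go client using only the standard library. The /api/models path is an assumption based on the default /api prefix and the documented models endpoint, not a confirmed route:

package main

import (
    "fmt"
    "io"
    "log"
    "net/http"
)

func main() {
    // Assumes a server running as in the Quick Start below.
    // The exact route is an assumption; adjust to the server's actual API.
    resp, err := http.Get("http://localhost:8085/api/models")
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(string(body))
}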

Quick Start

Server

The server provides the REST API for all providers and tools. Run it with Docker:

docker volume create go-llm
docker run -d --name go-llm \
  -v go-llm:/data -p 8085:8085 \
  -e GEMINI_API_KEY="your-key" \
  ghcr.io/mutablelogic/llm run

Client

The client-only CLI can be downloaded from the releases page. It does not include the server or the Telegram bot. Point it at a running server:

export LLM_ADDR="localhost:8085"

# List providers and models
llm providers
llm models

# Single-shot completion
llm ask --model gemini-2.0-flash "Explain KV cache in two sentences"

# Interactive chat (launches terminal UI)
llm chat --model gemini-2.0-flash

# Single-shot chat with session persistence
llm chat --model gemini-2.0-flash "What is the capital of France?"

# Generate embeddings
llm embedding --model text-embedding-004 "Hello, world"

Telegram Bot

The Telegram bot runs as a sidecar to the server, providing chat access via Telegram. It requires a bot token from @BotFather:

docker run -d --name go-llm-telegram \
  -e LLM_ADDR="your-server:8085" \
  -e TELEGRAM_TOKEN="your-bot-token" \
  ghcr.io/mutablelogic/llm telegram --model gemini-2.0-flash

Server Configuration

Providers

API keys are configured on the server via flags or environment variables:

| Provider | Flag | Env Variable | Models |
|----------|------|--------------|--------|
| Google Gemini | --gemini-api-key | GEMINI_API_KEY | Gemini 2.0 Flash, Flash Lite, embedding models, etc. |
| Anthropic Claude | --anthropic-api-key | ANTHROPIC_API_KEY | Claude Sonnet, Haiku, Opus, etc. |
| Mistral | --mistral-api-key | MISTRAL_API_KEY | Mistral Large, Small, embedding models, etc. |

Tools

The following tools are included as examples of how to build tool integrations with the SDK; a minimal custom-tool sketch follows the table:

| Tool | Flag | Env Variable | Description |
|------|------|--------------|-------------|
| Home Assistant | --ha-endpoint, --ha-token | HA_ENDPOINT, HA_TOKEN | Query and control smart home devices |
| NewsAPI | --news-api-key | NEWS_API_KEY | Search news articles and sources |
| WeatherAPI | --weather-api-key | WEATHER_API_KEY | Current weather and forecasts |
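
To sketch what such an integration might look like, here is a hypothetical example. The Tool interface and method names below are assumptions for illustration only, not the real pkg/tool API:

package echotool

import (
    "context"
)

// Tool is a hypothetical contract; the real pkg/tool interface may differ.
type Tool interface {
    Name() string
    Description() string
    Run(ctx context.Context, args map[string]any) (any, error)
}

// EchoTool is a toy integration in the spirit of the bundled examples.
type EchoTool struct{}

func (EchoTool) Name() string        { return "echo" }
func (EchoTool) Description() string { return "Echoes its input back to the model" }

func (EchoTool) Run(ctx context.Context, args map[string]any) (any, error) {
    return args["text"], nil
}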

HTTP & TLS

| Flag | Env Variable | Default | Description |
|------|--------------|---------|-------------|
| --http.addr | LLM_ADDR | localhost:8085 | HTTP listen address |
| --http.prefix | | /api | HTTP path prefix |
| --http.timeout | | 15m | HTTP read/write timeout |
| --http.origin | | (same-origin) | CORS origin (* to allow all) |
| --tls.name | | | TLS server name |
| --tls.cert | | | TLS certificate file |
| --tls.key | | | TLS key file |

Observability

| Flag | Env Variable | Description |
|------|--------------|-------------|
| --otel.endpoint | OTEL_EXPORTER_OTLP_ENDPOINT | OpenTelemetry collector endpoint |
| --otel.header | OTEL_EXPORTER_OTLP_HEADERS | OpenTelemetry collector headers |
| --otel.name | OTEL_SERVICE_NAME | OpenTelemetry service name |

Sessions and defaults are stored in $XDG_CACHE_HOME (or system temp if unset).

CLI Commands

The llm command-line tool connects to any running go-llm server (set LLM_ADDR or pass --http.addr) and provides commands for generating text, managing models and sessions, inspecting tools, and running the server itself.

Generate

| Command | Description | Example |
|---------|-------------|---------|
| ask | Stateless single-shot completion | llm ask --model gemini-2.0-flash --file "*.go" "Summarize this source code" |
| chat | Stateful chat (terminal UI or single-shot) | llm chat --model gemini-2.0-flash |
| embedding | Generate embedding vectors | llm embedding --model text-embedding-004 "text" |

Model

| Command | Description | Example |
|---------|-------------|---------|
| providers | List available providers | llm providers |
| models | List models | llm models |
| model | Get model details | llm model gemini-2.0-flash |

Session

| Command | Description | Example |
|---------|-------------|---------|
| sessions | List sessions | llm sessions |
| session | Get session details | llm session <id> |
| create-session | Create a new session | llm create-session gemini-2.0-flash |
| update-session | Update session metadata | llm update-session <id> |
| delete-session | Delete a session | llm delete-session <id> |

Tool

| Command | Description | Example |
|---------|-------------|---------|
| tools | List registered tools | llm tools |
| tool | Get tool details | llm tool <name> |

Server

| Command | Description | Example |
|---------|-------------|---------|
| run | Start the HTTP API server | llm run |
| telegram | Run as a Telegram bot | llm telegram --model gemini-2.0-flash |

Use llm --help or llm <command> --help for full options.

Interactive Chat

The chat command launches a terminal UI with markdown rendering, streaming responses, and slash commands:

[Screenshot: interactive chat terminal UI]

Slash Commands

| Command | Description |
|---------|-------------|
| /model | Switch model |
| /models | List available models |
| /providers | List providers |
| /session | Show current session |
| /sessions | List sessions |
| /name | Rename current session |
| /label | Set session label |
| /system | Set system prompt |
| /thinking | Toggle thinking/reasoning |
| /tools | Toggle tool use |
| /file | Attach a file |
| /url | Attach a URL |
| /reset | Reset conversation |
| /delete | Delete current session |
| /help | Show help |

Development

Architecture

%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart LR
    subgraph Clients["Clients"]
        direction TB
        CLI["`**cmd/llm**
        CLI Tool`"]
        TUI["`**pkg/ui/bubbletea**
        Terminal UI`"]
        SDK["`**pkg/httpclient**
        SDK Client`"]
        Telegram["`**pkg/ui/telegram**
        Telegram Bot`"]
    end

    subgraph Server["go-llm Server"]
        direction TB
        API["`**pkg/httphandler**
        REST API`"]
        Agent["`**pkg/agent**
        Manager`"]
        Sessions["`**pkg/session**
        Memory, File`"]
        Tools["`**pkg/tool**
        Tool Registry`"]
        MCP["`**pkg/mcp**
        MCP Server`"]
    end

    subgraph Providers["LLM Providers"]
        direction TB
        Gemini["`**pkg/provider/google**
        Google Gemini`"]
        Claude["`**pkg/provider/anthropic**
        Anthropic Claude`"]
        Mistral["`**pkg/provider/mistral**
        Mistral`"]
    end

    subgraph ToolImpls["Example Tools"]
        direction TB
        HA["`**pkg/homeassistant**
        Home Assistant`"]
        News["`**pkg/newsapi**
        NewsAPI`"]
        Weather["`**pkg/weatherapi**
        WeatherAPI`"]
    end

    CLI --> API
    TUI --> API
    SDK --> API
    Telegram --> API
    API --> Agent
    Agent --> Sessions
    Agent --> Tools
    Agent --> Gemini
    Agent --> Claude
    Agent --> Mistral
    Tools --> HA
    Tools --> News
    Tools --> Weather
    MCP --> Tools

SDK Example

The Go SDK provides interfaces for building LLM-powered applications:

package main

import (
    "context"
    "fmt"
    "log"

    llm "github.com/mutablelogic/go-llm"
    "github.com/mutablelogic/go-llm/pkg/provider/google"
)

func main() {
    ctx := context.Background()

    // Create a provider
    client, err := google.New(ctx, "your-api-key")
    if err != nil {
        log.Fatal(err)
    }

    // Get a model
    model, err := client.GetModel(ctx, "gemini-2.0-flash")
    if err != nil {
        log.Fatal(err)
    }

    // Stateless completion
    gen := model.(llm.Generator)
    response, err := gen.WithoutSession(ctx, "Explain KV cache in two sentences")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(response)
}
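
One caveat with the example above: the bare type assertion model.(llm.Generator) panics if the model does not implement the interface. A defensive variant of that line uses the comma-ok form:

    gen, ok := model.(llm.Generator)
    if !ok {
        log.Fatal("model does not support text generation")
    }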

Key Packages

| Package | Description |
|---------|-------------|
| pkg/agent | Central manager orchestrating providers, sessions, and tools |
| pkg/provider/{google,anthropic,mistral} | Provider implementations |
| pkg/session | Session storage backends (in-memory, file-backed JSON) |
| pkg/tool | Tool interface and toolkit registry |
| pkg/schema | Core types (Model, Message, ContentBlock, Attachment, Session, etc.) |
| pkg/httpclient | HTTP client for the go-llm API |
| pkg/httphandler | HTTP handler layer for the REST API server |
| pkg/mcp | Model Context Protocol server (stdio JSON-RPC 2.0); see the sketch after this table |
| pkg/ui | Chat UI abstraction (bubbletea terminal UI, Telegram bot, shared command handler) |
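
Since the MCP server speaks JSON-RPC 2.0 over stdio, the wire format looks like the following. The envelope reflects the JSON-RPC 2.0 spec and the standard MCP tools/list method; how the go-llm MCP server is launched is not covered here:

package main

import (
    "encoding/json"
    "fmt"
)

// rpcRequest models a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
    JSONRPC string `json:"jsonrpc"`
    ID      int    `json:"id"`
    Method  string `json:"method"`
    Params  any    `json:"params,omitempty"`
}

func main() {
    // List the tools the MCP server exposes; in practice this line is
    // written to the server's stdin and the response read from stdout.
    req := rpcRequest{JSONRPC: "2.0", ID: 1, Method: "tools/list"}
    out, _ := json.Marshal(req)
    fmt.Println(string(out))
}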

Core Interfaces

| Interface | Description |
|-----------|-------------|
| Client | Provider connection: Name(), ListModels(), GetModel() |
| Generator | Text generation: WithoutSession() (stateless), WithSession() (stateful) |
| Embedder | Embedding generation: Embedding(), BatchEmbedding() |
| Downloader | Model management: DownloadModel(), DeleteModel() |
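
These interfaces suggest a capability pattern: a model value advertises what it supports by implementing the corresponding interface, and callers probe with type assertions. A self-contained sketch follows; only the method names come from the table above, while every signature here is an assumption:

package sketch

import "context"

// Hypothetical signatures modeled on the Core Interfaces table;
// the real llm package may differ.
type Generator interface {
    WithoutSession(ctx context.Context, prompt string) (string, error)
    WithSession(ctx context.Context, sessionID, prompt string) (string, error)
}

type Embedder interface {
    Embedding(ctx context.Context, text string) ([]float64, error)
    BatchEmbedding(ctx context.Context, texts []string) ([][]float64, error)
}

// Describe reports which capabilities a model value supports.
func Describe(model any) (generates, embeds bool) {
    _, generates = model.(Generator)
    _, embeds = model.(Embedder)
    return
}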

Building

git clone https://github.com/mutablelogic/go-llm.git
cd go-llm
make

Build the client-only CLI (no server or Telegram support):

make llm-client

Makefile Targets

| Target | Description |
|--------|-------------|
| make all | Build all binaries |
| make llm-client | Build client-only CLI (-tags client) |
| make docker | Build Docker image |
| make docker-push | Push Docker image to GHCR |
| make docker-version | Print version tag |
| make test | Run all tests |
| make unit-test | Run unit tests |
| make coverage-test | Run tests with coverage |
| make tidy | Run go mod tidy |
| make clean | Remove build artifacts |

Cross-compilation is supported via OS and ARCH variables:

OS=linux ARCH=arm64 make

Contributing & License

Please file issues and feature requests in GitHub Issues. Licensed under Apache 2.0.