
Budget Controls

skobeltsyn edited this page Mar 28, 2026 · 1 revision


Prevent runaway LLM loops with turn-based budgets.


Why Budgets Exist

The agentic loop works like this: the LLM reasons, calls tools, receives results, reasons again, calls more tools, and so on until it produces a final text answer. But what if it never produces that final answer?

Without a budget, an agent can:

  • Loop indefinitely, calling the same tool with slightly different arguments.
  • Burn through API credits or local compute on a single request.
  • Block a thread forever in a synchronous execution model.
  • Amplify errors: each failed tool call leads to another attempt, which fails again.

Budgets are the guardrail. They set a hard upper bound on how many times the LLM can be called within a single agent invocation.
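
Conceptually, the guardrail is just a counter that every LLM call must pass through. The sketch below illustrates the idea in plain Kotlin; the names (TurnBudget, consume) are hypothetical stand-ins, not the framework's actual implementation:

```kotlin
// Illustrative sketch of a turn budget: a counter checked before each LLM call.
// TurnBudget and consume() are hypothetical names, not the framework's API.
class TurnBudget(private val maxTurns: Int) {
    private var used = 0

    // Call once before every LLM request; throws once the budget is exhausted.
    fun consume() {
        if (used >= maxTurns) {
            throw IllegalStateException("Budget of $maxTurns turns exceeded")
        }
        used++
    }

    val turnsUsed: Int get() = used
}

fun main() {
    val budget = TurnBudget(maxTurns = 3)
    repeat(3) { budget.consume() }  // three turns succeed
    println(budget.turnsUsed)       // 3; a fourth consume() would throw
}
```

Because the check runs before the counter is incremented, the overrun is caught before the next LLM request is issued, which is exactly the behavior described for BudgetExceededException below.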


Configuration

Set a budget with the budget {} DSL block inside your agent definition:

val agent = agent<String, String>("researcher") {
    model { ollama("qwen2.5:7b") }

    budget {
        maxTurns = 10
    }

    skills {
        skill<String, String>("research", "Research a topic using tools") {
            tools("search", "summarize")
            // ... tool definitions
        }
    }
}

BudgetConfig

The DSL produces a BudgetConfig data class:

data class BudgetConfig(
    val maxTurns: Int = Int.MAX_VALUE
)

The default is Int.MAX_VALUE -- effectively unlimited. In production, you should always set an explicit limit.


BudgetExceededException

When the agent reaches its turn limit, the framework throws BudgetExceededException:

import agents_engine.core.BudgetExceededException

try {
    val result = agent("Analyze all 10,000 files in the repository")
} catch (e: BudgetExceededException) {
    println("Agent ran out of turns: ${e.message}")
    // Handle gracefully: return partial result, notify user, etc.
}

The exception is thrown before the next LLM call would happen. This means:

  • All previous tool calls have completed.
  • All previous LLM responses are intact.
  • The agent's message history is available up to the point of termination.

Catching in Pipelines

In a then pipeline, BudgetExceededException propagates like any other exception:

val pipeline = parse then analyze then summarize

try {
    pipeline(input)
} catch (e: BudgetExceededException) {
    // Which agent exceeded its budget? Check the message.
    println(e.message)  // "Agent 'analyze' exceeded budget of 10 turns"
}
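
If you want the pipeline to continue past a stage that overruns its budget, one possible pattern is to wrap individual stages with a fallback. The sketch below treats agents as plain `(String) -> String` functions and defines a local stand-in for BudgetExceededException; it is an illustration, not framework functionality:

```kotlin
// Local stand-in for the framework's exception type, for a self-contained sketch.
class BudgetExceededException(message: String) : RuntimeException(message)

// Wraps a pipeline stage so a budget overrun yields a fallback value
// instead of aborting the whole pipeline.
fun withFallback(
    fallback: String,
    stage: (String) -> String,
): (String) -> String = { input ->
    try {
        stage(input)
    } catch (e: BudgetExceededException) {
        fallback
    }
}

fun main() {
    val analyze: (String) -> String = {
        throw BudgetExceededException("Agent 'analyze' exceeded budget of 10 turns")
    }
    val safeAnalyze = withFallback("<analysis unavailable>", analyze)
    println(safeAnalyze("input"))  // <analysis unavailable>
}
```

Whether a fallback value is acceptable depends on the stage: a summarizer can often degrade gracefully, while a parser probably should fail loudly.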

Counting Turns

A turn is one LLM request-response cycle. Here is how turns map to the agentic loop:

Turn 1: LLM receives [system, user] -> returns ToolCalls([search("kotlin agents")])
         Framework executes search, appends tool result

Turn 2: LLM receives [system, user, assistant(toolcalls), tool(result)] -> returns ToolCalls([summarize(...)])
         Framework executes summarize, appends tool result

Turn 3: LLM receives [system, user, assistant, tool, assistant, tool] -> returns Text("Here is the summary...")
         Done. 3 turns used.

Key points:

  • Each call to ModelClient.chat() is one turn.
  • Multiple tool calls in a single LLM response count as one turn (the LLM made one request that happened to include multiple tool calls).
  • The final Text response also counts as a turn.
  • Tool execution itself does not count -- only the LLM call does.
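
The accounting rules above can be simulated in a few lines. In this sketch the LLM is faked as a scripted list of responses; each response costs exactly one turn regardless of how many tool calls it contains, and tool execution is free. The types here are stand-ins, not the framework's API:

```kotlin
// Minimal simulation of turn accounting. One LLM request-response cycle = one
// turn; a response carrying several tool calls still costs a single turn.
sealed interface LlmResponse {
    data class ToolCalls(val calls: List<String>) : LlmResponse
    data class Text(val text: String) : LlmResponse
}

fun runLoop(script: List<LlmResponse>, maxTurns: Int): Pair<String, Int> {
    var turns = 0
    for (response in script) {
        // Budget check happens BEFORE the next LLM call would be made.
        if (turns >= maxTurns) error("exceeded budget of $maxTurns turns")
        turns++
        when (response) {
            is LlmResponse.ToolCalls -> { /* execute tools: not counted */ }
            is LlmResponse.Text -> return response.text to turns
        }
    }
    error("script ended without a Text response")
}

fun main() {
    val script = listOf(
        LlmResponse.ToolCalls(listOf("search", "summarize")), // 2 tool calls, 1 turn
        LlmResponse.ToolCalls(listOf("search")),              // 1 turn
        LlmResponse.Text("Here is the summary..."),           // final answer, 1 turn
    )
    val (answer, turns) = runLoop(script, maxTurns = 10)
    println("$answer ($turns turns)")
}
```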

Example: Turn Counting

val agent = agent<String, String>("counter-demo") {
    model { ollama("qwen2.5:7b") }
    budget { maxTurns = 3 }

    skills {
        skill<String, String>("work", "Do work") {
            tools("step_a", "step_b")
            tool("step_a", "First step") { args -> "result_a" }
            tool("step_b", "Second step") { args -> "result_b" }
        }
    }
}

If the LLM's behavior is:

  • Turn 1: calls step_a and step_b together -> 1 turn
  • Turn 2: calls step_a again -> 1 turn
  • Turn 3: returns text "Done" -> 1 turn

Total: 3 turns, exactly at the limit. If the LLM attempted a 4th turn, the framework would throw BudgetExceededException before making the call.


Best Practices

1. Always Set a Budget in Production

// Don't do this in production
val agent = agent<String, String>("risky") {
    model { ollama("qwen2.5:7b") }
    // No budget -- defaults to Int.MAX_VALUE
    // ...
}

// Do this instead
val agent = agent<String, String>("safe") {
    model { ollama("qwen2.5:7b") }
    budget { maxTurns = 15 }
    // ...
}

2. Budget by Task Complexity

Match your budget to the expected number of tool calls:

Task Type                                   Typical Turns   Suggested Budget
Single tool call + answer                   2               3-5
Multi-step analysis (3-5 tools)             4-6             8-10
Complex research (many tools, iteration)    8-15            15-20
Open-ended exploration                      10-30           25-30

Leave headroom above the expected turns. The LLM might need an extra turn to correct a mistake or rephrase its answer.
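
Reading the table as a rule of thumb, you could sketch a helper that adds roughly 50% headroom on top of the expected turns. This is a hypothetical heuristic for illustration, not a framework feature:

```kotlin
// Rough heuristic (not part of the framework): expected tool-call turns, plus
// one turn for the final text answer, plus ~50% headroom for retries, rounded up.
fun suggestedBudget(expectedToolTurns: Int): Int {
    val expectedTotal = expectedToolTurns + 1       // the final Text response is a turn too
    return expectedTotal + (expectedTotal + 1) / 2  // ~50% headroom
}

fun main() {
    println(suggestedBudget(1))   // 3: single tool call + answer
    println(suggestedBudget(5))   // 9: multi-step analysis
    println(suggestedBudget(12))  // 20: complex research
}
```

The outputs land roughly within the suggested ranges in the table above.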

3. Separate Budgets for Nested Agents

When agents are composed, whether nested via structure {} or chained with then, each agent has its own budget. A parent agent with maxTurns = 10 does not share that budget with its children:

val researcher = agent<String, String>("researcher") {
    model { ollama("qwen2.5:7b") }
    budget { maxTurns = 20 }   // generous budget for deep research
    // ...
}

val summarizer = agent<String, String>("summarizer") {
    model { ollama("qwen2.5:7b") }
    budget { maxTurns = 3 }    // tight budget: should be quick
    // ...
}

val pipeline = researcher then summarizer
// researcher gets 20 turns, summarizer gets 3 -- independent

4. Use Low Budgets for Repair Agents

Repair agents (see Tool Error Recovery) should have tight budgets. A repair agent that loops is worse than the original error:

val jsonFixer = agent<String, String>("json-fixer") {
    model { ollama("qwen2.5:7b") }
    budget { maxTurns = 1 }    // single-shot: one LLM call, no tools
    // ...
}

5. Test Budget Boundaries

Write tests that verify your agent completes within its budget:

@Test
fun `agent completes within budget`() {
    var turnCount = 0
    val mockClient = ModelClient { messages ->
        turnCount++
        if (turnCount < 3) {
            LlmResponse.ToolCalls(listOf(ToolCall("step", emptyMap())))
        } else {
            LlmResponse.Text("done")
        }
    }

    val agent = agent<String, String>("test") {
        model { ollama("unused"); client = mockClient }
        budget { maxTurns = 5 }
        skills {
            skill<String, String>("work", "Work") {
                tools("step")
                tool("step", "A step") { "ok" }
            }
        }
    }

    val result = agent("go")
    assertEquals("done", result)
    assertEquals(3, turnCount)  // completed in 3 turns, well within budget of 5
}

@Test
fun `agent throws when budget exceeded`() {
    val mockClient = ModelClient { _ ->
        // Never returns Text -- always calls tools
        LlmResponse.ToolCalls(listOf(ToolCall("step", emptyMap())))
    }

    val agent = agent<String, String>("test") {
        model { ollama("unused"); client = mockClient }
        budget { maxTurns = 3 }
        skills {
            skill<String, String>("work", "Work") {
                tools("step")
                tool("step", "A step") { "ok" }
            }
        }
    }

    assertThrows<BudgetExceededException> {
        agent("go")
    }
}

6. Log Budget Usage

Combine budgets with Observability Hooks to track how many turns agents actually use:

var turns = 0

val agent = agent<String, String>("monitored") {
    model { ollama("qwen2.5:7b") }
    budget { maxTurns = 10 }

    onToolUse { name, args, result ->
        turns++
        println("Turn $turns: $name")
    }

    // ...
}

agent("input")
println("Total turns used: $turns")

This data helps you tune budgets over time. If an agent consistently uses 3 turns, a budget of 20 is wasteful -- tighten it to 5 to catch regressions early.
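
One simple way to turn logged counts into a tightened budget is to take the worst observed run and add a small safety margin. Again a sketch, not framework functionality:

```kotlin
// Sketch: derive a tightened budget from per-run turn counts logged via the
// observability hooks. Not a framework feature -- max observed plus a margin.
fun tunedBudget(observedTurns: List<Int>, margin: Int = 2): Int {
    require(observedTurns.isNotEmpty()) { "need at least one observed run" }
    return observedTurns.max() + margin
}

fun main() {
    val observed = listOf(3, 2, 3, 4, 3)  // turns used across recent runs
    println(tunedBudget(observed))        // 6: tight enough to catch regressions
}
```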

