GWVG/constrained-decoding

Constrained Decoding Function Calling

A constrained-decoding pipeline that turns prompts into typed function calls.

Description

This project implements a constrained-decoding pipeline for function calling. Given a user prompt and a fixed set of function signatures, the program selects the most appropriate function name and generates typed arguments using a small language model. The goal is to produce structured, machine-executable outputs while avoiding invalid tokens and malformed values.

Instructions

Installation

make install

Run

make run

Debug

make debug

Linting

make lint
make lint-strict

Algorithm explanation

The pipeline uses constrained decoding at two stages:

  1. Function selection: the model is prompted with a function list and the user request. Generation is constrained to token sequences that match known function names. At each step, only tokens that keep the prefix aligned with at least one function name are allowed, ensuring the result is always a valid function.
  2. Argument generation: for each argument, the model is prompted with the user request, function signature, and already-generated arguments. Token constraints enforce valid formats: digits and an optional minus for integers, digits plus an optional dot for floats, and a safe-token set for strings. The generator stops on JSON separators or closing braces and post-processes edge cases (e.g., regex balancing, name normalization for greetings).
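The function-selection stage can be sketched as follows. This is a minimal illustration, not the project's actual code: the vocabulary is assumed to be a plain list of token strings, `score` is a hypothetical callback standing in for the model's next-token scores, and the function names in the demo are invented.

```python
from typing import Callable, Sequence


def allowed_next_tokens(
    prefix: str, names: Sequence[str], vocab: Sequence[str]
) -> list[int]:
    """Return ids of tokens that keep prefix + token a prefix of some name."""
    allowed = []
    for token_id, token in enumerate(vocab):
        # Empty tokens are skipped so that every step makes progress.
        if token and any(name.startswith(prefix + token) for name in names):
            allowed.append(token_id)
    return allowed


def decode_function_name(
    names: Sequence[str],
    vocab: Sequence[str],
    score: Callable[[str, int], float],
) -> str:
    """Greedy decoding restricted to tokens that extend a valid name prefix.

    Note: decoding stops at the first complete name, so a name that is a
    strict prefix of another name always wins in this simplified sketch.
    """
    prefix = ""
    while prefix not in names:
        candidates = allowed_next_tokens(prefix, names, vocab)
        if not candidates:
            raise ValueError(f"no token extends prefix {prefix!r}")
        # `score` is a stand-in for the model's logit for this token.
        best = max(candidates, key=lambda token_id: score(prefix, token_id))
        prefix += vocab[best]
    return prefix


if __name__ == "__main__":
    # Hypothetical vocabulary and function names, for illustration only.
    demo_vocab = ["fn_", "add", "_numbers", "greet", "x"]
    demo_names = ["fn_add_numbers", "fn_greet"]
    # A toy scorer that prefers the invalid token "x"; masking prevents it.
    toy_score = lambda prefix, token_id: 10.0 if demo_vocab[token_id] == "x" else 1.0
    print(decode_function_name(demo_names, demo_vocab, toy_score))
```

Because the mask is applied before the arg-max, even a scorer that strongly prefers an invalid token can only ever pick a continuation of a known name.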

Design decisions

  • Prefix-constrained name decoding ensures outputs are always one of the known function names.
  • Token-level constraints reduce invalid numeric outputs and malformed strings.
  • A small, explicit state object (AppState) centralizes the LLM, functions, and token config to keep decoding logic simple.
  • A lightweight I/O layer keeps parsing and output writing testable and isolated.
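As an illustration of the token-level numeric constraints, the sketch below filters a vocabulary down to tokens that keep a partial value a valid number prefix. It follows the formats described above (optional minus for integers, optional dot for floats); the names `allowed_numeric_tokens` and `is_complete_number` are hypothetical, and a sign for floats is omitted because the description does not mention one.

```python
import re

# Text that could still grow into a valid number (prefix patterns).
PREFIX_PATTERNS = {
    "int": re.compile(r"-?[0-9]*"),
    "float": re.compile(r"(?:[0-9]+(?:\.[0-9]*)?)?"),
}
# Text that already is a complete number.
COMPLETE_PATTERNS = {
    "int": re.compile(r"-?[0-9]+"),
    "float": re.compile(r"[0-9]+(?:\.[0-9]+)?"),
}


def allowed_numeric_tokens(current: str, kind: str, vocab: list[str]) -> list[int]:
    """Return ids of tokens that keep current + token a valid number prefix."""
    pattern = PREFIX_PATTERNS[kind]
    return [
        token_id
        for token_id, token in enumerate(vocab)
        if token and pattern.fullmatch(current + token)
    ]


def is_complete_number(text: str, kind: str) -> bool:
    """Check that decoding did not stop on a dangling '-' or trailing '.'."""
    return COMPLETE_PATTERNS[kind].fullmatch(text) is not None
```

Keeping a separate completeness check is one way to catch values such as "-" or "3." that satisfy the per-token constraint but are missing digits.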

Performance analysis

  • Accuracy: Function-selection accuracy is high, since outputs are restricted to valid names and the model cannot emit a nonexistent function. Argument accuracy depends on prompt clarity and token constraints.
  • Speed: Decoding is efficient because constraints narrow the set of candidate tokens at each step and each argument is generated in its own short pass.
  • Reliability: Guardrails (type constraints, safe tokens, post-processing) reduce malformed outputs. Some edge cases can still occur for ambiguous prompts.

Challenges faced

  • Handling ambiguous prompts where multiple functions are plausible.
  • Preventing malformed numeric outputs (e.g., stray characters or missing digits).
  • Managing safe string generation without blocking useful characters.

Testing strategy

  • Static checks: flake8 and mypy in strict mode (see Makefile).
  • Runtime validation: the pipeline is run against the provided test prompts and outputs are inspected for valid structure and types.
  • Manual spot checks: inspecting edge cases such as regex inputs and negative numbers.
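The runtime validation step could be partly automated along these lines. This is a sketch under assumed data shapes: a result is taken to be a dict with `name` and `args` keys, and signatures map each function name to its argument types as strings. Neither shape is confirmed by the project's files.

```python
from typing import Any

# Type checks keyed by the (assumed) type names used in signatures.
TYPE_CHECKS = {
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "str": lambda v: isinstance(v, str),
}


def validate_call(
    call: dict[str, Any], signatures: dict[str, dict[str, str]]
) -> list[str]:
    """Return a list of problems found in one generated call (empty if valid)."""
    name = call.get("name")
    if not isinstance(name, str) or name not in signatures:
        return [f"unknown function: {name!r}"]
    expected = signatures[name]
    args = call.get("args", {})
    errors = []
    for arg, type_name in expected.items():
        if arg not in args:
            errors.append(f"missing argument: {arg}")
        elif not TYPE_CHECKS[type_name](args[arg]):
            errors.append(f"argument {arg} is not {type_name}")
    for arg in args:
        if arg not in expected:
            errors.append(f"unexpected argument: {arg}")
    return errors
```

Returning a list of messages rather than raising on the first problem makes it easier to review all issues for a prompt in one pass.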

Example usage

# Run with default input/output paths
make run

# Custom input/output
uv run python -m src --input data/input/function_calling_tests.json \
	--output data/output/function_calling_results.json

Resources

AI usage

AI assistance was used for:

  • Refactoring guidance (file structure and modularization).
  • Drafting docstrings and README sections.
  • Suggesting prompt improvements and error-handling strategies.

All changes were reviewed and integrated manually.
