Shared instructions for coding agents working in this repository.
Keep this file concise, concrete, and repo-specific. If guidance grows large, split it into referenced docs instead of turning this file into a handbook.
`AGENTS.md` is a living document.
- Update it in the same PR when repo-wide workflows, architecture, CI contracts, release processes, or durable coding defaults materially change.
- Do not edit this file for one-off task preferences.
- Keep this file as the canonical shared agent guide for this repository.
This repository contains the Langfuse Python SDK.
- `langfuse/_client/`: core SDK, OpenTelemetry integration, resource management, decorators, datasets
- `langfuse/openai.py`: OpenAI instrumentation
- `langfuse/langchain/`: LangChain integration
- `langfuse/_task_manager/`: background consumers for media and score ingestion
- `langfuse/api/`: generated Fern API client, do not hand-edit
- `tests/unit/`: deterministic local tests, no Langfuse server
- `tests/e2e/`: real Langfuse-server tests
- `tests/live_provider/`: live OpenAI / LangChain provider tests
- `tests/support/`: shared helpers for e2e tests
- `scripts/select_e2e_shard.py`: CI shard selector for `tests/e2e`
- `scripts/codex/`: Codex cloud/worktree bootstrap and shared quick checks
- Prefer small, targeted changes that preserve existing behavior.
- Do not weaken assertions just to make tests faster or greener.
- If a test is slow, first optimize setup, teardown, polling, or fixtures.
- Keep repo-shared instructions here. Keep personal or machine-specific notes out of version control.
- Keep tests independent and parallel-safe by default.
- For bug fixes, prefer writing or identifying the failing test first, confirm the failure, then implement the fix.
- For complex or ambiguous tasks, plan first, identify the likely verification path, then implement.
- Before final handoff, review the diff for correctness, regressions, missing tests, and accidental generated-file edits.
uv sync --locked
uv run pre-commit install
uv run --frozen ruff check .
uv run --frozen ruff format .
uv run --frozen mypy langfuse --no-error-summary
bash scripts/codex/quick-check.sh
Use the directory-based test split.
# Unit tests
uv run --frozen pytest -n auto --dist worksteal tests/unit
# All e2e tests that can run concurrently
uv run --frozen pytest -n 4 --dist worksteal tests/e2e -m "not serial_e2e"
# E2E tests that must run serially
uv run --frozen pytest tests/e2e -m "serial_e2e"
# Live provider tests
uv run --frozen pytest -n 4 --dist worksteal tests/live_provider -m "live_provider"
# Single test
uv run --frozen pytest tests/unit/test_resource_manager.py::test_pause_signals_score_consumer_shutdown
Minimum verification matrix:
| Change scope | Minimum verification |
|---|---|
| Docs or comments only | `uv run --frozen ruff format --check .` if Python files changed |
| Python source only | `uv run --frozen ruff check .` + `uv run --frozen mypy langfuse --no-error-summary` + targeted unit tests |
| Unit-test-only change | targeted `uv run --frozen pytest ...` for the changed tests |
| Shutdown, flushing, worker-thread, or OTEL-heavy change | targeted resource-manager/OTEL tests plus affected integration tests when relevant |
| OpenAI or LangChain instrumentation | targeted unit tests using exporter-local assertions; add e2e/live-provider coverage only when unit tests cannot cover behavior |
| Generated API client or public API contract | upstream Fern/OpenAPI regeneration path plus targeted SDK serialization/deserialization tests |
| CI, sharding, or bootstrap | relevant script test plus CI workflow review against this file's CI contract |
- Unit tests in `tests/unit` must not require a running Langfuse server.
- Prefer in-memory exporters and local fakes over real network calls.
- If tracing behavior is under test, use the shared in-memory fixtures in `tests/conftest.py`.
- Use `tests/e2e` for persisted backend behavior that genuinely needs a real Langfuse server.
- Prefer bounded polling helpers in `tests/support/` over raw `sleep()` calls.
- Use `serial_e2e` only for tests that are unsafe under shared-server concurrency.
- New e2e files should be named `tests/e2e/test_*.py`.
- Do not add `e2e_core` / `e2e_data` markers. CI shards `tests/e2e` mechanically with `scripts/select_e2e_shard.py`.
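The preference for bounded polling over raw `sleep()` calls can be sketched as follows. This is a minimal illustration only: `poll_until`, its parameters, and its defaults are hypothetical and are not the actual helpers in `tests/support/`.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], Optional[T]],
    *,
    timeout_s: float = 10.0,
    interval_s: float = 0.2,
) -> T:
    """Call fetch() until it returns a non-None value or the deadline passes.

    Hypothetical helper for illustration; not the tests/support API.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            # Fail loudly instead of letting the test hang or pass by luck.
            message = f"Condition not met within {timeout_s} seconds"
            raise TimeoutError(message)
        time.sleep(interval_s)
```

A helper like this returns as soon as the server has the data, so the test waits only as long as it must and still fails with a clear error when the deadline passes.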
- `tests/live_provider` uses real provider calls and always runs as one dedicated CI suite.
- Do not split or shard `tests/live_provider` into separate smoke and extended jobs unless the team explicitly changes that policy.
- Keep assertions focused on stable provider-facing behavior rather than brittle observation counts.
The main CI workflow currently runs:
- linting on Python 3.13
- mypy on Python 3.13
- `tests/unit` on a Python 3.10-3.14 matrix
- `tests/e2e` in 2 mechanical shards plus a serial subset inside each shard
- `tests/live_provider` as one always-on suite
- PR title validation for Conventional Commits
If you change the e2e split:
- update `scripts/select_e2e_shard.py`, not marker routing in `tests/conftest.py`
- make sure new `tests/e2e/test_*.py` files are automatically covered
- keep `serial_e2e` as the only scheduling-specific pytest marker
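One way to keep new files automatically covered is to derive shard membership from the file list itself rather than from markers. The sketch below is an assumption for illustration and does not reproduce the logic in `scripts/select_e2e_shard.py`; the function name `select_shard` and the round-robin rule are hypothetical.

```python
from typing import List, Sequence


def select_shard(test_files: Sequence[str], shard_index: int, shard_count: int) -> List[str]:
    """Return the files for one shard, round-robin over the sorted file list.

    Hypothetical sketch; not the repository's actual shard selector.
    """
    if shard_count <= 0 or not 0 <= shard_index < shard_count:
        message = f"Invalid shard {shard_index} of {shard_count}"
        raise ValueError(message)
    # Sorting makes the assignment deterministic regardless of discovery order.
    ordered = sorted(test_files)
    return [path for position, path in enumerate(ordered) if position % shard_count == shard_index]
```

Because every discovered file lands in exactly one shard, a new `tests/e2e/test_*.py` file needs no extra registration under this scheme.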
If you change CI bootstrap:
- preserve the `LANGFUSE_INIT_*` startup path for the Langfuse server unless there is a strong reason to change it
- preserve `cancel-in-progress: true`
- Keep changes scoped. Avoid unrelated refactors.
- Prefer `LANGFUSE_BASE_URL`; `LANGFUSE_HOST` is deprecated and is only kept for compatibility tests.
- If you touch `langfuse/api/`, regenerate it from the upstream Fern/OpenAPI source instead of hand-editing files.
- If you change public SDK behavior, update examples, README snippets, or generated reference docs when they would otherwise become stale.
- If you touch shutdown, flushing, or worker-thread behavior, run the relevant resource-manager and OTEL-heavy tests.
- If you change OpenAI or LangChain instrumentation, keep as much coverage as possible in `tests/unit` using exporter-local assertions, and leave only the minimal necessary coverage in `tests/e2e` / `tests/live_provider`.
- Never commit secrets or credentials.
- Keep `.env.template` in sync with required local-development environment variables.
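The `LANGFUSE_BASE_URL` preference above can be illustrated with a small resolution sketch. The function `resolve_base_url` is hypothetical and is not the SDK's own configuration code; it only shows the intended precedence between the two variables.

```python
import os
from typing import Mapping, Optional


def resolve_base_url(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Prefer LANGFUSE_BASE_URL and fall back to the deprecated LANGFUSE_HOST.

    Hypothetical helper for illustration; not the SDK's actual logic.
    """
    source = os.environ if env is None else env
    # An unset or empty LANGFUSE_BASE_URL falls through to the legacy variable.
    return source.get("LANGFUSE_BASE_URL") or source.get("LANGFUSE_HOST")
```

Passing the environment as a mapping keeps the sketch testable without mutating `os.environ`, which also helps keep tests parallel-safe.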
- Commit messages and PR titles must follow Conventional Commits: `type(scope): description` or `type: description`.
- Allowed common types include `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `build`, `ci`, `chore`, `revert`, and `security`.
- Keep commits focused and atomic.
- Before opening a PR, self-review the diff and check `code_review.md` for the repo-specific review checklist.
- In PR descriptions, list the main verification commands you ran and call out any skipped checks with the reason.
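A quick local check of a title against the two Conventional Commits shapes above might look like the sketch below. It is an assumption for illustration, not the CI validator: the guide lists these types as common ones, so the real check may accept more, and the breaking-change `!` marker is not handled here.

```python
import re

# Types named in this guide; the real CI check may allow others.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor", "perf",
    "test", "build", "ci", "chore", "revert", "security",
)

# Matches "type(scope): description" or "type: description".
TITLE_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r": (?P<description>\S.*)$"
)


def is_conventional_title(title: str) -> bool:
    """Return True if the title fits the shapes described in this guide."""
    return TITLE_PATTERN.match(title) is not None
```

For example, `is_conventional_title("fix(openai): handle empty usage")` is true, while a title without a type prefix is rejected.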
- Exception messages should not inline f-string literals in the `raise` statement. Build the message in a variable first.
- Prefer ASCII-only edits unless the file already uses Unicode or Unicode is clearly required.
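The exception-message rule above can be shown with a minimal sketch. The function `require_positive` and its message text are hypothetical and not taken from the SDK.

```python
def require_positive(value: int) -> int:
    """Return value unchanged, or raise if it is not positive (hypothetical helper)."""
    if value <= 0:
        # Build the message in a variable first, then raise with it,
        # instead of writing: raise ValueError(f"...")
        message = f"Expected a positive value, got {value}"
        raise ValueError(message)
    return value
```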
uv build --no-sources
uv run --group docs pdoc -o docs/ --docformat google --logo "https://langfuse.com/langfuse_logo.svg" langfuse
Releases are handled by GitHub Actions. Do not build an ad hoc local release flow into repository instructions.
- Prefer official documentation first when answering product or API questions.
- For OpenAI API, ChatGPT Apps SDK, or Codex questions, use the official OpenAI developer docs or Docs MCP server if available.
- Do not use destructive git commands such as `git reset --hard` unless explicitly requested.
- Do not revert unrelated working-tree changes.