Studying the gap between what agents know and when they act on it.
Updated Mar 29, 2026 · TypeScript
Canonical home for AI Behavior Science research and the Founding Territory Paper
RL-style eval measuring intent/action divergence in frontier agents: the model acknowledges a correction, then acts on the stale value anyway. 3 scenarios, 371 trials across claude-haiku-4-5, Sonnet 4.6, GPT-5.4, and Gemini 3.1 Pro Preview.
High-performance routing engine that selects the best agent skill for a task and emits structured handoff decisions.
Produces auditable token-usage and cost reports from runtime evidence, normalized usage bundles, and repository-level report sets.
Audits frontend implementations for design-system drift across CSS, Tailwind, JSX, TSX, Vue, and Angular code.
Manages durable cross-agent shared memory for stable conventions, reusable policies, and organization-wide operating rules.
Reviews and modernizes stacks, packages, SDKs, and tooling before code is written against them.
Scores and improves prompts for clarity, consistency, signal density, structure, and runtime fit.
MCP server for AI agent research: captures LLM reasoning, model identity, and feedback via schema injection.
An easy-to-integrate Unity FSM for basic enemy AI behaviors, utilizing ScriptableObject for customizable and reusable AI states like Idle, Chase, and Attack.
Audits APIs against OpenAPI, AsyncAPI, JSON Schema, protobuf, or PRD contracts to catch drift before release.
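The intent/action divergence described in the eval entry above (a model acknowledges a correction, then acts on the stale value anyway) can be illustrated with a small scoring sketch. This is not the repository's code: the `Trial` record, its field names, and the sample values are all hypothetical and exist only to show how such a divergence rate might be computed.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One hypothetical trial record; field names are illustrative assumptions."""
    stale_value: str      # value the agent was given before the correction
    corrected_value: str  # value supplied in the correction
    acknowledged: bool    # whether the model stated it accepted the correction
    acted_value: str      # value the model actually used in its action


def is_divergent(trial: Trial) -> bool:
    """True when the model acknowledged the correction but acted on the stale value."""
    return trial.acknowledged and trial.acted_value == trial.stale_value


def divergence_rate(trials: list[Trial]) -> float:
    """Fraction of trials showing intent/action divergence (0.0 when there are no trials)."""
    if not trials:
        return 0.0
    return sum(is_divergent(t) for t in trials) / len(trials)


# Made-up sample data, for illustration only.
trials = [
    Trial("us-east-1", "eu-west-1", True, "us-east-1"),   # acknowledged, acted on stale value
    Trial("us-east-1", "eu-west-1", True, "eu-west-1"),   # acknowledged, acted correctly
    Trial("us-east-1", "eu-west-1", False, "us-east-1"),  # never acknowledged, so not counted
    Trial("us-east-1", "eu-west-1", True, "us-east-1"),   # acknowledged, acted on stale value
]
print(divergence_rate(trials))
```

Counting only acknowledged corrections as divergent separates "knew and ignored" from "never registered the correction", which is the distinction the entry draws; a real harness would presumably also need a way to extract the acknowledgement and the acted-on value from model transcripts.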