5 changes: 5 additions & 0 deletions .changeset/codemode-validators.md
@@ -0,0 +1,5 @@
---
"@cloudflare/codemode": minor
---

Add pluggable runtime validators for reviewing model-generated code before execution and concrete connector calls before they run. Validation failures return bounded, model-actionable diagnostics and fail closed without executing rejected actions.
1 change: 1 addition & 0 deletions docs/codemode/index.md
@@ -186,6 +186,7 @@ sandbox: github.create_issue(args)

- [Connectors](./connectors.md) — write one class per service; MCP, OpenAPI, toolset, and custom bases
- [Runtime](./runtime.md) — both API surfaces (handle + sandbox SDK), the durable log, abort-and-replay
- [Validators](./validators.md) — reject semantically incorrect generated code and connector calls
- [Approvals](./approvals.md) — annotations, pause/resume flow, wiring an approval UI
- [Snippets](./snippets.md) — scripts the model saves and reuses
- [Vite Plugin](./vite-plugin.md) — `*.codemode.ts` discovery and Worker-entry exports
2 changes: 2 additions & 0 deletions docs/codemode/runtime.md
@@ -21,6 +21,8 @@ const runtime = createCodemodeRuntime({
});
```

Add `validators` to reject semantically incorrect generated programs before the executor starts, or to reject concrete connector calls before they execute. See [Validators](./validators.md).

| Handle method | Purpose |
| ---------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| `runtime.tool(options?)` | The single model-facing AI SDK tool, `codemode({ code })` |
151 changes: 151 additions & 0 deletions docs/codemode/validators.md
@@ -0,0 +1,151 @@
# Validators

Codemode validators let your application evaluate model-generated code before it runs. Validators are general host-side hooks: they can call a policy engine, run static analysis, ask a model to review the program, or apply application-specific rules.

Add validators to `createCodemodeRuntime`:

```ts
import {
  createCodemodeRuntime,
  type CodemodeValidator
} from "@cloudflare/codemode";

const policyValidator: CodemodeValidator = {
  name: "organization-policy",

  async validateCode({ code }) {
    const decision = await policyEngine.evaluate(code);

    return decision.allowed
      ? { valid: true }
      : {
          valid: false,
          issues: [
            {
              code: decision.code,
              message: decision.reason ?? "Rejected by organization policy."
            }
          ]
        };
  }
};

const runtime = createCodemodeRuntime({
  ctx: this.ctx,
  executor,
  connectors,
  validators: [policyValidator]
});
```

Validation is opt-in. With no validators configured, Codemode behaves as before. A validator that does not implement a particular hook does not participate at that validation point.

Every implemented hook must explicitly return `{ valid: true }` or `{ valid: false, issues? }`. Codemode fails closed if a configured hook returns nothing, returns malformed data, or throws. All participating validators must return valid before execution proceeds.

Codemode runs validators sequentially, collects issues from invalid results, and returns bounded, attributed feedback that the model can use to correct its code.
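
The sequential run described above can be pictured with a small stand-alone sketch. It is not the package's implementation: the `Issue`, `Result`, and `Validator` types are local stand-ins for the exported types, `runSequentially` and `MAX_ISSUES` are hypothetical names, and the exact feedback format and issue cap that Codemode uses are assumptions here.

```ts
// Local stand-ins for the package's types; the real definitions are exported
// from "@cloudflare/codemode" and may differ in detail.
type Issue = { message: string; code?: string; path?: string };
type Result = { valid: true } | { valid: false; issues?: Issue[] };
type Validator = {
  name: string;
  validateCode?: (ctx: { code: string }) => Promise<Result> | Result;
};

// Hypothetical cap on how many feedback lines are returned to the model.
const MAX_ISSUES = 5;

// Runs each participating validator in order. Returns undefined when every
// validator accepts, otherwise a bounded error string in which each line is
// attributed to the validator that produced it.
async function runSequentially(
  validators: readonly Validator[],
  code: string
): Promise<string | undefined> {
  const lines: string[] = [];
  let rejected = false;

  for (const validator of validators) {
    // A validator without the hook does not participate at this point.
    if (!validator.validateCode) continue;

    const result = await validator.validateCode({ code });
    if (result.valid) continue;

    rejected = true;
    const issues = result.issues ?? [];
    if (issues.length === 0) {
      // Issues are optional, so fall back to a generic message.
      lines.push(`[${validator.name}] Validation failed.`);
    }
    for (const issue of issues) {
      lines.push(`[${validator.name}] ${issue.message}`);
    }
  }

  if (!rejected) return undefined;

  // Bound the feedback so a noisy validator cannot flood the model.
  const shown = lines.slice(0, MAX_ISSUES);
  const omitted = lines.length - shown.length;
  if (omitted > 0) shown.push(`(${omitted} more issue(s) omitted)`);
  return shown.join("\n");
}
```

Prefixing each line with the validator name is one way to keep feedback attributed when several validators reject the same program.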

## Validate the generated program

`validateCode` runs before Codemode creates an execution or starts the executor. Its context contains:

- `code`: source exactly as the model supplied it;
- `normalizedCode`: source after Codemode strips fences and normalizes it to an async function;
- `connectors`: the configured connector descriptions, including methods and input schemas.

A validator can use as much or as little of this context as it needs:

```ts
const programValidator: CodemodeValidator = {
  name: "program-review",

  async validateCode({ code, normalizedCode, connectors }) {
    const issues = await reviewGeneratedProgram({
      request: currentUserRequest,
      code,
      normalizedCode,
      availableMethods: connectors
    });

    return issues.length > 0 ? { valid: false, issues } : { valid: true };
  }
};
```

When a validator rejects the program, the tool returns a `status: "error"` result with an empty execution ID. The executor does not start and the rejected program does not appear in the runtime's execution history.

If a program reviewer needs the original user request or messages, capture them in the validator closure when creating the runtime. Codemode does not prescribe a model or message representation for validators.
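
One way to capture the request is a small factory that closes over it, sketched below. The types are local stand-ins for the package's exported types, and `createRequestAwareValidator` and the `Reviewer` signature are hypothetical names introduced only for illustration.

```ts
// Local stand-ins for the package's types; the real ones may differ.
type Issue = { message: string };
type Result = { valid: true } | { valid: false; issues?: Issue[] };
type Validator = {
  name: string;
  validateCode?: (ctx: { code: string }) => Promise<Result>;
};

// Hypothetical reviewer: returns the issues it finds for a program, given
// the user request that the program is supposed to fulfil.
type Reviewer = (input: { request: string; code: string }) => Promise<Issue[]>;

// Builds a validator whose hook can see the request through the closure.
// Call this once per request, when constructing the runtime.
function createRequestAwareValidator(
  request: string,
  review: Reviewer
): Validator {
  return {
    name: "request-aware-review",

    async validateCode({ code }) {
      const issues = await review({ request, code });
      return issues.length > 0 ? { valid: false, issues } : { valid: true };
    }
  };
}
```

Because validator implementations are rebuilt with the runtime on each request, the factory can receive the current request each time without any shared mutable state.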

## Validate concrete connector calls

Use the optional `validateToolCall` hook when validation depends on evaluated arguments or current application state. It receives:

- `executionId`;
- connector and method names;
- concrete arguments;
- the method's input schema and annotations, when available.

For example, an application can reject a resource transition that is invalid for the resource's current state:

```ts
const lifecycleValidator: CodemodeValidator = {
  name: "resource-lifecycle",

  async validateToolCall({ connector, method, args }) {
    if (connector !== "resources" || method !== "transition") {
      return { valid: true };
    }

    const input = args as { id?: string; state?: string };
    const resource = await loadResource(input.id);

    if (resource.state === "deleted" && input.state === "active") {
      return {
        valid: false,
        issues: [
          {
            code: "invalid-lifecycle-transition",
            path: "state",
            message: "A deleted resource cannot transition directly to active.",
            suggestion: "Restore the resource before activating it."
          }
        ]
      };
    }

    return { valid: true };
  }
};
```

Call validation runs after the durable runtime decides that the connector will execute, but before `connector.executeTool()`. An invalid result marks the execution as failed, and the connector action does not run. Generated code cannot catch the local error and continue to later connector side effects because the durable execution is already terminal.

Applied calls served from the durable replay log are not revalidated. Ephemeral calls marked `replay: "reexecute"` are validated each time because they execute again. Approval-required calls validate after approval, immediately before the connector action.

Validator implementations are reconstructed with the runtime on each request. Codemode records call-validator names on a paused execution and refuses to resume if one is missing, so an approval handler cannot accidentally bypass the call policy that guarded the original run. Code-only validators do not need to be reconstructed for resume because they already decided whether the stored program could begin.
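
The resume guard amounts to a set difference between the recorded call-validator names and the names configured on the current runtime. The sketch below illustrates that check under simplifying assumptions: `callValidatorNames` and `missingValidatorNames` are hypothetical helpers that mirror, but are not, the package's internal functions.

```ts
// Minimal stand-in for a validator; only the fields this check needs.
type Validator = {
  name: string;
  validateToolCall?: (ctx: unknown) => unknown;
};

// Names of the validators that implement the call hook. Code-only
// validators are excluded because they are not needed for resume.
function callValidatorNames(validators: readonly Validator[] = []): string[] {
  return validators
    .filter((v) => typeof v.validateToolCall === "function")
    .map((v) => v.name);
}

// Recorded names that the current configuration can no longer satisfy.
// A non-empty result means the paused execution must not resume.
function missingValidatorNames(
  recorded: readonly string[] = [],
  configured: readonly Validator[] = []
): string[] {
  const available = new Set(callValidatorNames(configured));
  return recorded.filter((name) => !available.has(name));
}
```

Note that a validator with the right name but without a `validateToolCall` hook still counts as missing in this sketch, since it could not guard the resumed call.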

## Validation issues

Issues are optional. An invalid result without issues uses a generic validation message. Include issues when the model can use the details to correct its program:

```ts
return {
  valid: false,
  issues: [
    {
      message: "ownerId and region contain values in the wrong fields.",
      path: "body.ownerId",
      code: "swapped-fields",
      suggestion:
        "Use the account ID for ownerId and the region code for region."
    }
  ]
};
```

Validators reject code or calls; they do not transform them. Perform normalization explicitly in a connector if your API requires it. Keeping validation reject-only ensures that generated code, approval data, durable logs, and rollback arguments all describe the same call.

## Failure behavior

Configured validation hooks fail closed. If a hook throws, returns nothing, or returns malformed data, Codemode logs the original value or exception in the host and blocks the operation. The model receives a generic error that names the validator but does not include thrown details, which could contain private application data.
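
A fail-closed wrapper around a single hook might look like the following sketch. It is an illustration rather than the package's code: `isResult` and `runHookFailClosed` are hypothetical names, the `log` callback stands in for whatever host-side logging is used, and the wording of the generic message is an assumption.

```ts
type Issue = { message: string };
type Result = { valid: true } | { valid: false; issues?: Issue[] };

// Accepts only the two explicit shapes; anything else counts as malformed.
function isResult(value: unknown): value is Result {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { valid?: unknown; issues?: unknown };
  if (v.valid === true) return true;
  if (v.valid !== false) return false;
  return v.issues === undefined || Array.isArray(v.issues);
}

// Runs one hook and converts a throw, a missing return value, or malformed
// data into a rejection. Details go to the host log only; the returned
// message names the validator and nothing else.
async function runHookFailClosed(
  name: string,
  hook: () => unknown,
  log: (detail: unknown) => void
): Promise<Result> {
  const generic: Result = {
    valid: false,
    issues: [{ message: `Validator "${name}" failed to produce a decision.` }]
  };
  try {
    const value = await hook();
    if (isResult(value)) return value;
    log(value); // malformed or missing result
    return generic;
  } catch (error) {
    log(error); // thrown details may contain private application data
    return generic;
  }
}
```

Defaulting to the rejection in every branch other than a well-formed result is what makes the wrapper fail closed: a bug in a validator blocks the operation instead of silently allowing it.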

Validators should perform reads rather than side effects. Upstream APIs should still enforce transactional invariants because validation cannot prevent state from changing between a check and the eventual remote operation.
7 changes: 7 additions & 0 deletions packages/codemode/src/index.ts
@@ -18,6 +18,13 @@ export {
type JsonSchemaToolDescriptors
} from "./json-schema-types";
export { normalizeCode } from "./normalize";
export {
type CodemodeValidationIssue,
type CodemodeValidationResult,
type CodeValidationContext,
type ToolCallValidationContext,
type CodemodeValidator
} from "./validation";
export { resolveProvider } from "./resolve";
export {
truncateResponse,
76 changes: 70 additions & 6 deletions packages/codemode/src/proxy-tool.ts
@@ -44,6 +44,12 @@ import {
} from "./runtime";
import type { Snippet, SaveSnippetOptions } from "./snippet";
import type { CodeOutput } from "./shared";
import {
runCodeValidators,
runToolCallValidators,
toolCallValidatorNames,
type CodemodeValidator
} from "./validation";

// Connector annotations, flattened to "connector.method" → annotation.
type AnnotationMap = Record<string, ToolAnnotations>;
@@ -157,6 +163,8 @@ export type CreateProxyToolOptions = {
maxExecutions?: number;
/** Optionally reshape the model-facing result (e.g. truncate). */
transformResult?: TransformResult;
/** Host-side validators for generated code and concrete connector calls. */
validators?: readonly CodemodeValidator[];
};

// ---------------------------------------------------------------------------
@@ -336,7 +344,8 @@ function buildConnectorBindings(
setup: Setup,
runtime: RuntimeStub,
executionId: string,
- cursor: Cursor
+ cursor: Cursor,
validators?: readonly CodemodeValidator[]
): ConnectorBinding[] {
return setup.descriptions.map((desc) => ({
name: desc.name,
@@ -366,6 +375,22 @@ function buildConnectorBindings(
if (decision.kind === "replay") return decision.result;
if (decision.kind === "pause") return { [CONTROL_KEY]: "pause" };

const validationError = await runToolCallValidators(validators, {
executionId,
connector: desc.name,
method,
args,
inputSchema: desc.descriptors[method]?.inputSchema,
annotations: annotation
});
if (validationError) {
await runtime.fail(executionId, validationError);
return {
[CONTROL_KEY]: "error",
message: validationError
};
}

const connector = setup.connectorsByName.get(desc.name);
if (!connector) throw new Error(`Unknown connector: ${desc.name}`);
const result = await connector.executeTool(method, args, {
@@ -548,10 +573,17 @@ async function runPass(
setup: Setup,
runtime: RuntimeStub,
executor: Executor,
- transformResult?: TransformResult
+ transformResult?: TransformResult,
validators?: readonly CodemodeValidator[]
): Promise<ProxyToolOutput> {
const cursor = createCursor();
- const bindings = buildConnectorBindings(setup, runtime, executionId, cursor);
+ const bindings = buildConnectorBindings(
+ setup,
+ runtime,
+ executionId,
+ cursor,
+ validators
+ );
const platformProvider = createPlatformProvider(
setup,
bindings,
@@ -717,17 +749,31 @@ export function createProxyTool(
};
}
const setup = await getSetup();
const validationError = await runCodeValidators(options.validators, {
code,
normalizedCode: normalizeCode(code),
connectors: setup.descriptions
});
if (validationError) {
return {
status: "error",
executionId: "",
error: validationError
};
}
const executionId = await runtime.begin(code, {
maxExecutions: options.maxExecutions,
- connectors: connectors.map((c) => c.name())
+ connectors: connectors.map((c) => c.name()),
validators: toolCallValidatorNames(options.validators)
});
return runPass(
executionId,
code,
setup,
runtime,
options.executor,
- options.transformResult
+ options.transformResult,
+ options.validators
);
}
});
@@ -796,6 +842,8 @@ export type ResumeCodemodeOptions = {
maxExecutions?: number;
/** Optionally reshape the model-facing result (e.g. truncate). */
transformResult?: TransformResult;
/** Host-side validators used by the original runtime configuration. */
validators?: readonly CodemodeValidator[];
};

/** Connectors an execution/snippet recorded but the runtime no longer has. */
@@ -837,6 +885,21 @@ export async function resumeCodemode(
`configured on this runtime.`
};
}

const missingValidators = missingConnectors(
existing.validators,
new Set(toolCallValidatorNames(options.validators))
);
if (missingValidators.length > 0) {
return {
status: "error",
executionId: options.executionId,
error:
`Execution "${options.executionId}" requires validator(s) ` +
`${missingValidators.map((name) => `"${name}"`).join(", ")} that ` +
`are not configured on this runtime.`
};
}
}

const execution = await runtime.resume(options.executionId);
@@ -859,7 +922,8 @@ export async function resumeCodemode(
setup,
runtime,
options.executor,
- options.transformResult
+ options.transformResult,
+ options.validators
);
}

14 changes: 12 additions & 2 deletions packages/codemode/src/runtime-handle.ts
@@ -17,6 +17,7 @@ import {
} from "./proxy-tool";
import type { ExecutionState, PendingAction } from "./runtime";
import type { SaveSnippetOptions, Snippet } from "./snippet";
import { validateValidators, type CodemodeValidator } from "./validation";

export type CreateCodemodeRuntimeOptions = {
ctx: DurableObjectState;
@@ -44,6 +45,12 @@ export type CreateCodemodeRuntimeOptions = {
* Applies to both the initial run and a resume after approval.
*/
transformResult?: TransformResult;
/**
* Host-side validators for generated code and concrete connector calls.
* Validator implementations are transient request objects. Call-validator
* names are recorded so a paused execution cannot resume without them.
*/
validators?: readonly CodemodeValidator[];
};

export type CodemodeRuntimeToolOptions = {
@@ -127,6 +134,7 @@ class DefaultCodemodeRuntimeHandle implements CodemodeRuntimeHandle {

constructor(options: CreateCodemodeRuntimeOptions) {
validateConnectorNames(options.connectors);
validateValidators(options.validators);
this.#options = options;
}

@@ -141,7 +149,8 @@ class DefaultCodemodeRuntimeHandle implements CodemodeRuntimeHandle {
description: options?.description,
connectorHints: options?.connectorHints,
maxExecutions: this.#options.maxExecutions,
- transformResult: this.#options.transformResult
+ transformResult: this.#options.transformResult,
+ validators: this.#options.validators
});
}

@@ -153,7 +162,8 @@ class DefaultCodemodeRuntimeHandle implements CodemodeRuntimeHandle {
name: this.#options.name,
executionId: options.executionId,
maxExecutions: this.#options.maxExecutions,
- transformResult: this.#options.transformResult
+ transformResult: this.#options.transformResult,
+ validators: this.#options.validators
});
}
