Model output is a claim, not a fact.
AiNIR is a v1.0 release-candidate public demo of a semantic trust layer for inspecting AI-generated program semantics before they are lowered, handed off, or executed by a host runtime.
A model may propose a workflow. AiNIR asks whether that proposal is trustworthy enough to move forward: Are effects declared? Are capabilities minimal? Is the evidence ledger-bound? Is the runtime context trusted? Are transaction boundaries explicit? Can the draft be lowered safely?
Created by Lee Yoon Kyu under [AIOE].
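AiNIR does not treat AI-generated workflow output as fact or as an executable command right away. It is a semantic trust layer that checks whether that output has sufficient evidence, a semantic contract, capability boundaries, and a trustworthy execution context before it turns into action.
Put simply, AiNIR asks not "did the model give a plausible answer?" but "is this answer grounded well enough to be passed on to execution or handoff?"
This public repository is a v1.0 RC candidate public demo. It is not v1.0 final or a production runtime, and it does not include the private review archive or the full research record.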
AiNIR is a compact public demo of a trust boundary for AI-generated program semantics, now packaged as a v1.0 RC candidate for bounded public review and external evaluation.
flowchart TD
A["AI-generated draft"] --> B["Strict Draft AST"]
B --> C["Safety Registry"]
C --> D["Evidence Ledger"]
D --> E["Operation / Effect / Capability Gates"]
E --> F["Trusted Context + Transaction Binding"]
F --> G["Trust Gate"]
G --> H{"Decision"}
H -->|passed| I["TrustReceipt"]
H -->|refused / invalid| J["Refusal Report"]
I --> K["Replay Check"]
I --> L["Lowering Eligibility"]
L --> M["Host Enforcement Skeleton"]
Pipeline summary:
| Stage | What AiNIR checks |
|---|---|
| AI-generated draft | Treats model output as a semantic claim, not executable truth. |
| Strict Draft AST | Rejects malformed or prose-shaped workflow drafts. |
| Registries and evidence | Checks safety registry, evidence ledger, operation specs, effects, and capabilities. |
| Trusted context and transaction binding | Uses host-provided context and explicit transaction boundaries. |
| Trust Gate decision | Issues a pass/refusal decision with failed gates and reasons. |
| TrustReceipt and replay | Records replayable decisions before lowering eligibility. |
| Lowering skeleton | Allows only safe, lowerable workflows to produce a host-enforcement TypeScript skeleton. |
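The staged checks in the table can be pictured as a chain of gates whose failures are collected into one decision. The following is a minimal sketch under stated assumptions: the `Gate` signature, the `Decision` dataclass, the two example gates, and the draft fields `operations`, `effects`, and `evidence_ref` are all hypothetical and are not taken from AiNIR's code.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical gate signature: return None on success, or a reason string on failure.
Gate = Callable[[dict], Optional[str]]


@dataclass
class Decision:
    status: str
    failed_gates: list = field(default_factory=list)
    reasons: list = field(default_factory=list)


def run_gates(draft: dict, gates: list) -> Decision:
    """Run every gate, collect all failures, and refuse if any gate failed."""
    failed, reasons = [], []
    for name, gate in gates:
        reason = gate(draft)
        if reason is not None:
            failed.append(name)
            reasons.append(reason)
    if failed:
        return Decision("refused", failed, reasons)
    return Decision("passed")


# Illustrative gates only; real gates would consult registries and ledgers.
def effects_declared(draft: dict) -> Optional[str]:
    if all("effects" in op for op in draft.get("operations", [])):
        return None
    return "every operation must declare its effects"


def evidence_bound(draft: dict) -> Optional[str]:
    if draft.get("evidence_ref"):
        return None
    return "verified claims require ledger-bound evidence"


GATES = [("effect_contract", effects_declared), ("evidence_ledger", evidence_bound)]

draft = {"operations": [{"name": "create_user"}]}
decision = run_gates(draft, GATES)
print(decision.status, decision.failed_gates)
```

Collecting every failed gate, rather than stopping at the first, is one way to produce a report that lists all reasons for a refusal at once.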
The public demo includes drafts that should be refused and one safe workflow that can be lowered into a host-enforcement TypeScript skeleton.
| Example | What it checks | Expected |
|---|---|---|
| password_reset_raw_token_blocked | synthetic secret persistence marker | refused |
| order_payment_real_payment_blocked | irreversible financial-effect marker | refused |
| pii_export_raw_pii_blocked | unprotected PII marker | refused |
| account_deletion_hard_delete_blocked | irreversible deletion marker | refused |
| create_user_outbox_safe | transaction-bound outbox pattern | passed + lowerable |
This repository is a v1.0 RC candidate public demo. It remains a conservative pre-v1-to-v1 transition package while external review and final scope governance are pending.
It is not:
- a v1.0 final release;
- a production compiler;
- a production execution runtime;
- a replacement for host-level security controls;
- the full private research archive.
The larger private review package, extended workflow suite, private reports, and enterprise policy packs are intentionally not included here.
AiNIR does not claim to verify arbitrary AI-generated code semantics today.
This public RC candidate demonstrates a narrower, testable claim:
- model-generated workflow drafts are treated as semantic claims;
- known workflow profiles are checked against registered evidence, effects, capabilities, operation contracts, trusted context, transaction boundaries, and lowering gates;
- unknown workflows are refused instead of guessed;
- every public pass/refusal path is covered by negative conformance cases, golden traces, and TrustReceipt replay.
The longer-term infrastructure path is profile-based: workflow profiles, canonical effect contracts, external evidence providers, registry versioning, and consumer conformance packs. AiNIR becomes more useful by making those profiles governable, not by pretending that one demo registry covers every enterprise workflow.
The public RC candidate is intentionally closed-world. It recognizes a small workflow registry and refuses unknown workflows instead of guessing their semantics.
This is a demo-safety boundary, not a claim that AiNIR can verify every
enterprise workflow today. Production use would require workflow registry
governance, external evidence providers, canonical effect taxonomies, registry
snapshot management, and profile-specific conformance packs. See
docs/positioning_and_scope.md,
docs/v1_known_limitations.md, and
docs/v1_roadmap.md.
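The closed-world behavior described above can be illustrated with a lookup that refuses anything it does not recognize. This is only a sketch: the `KNOWN_WORKFLOWS` mapping, its entries, and the `resolve_workflow` helper are hypothetical and do not reflect AiNIR's actual registry format.

```python
# Hypothetical registry: names and fields are illustrative, not AiNIR's format.
KNOWN_WORKFLOWS = {
    "create_user_outbox": {"requires_transaction": True},
    "password_reset": {"requires_transaction": False},
}


def resolve_workflow(name: str) -> dict:
    """Return the registered profile, or an explicit refusal for unknown names."""
    profile = KNOWN_WORKFLOWS.get(name)
    if profile is None:
        # Closed world: no fallback, no fuzzy matching, no guessed semantics.
        return {"status": "refused", "reason": f"unknown workflow: {name!r}"}
    return {"status": "known", "profile": profile}


print(resolve_workflow("create_user_outbox")["status"])
print(resolve_workflow("wire_transfer")["status"])
```

The design point is that the default branch is a refusal, so adding coverage means registering a profile rather than loosening the matcher.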
Run from the repository root. The demo writes reports to your OS temp directory so the checkout stays clean.
macOS / Linux
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python -m ainir demo --out-dir "${TMPDIR:-/tmp}/ainir_demo_results"
Windows PowerShell
python -m venv .venv
. .venv\Scripts\Activate.ps1
pip install -e ".[dev]"
python -m ainir demo --out-dir "$env:TEMP\ainir_demo_results"
Expected result:
AiNIR public demo: passed
- account_deletion_hard_delete_blocked: blocked (10 critical)
- create_user_outbox_safe: passed (0 critical)
- order_payment_real_payment_blocked: blocked (15 critical)
- password_reset_raw_token_blocked: blocked (11 critical)
- pii_export_raw_pii_blocked: blocked (17 critical)
Run the local release-readiness simulation:
python scripts/run_phase26_private_trial.py
Run focused checks. On Windows PowerShell, replace /tmp/... with $env:TEMP\..., or set AINIR_TEMP_ROOT before running scripts.
python -m ainir negative-conformance-eval --out-dir /tmp/ainir_negative_conformance
python -m ainir golden-trace-eval --out-dir /tmp/ainir_golden_traces
python scripts/run_prelaunch_check.py
python scripts/run_release_candidate_review.py
Lower the safe outbox example:
python -m ainir lower examples/create_user_outbox_safe/draft.yaml \
  --out-dir /tmp/ainir_lowering_check
On Windows PowerShell:
python -m ainir lower examples/create_user_outbox_safe/draft.yaml `
  --out-dir "$env:TEMP\ainir_lowering_check"
AiNIR is not a JSON schema validator.
A schema can check whether a model output has the right shape. AiNIR checks whether the claimed program semantics are eligible to move toward lowering or handoff.
That means AiNIR looks beyond field presence. It checks evidence bindings, safety-critical effects, capability contracts, operation specs, trusted execution context, transaction boundaries, lowering eligibility, and replayable trust receipts.
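To make the difference concrete, the sketch below shows a draft that passes a shape check yet still carries a claim with no evidence behind it. Everything here is an assumption for illustration: the field names (`workflow`, `claims`, `verified`, `evidence_id`), the `LEDGER` contents, and both helper functions are hypothetical, not AiNIR's schema or API.

```python
REQUIRED_FIELDS = {"workflow", "claims"}

# Hypothetical evidence ledger keyed by evidence id.
LEDGER = {"ev-001": {"claim": "email_is_hashed"}}


def has_valid_shape(draft: dict) -> bool:
    """Schema-style check: the required fields exist and have the right type."""
    return REQUIRED_FIELDS.issubset(draft) and isinstance(draft["claims"], list)


def unbound_claims(draft: dict) -> list:
    """Semantic check: names of 'verified' claims the ledger does not back."""
    problems = []
    for claim in draft["claims"]:
        if not claim.get("verified"):
            continue
        entry = LEDGER.get(claim.get("evidence_id"))
        if entry is None or entry["claim"] != claim["name"]:
            problems.append(claim["name"])
    return problems


draft = {
    "workflow": "create_user",
    "claims": [
        {"name": "email_is_hashed", "verified": True, "evidence_id": "ev-001"},
        # Self-attested: marked verified, but no ledger entry is referenced.
        {"name": "token_not_persisted", "verified": True},
    ],
}

print(has_valid_shape(draft))   # True
print(unbound_claims(draft))    # ['token_not_persisted']
```

The draft is well-formed, so a schema validator would accept it, while the evidence check flags the second claim.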
AiNIR is a step toward warranted AI execution: deciding when a model-generated claim has enough evidence, semantic contract support, capability discipline, trusted context, and transaction binding to move toward handoff or execution.
Guardrails often ask whether an output is allowed. Evals often ask whether an output was good. AiNIR asks whether an AI-generated semantic claim is warranted enough to become an action.
In this public RC candidate, that warrant remains bounded to the registered demo workflows, bundled evidence, and explicit non-production scope.
AiNIR sits near runtime verification for LLM agents, evidence-gated execution, semantic contracts for AI agents, pre-execution validation, agent governance, tool contracts, policy-as-code, and workflow conformance testing.
It differs from output guardrails and general evals by focusing on the claim-to-action boundary: whether a model-generated workflow claim is supported well enough to move toward lowering, handoff, or execution.
Unsafe drafts do not become executable artifacts. A refused draft produces a decision artifact like this:
{
"status": "refused",
"executable": false,
"lowering_allowed": false,
"handoff_allowed": false,
"failed_gates": [
"evidence_ledger",
"capability_contract"
],
"reasons": [
{
"rule_id": "EVIDENCE_SELF_ATTESTED",
"severity": "critical",
"message": "Verified claims require ledger-bound evidence."
}
]
}
A passed decision can issue a TrustReceipt. The receipt can be replayed against the same draft, registry, context, and verifier report to check that the decision is reproducible.
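One way to think about replay is to bind the receipt to digests of its inputs and compare them later. The sketch below assumes a receipt shape and helper names (`canonical_digest`, `issue_receipt`, `replay_matches`) that are hypothetical; the actual TrustReceipt format is not described here, and a real replay would presumably also re-run the gates rather than only compare inputs.

```python
import hashlib
import json


def canonical_digest(obj) -> str:
    """SHA-256 over a key-sorted, whitespace-free JSON encoding."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def issue_receipt(draft, registry, context, report) -> dict:
    """Hypothetical receipt: records a digest for each decision input."""
    inputs = {"draft": draft, "registry": registry, "context": context, "report": report}
    return {
        "status": "passed",
        "input_digests": {name: canonical_digest(value) for name, value in inputs.items()},
    }


def replay_matches(receipt, draft, registry, context, report) -> bool:
    """True when the supplied inputs hash to the digests stored in the receipt."""
    replayed = issue_receipt(draft, registry, context, report)
    return replayed["input_digests"] == receipt["input_digests"]


draft = {"workflow": "create_user_outbox_safe"}
registry = {"version": "demo"}
context = {"env": "test"}
report = {"critical": 0}

receipt = issue_receipt(draft, registry, context, report)
print(replay_matches(receipt, draft, registry, context, report))                   # True
print(replay_matches(receipt, draft, {"version": "changed"}, context, report))     # False
```

Sorting keys before hashing keeps the digest stable across dictionaries that differ only in insertion order.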
- Strict intake: malformed YAML, prose-shaped sections, hidden fields, and undeclared operation effects are refused.
- Evidence discipline: verified claims must bind to the bundled evidence ledger; draft self-attestation is not enough.
- Operation contracts: workflow roles must come from registered operation specs, not keyword guesses.
- Effect and capability boundaries: operations cannot add effects or capabilities outside their contract.
- Trusted context: draft.environment is untrusted metadata; policy evaluation uses host-provided context.
- Transaction binding: transaction-required workflows must declare ordered, contiguous transaction boundaries.
- Lowering gate: blocked, invalid, stale, or hole-containing drafts cannot lower.
- TrustReceipt replay: trust decisions can be replayed against the same draft, registry, context, and verifier report.
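Two of the checks in this list lend themselves to a short illustration: effect containment and contiguous transaction boundaries. The sketch below is an interpretation, not AiNIR's implementation: the helper names, the effect strings, and the `in_transaction` step flag are hypothetical, and "contiguous" is read here as "the transactional steps form one unbroken run".

```python
def effects_outside_contract(declared: set, contract: set) -> set:
    """Effects an operation declares beyond its contract; empty means contained."""
    return declared - contract


def transaction_is_contiguous(steps: list) -> bool:
    """True when the steps flagged in_transaction form one unbroken run."""
    flagged = [i for i, step in enumerate(steps) if step.get("in_transaction")]
    if not flagged:
        return False  # a transaction-required workflow must mark at least one step
    return flagged == list(range(flagged[0], flagged[-1] + 1))


steps_ok = [
    {"op": "validate"},
    {"op": "insert_user", "in_transaction": True},
    {"op": "write_outbox", "in_transaction": True},
    {"op": "respond"},
]
steps_split = [
    {"op": "insert_user", "in_transaction": True},
    {"op": "send_email"},  # breaks the transactional run
    {"op": "write_outbox", "in_transaction": True},
]

print(transaction_is_contiguous(steps_ok))      # True
print(transaction_is_contiguous(steps_split))   # False
print(effects_outside_contract({"db.write", "email.send"}, {"db.write"}))  # {'email.send'}
```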
AiNIR includes an optional VerifiedIntentPacket export surface for future
verified-intent consumers. In this public demo it is a contract slot, not an
integration.
The export surface must not add meaning that AiNIR did not verify. Concrete downstream schema grounding remains a consumer obligation.
Start with:
- START_HERE.md
- docs/README.md
- docs/v1_rc_candidate.md
- docs/v1_rc_scope.md
- docs/pre_v1_status.md
- docs/public_private_boundary.md
- docs/trust_gate.md
- docs/trust_receipt_persistence.md
- docs/negative_conformance_corpus.md
- docs/golden_traces.md
- docs/verified_intent_packet.md
- docs/workflow_registry_extension.md
- docs/evidence_provider_interface.md
- docs/effect_taxonomy_and_canonical_effects.md
- docs/trust_receipt_registry_evolution.md
- docs/v1_roadmap.md
For implementation history, see the phase-specific documents under docs/.
This repository is a conservative v1.0 RC candidate public demo. It is not a v1.0 final release, not a production runtime, and not a replacement for host-level security controls.
Public scope is intentionally bounded while external review and profile-governance work remain pending.
- Author / maintainer: Lee Yoon Kyu
- Organization / project studio: [AIOE]
- License: Apache-2.0
See AUTHORS.md, NOTICE, and docs/github_repo_settings.md before publishing.