Graph-native capability reachability analysis for agentic AI systems.
Agentic AI systems compose LLM planners, MCP servers, tools, retrieval systems, and external APIs into complex capability topologies. CogniGraph models these as a directed graph and answers a single question:
What dangerous capabilities can low-trust context reach?
It takes a YAML fixture describing your system's agents, tools, MCP servers, capabilities, and resources, builds a privilege graph, and runs detection rules to find dangerous paths, like untrusted web content reaching secret access, or a single agent having both secret-read and network-send capabilities.
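The reachability question can be illustrated with a small breadth-first search. This is a minimal sketch, not CogniGraph's implementation: the graph is hard-coded from the demo finding path shown in the report further down, `docs_search_tool` and `DocsSearch` are hypothetical additions, and the thresholds mirror the documented defaults (trust <= 1, severity >= 3).

```python
from collections import deque

# Toy privilege graph. The SecretRead path comes from the demo report;
# docs_search_tool and DocsSearch are hypothetical.
EDGES = {
    "external_webpage": ["research_planner"],
    "research_planner": ["filesystem_tool", "docs_search_tool"],
    "filesystem_tool": ["SecretRead"],
    "docs_search_tool": ["DocsSearch"],
}
TRUST = {"external_webpage": 0}                # context sources only
SEVERITY = {"SecretRead": 4, "DocsSearch": 1}  # capabilities only


def dangerous_paths(low_trust_max=1, critical_severity=3):
    """Return one shortest path from each low-trust source to each critical capability."""
    found = []
    for source, trust in TRUST.items():
        if trust > low_trust_max:
            continue  # not a low-trust source
        queue = deque([[source]])
        seen = {source}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if SEVERITY.get(node, 0) >= critical_severity:
                found.append(path)
            for nxt in EDGES.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
    return found


for path in dangerous_paths():
    print(" -> ".join(path))
```

Only the `SecretRead` path is reported; the low-severity `DocsSearch` capability is reachable but does not cross the threshold.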
- Python 3.11+
- uv (Python package manager)
- Docker (optional, for Neo4j integration)
# Clone and install
git clone <repo-url> && cd Hound-AI
uv sync
# Run the five-minute MVP demo
uv run cognigraph examples/rag_mcp_vulnerable.yaml --html-report report.html --findings-json findings.json

This loads a vulnerable RAG/MCP fixture, builds the graph, runs all detection rules, writes a static HTML report, and exports structured JSON findings. Open report.html to inspect the risky paths, node metadata, and recommended controls.
The CLI also prints a text report:
CogniGraph Analysis Report
==================================================
Total findings: 11
CRITICAL: 7
HIGH: 4
==================================================
--- Finding 1 ---
[R001] [CRITICAL] Low-trust context reaches critical capability
Low-trust context 'external_webpage' can reach capability 'SecretRead' (severity 4)
Path: external_webpage -> research_planner -> filesystem_tool -> SecretRead
Recommended control: Restrict low-trust context from reaching this capability...
...
The examples/ directory contains the core demo fixtures:
| Fixture | Purpose | Expected result |
|---|---|---|
| examples/rag_mcp_vulnerable.yaml | Low-trust web/RAG context reaches secret, repository, and network capabilities | R001, R002, R003, and R005 findings |
| examples/rag_mcp_unannotated.yaml + examples/manual_tool_annotations.yaml | Same topology as the vulnerable demo, with tool capabilities supplied through a separate annotation file | Same findings as the inline vulnerable fixture |
| examples/least_privilege_safe.yaml | Low-trust input can only reach low-severity documentation search over low-sensitivity public docs | No findings |
| examples/overexposed_mcp.yaml | Three agents can invoke one MCP-backed critical capability with threshold set to 3 | R004 finding |
A safe fixture means no low-trust context can reach severity >= 3 capabilities, no low-trust context can reach sensitivity >= 3 resources, no agent can compose a configured dangerous capability pair, no MCP server exceeds the overexposure threshold, and no higher-trust agent consuming low-trust context has downstream critical capability access.
# Print findings report to stdout
uv run cognigraph examples/rag_mcp_vulnerable.yaml
# Export graph as Graphviz DOT (finding paths highlighted in red)
uv run cognigraph examples/rag_mcp_vulnerable.yaml --export-dot graph.dot
# Export graph as JSON
uv run cognigraph examples/rag_mcp_vulnerable.yaml --export-json graph.json
# Export findings as structured JSON
uv run cognigraph examples/rag_mcp_vulnerable.yaml --findings-json findings.json
# Apply manual tool capability annotations to a fixture
uv run cognigraph examples/rag_mcp_unannotated.yaml --annotations examples/manual_tool_annotations.yaml
# Preview: infer tool capabilities from tool IDs/descriptions with deterministic keyword rules
uv run cognigraph examples/rag_mcp_unannotated.yaml --infer-capabilities
# Export a static HTML report with finding paths and node metadata
uv run cognigraph examples/rag_mcp_vulnerable.yaml --html-report report.html
# Preview: overlay a CogniGraph Trace v1 file on the static graph
uv run cognigraph examples/rag_mcp_vulnerable.yaml --trace examples/runtime_trace_example.json --trace-format internal-json --html-report report.html
# Preview: overlay explicit CogniGraph attributes from OTLP-style JSON spans
uv run cognigraph fixtures/sample_fixture.yaml --trace fixtures/sample_otlp_trace.json --trace-format otlp-json --html-report report.html
# Preview: overlay standard OTel GenAI/MCP semantic-convention spans (no custom attributes needed)
uv run cognigraph collect fixtures/sample_mcp_config.json -o collected.yaml
uv run cognigraph collected.yaml --trace fixtures/sample_genai_trace.json --trace-format otlp-genai --html-report report.html
# Suppress stdout report (useful when only exporting)
uv run cognigraph examples/rag_mcp_vulnerable.yaml --quiet --export-dot graph.dot
# Render the DOT file to PNG (requires Graphviz installed)
dot -Tpng graph.dot -o graph.png

Exit codes: 0 = no findings, 1 = error, 2 = findings detected.
Structured JSON findings include the rule ID, severity, path, entities, and deterministic recommended_control guidance.
Findings are grouped by (rule, target) — e.g. all paths by which low-trust context reaches SecretRead form one group, with the individual paths kept as evidence. Groups are ranked for triage (highest severity, then shortest path, then lowest source trust) in both the text summary and the HTML report's Finding Groups panel, which badges each group as active, mitigated, or suppressed.
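The grouping and triage order described above can be sketched as follows. The finding dictionaries and their field names are illustrative assumptions, not the real findings JSON schema, and the second group (`R002` / `ssh_private_key`) is invented for the example.

```python
from collections import defaultdict

SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1}  # lower value sorts first

# Hypothetical findings; field names are assumptions for illustration.
findings = [
    {"rule": "R001", "target": "SecretRead", "severity": "CRITICAL", "source_trust": 0,
     "path": ["external_webpage", "research_planner", "filesystem_tool", "SecretRead"]},
    {"rule": "R001", "target": "SecretRead", "severity": "CRITICAL", "source_trust": 1,
     "path": ["rag_results", "research_planner", "filesystem_tool", "SecretRead"]},
    {"rule": "R002", "target": "ssh_private_key", "severity": "HIGH", "source_trust": 0,
     "path": ["external_webpage", "research_planner", "filesystem_tool",
              "SecretRead", "ssh_private_key"]},
]


def group_and_rank(items):
    """Group findings by (rule, target) and rank the groups for triage."""
    groups = defaultdict(list)
    for finding in items:
        groups[(finding["rule"], finding["target"])].append(finding)

    def triage_key(entry):
        _, members = entry
        return (
            min(SEVERITY_RANK[m["severity"]] for m in members),  # highest severity first
            min(len(m["path"]) for m in members),                # then shortest path
            min(m["source_trust"] for m in members),             # then lowest source trust
        )

    return sorted(groups.items(), key=triage_key)


for (rule, target), members in group_and_rank(findings):
    print(rule, target, f"{len(members)} path(s) kept as evidence")
```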
Reviewed, accepted risks can be suppressed:
# suppressions.yaml
suppressions:
  - rule_id: R001
    target: SecretRead
    reason: "Vault reads are gated by human approval (ticket SEC-142)"
    expires: 2026-12-31  # optional; an expired entry is an error

uv run cognigraph my_system.yaml --suppressions suppressions.yaml

Suppressed groups are reported under "Accepted Risks" and excluded from the exit-code decision. A suppression that matches no finding group, or has expired, fails with exit 1: stale entries are treated as drift, not noise.
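The drift check on suppressions might look like the sketch below. It is an illustration only: `SuppressionError` and `apply_suppressions` are hypothetical names, and treating an entry as valid through its expiry date (inclusive) is an assumption the README does not confirm.

```python
from datetime import date


class SuppressionError(Exception):
    """Stale suppression; the CLI is documented to exit 1 in this situation."""


def apply_suppressions(group_keys, suppressions, today):
    """Return the suppressed (rule_id, target) keys, or raise on a stale entry."""
    suppressed = set()
    for entry in suppressions:
        key = (entry["rule_id"], entry["target"])
        expires = entry.get("expires")
        # Assumption: the entry remains valid on its expiry date itself.
        if expires is not None and expires < today:
            raise SuppressionError(f"suppression {key} expired on {expires}")
        if key not in group_keys:
            raise SuppressionError(f"suppression {key} matches no finding group")
        suppressed.add(key)
    return suppressed


groups = {("R001", "SecretRead"), ("R002", "ssh_private_key")}
entries = [{"rule_id": "R001", "target": "SecretRead",
            "reason": "Vault reads are gated by human approval (ticket SEC-142)",
            "expires": date(2026, 12, 31)}]
print(apply_suppressions(groups, entries, today=date(2026, 5, 1)))
```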
--fail-on {critical,high,any} (default any) sets the minimum active group severity that causes exit 2, which makes the analyzer usable as a CI gate:
# Fail the build only on critical, unsuppressed findings
uv run cognigraph my_system.yaml --suppressions suppressions.yaml --fail-on critical

This repo's own CI workflow doubles as a template: it keeps a guarded fixture free of dangerous paths and verifies the vulnerable demo still detects.
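As a rough model of the gate, the exit-code decision can be written as a pure function over finding groups. The group shape (`severity`, `status`) is an assumption made for this sketch; only the three `--fail-on` values and the exit codes come from the text above.

```python
# Severities that trip the gate for each --fail-on value (None means any severity).
FAIL_ON = {
    "any": None,
    "high": {"HIGH", "CRITICAL"},
    "critical": {"CRITICAL"},
}


def exit_code(groups, fail_on="any"):
    """Return 2 if an active group meets the --fail-on threshold, else 0."""
    if fail_on not in FAIL_ON:
        raise ValueError(f"unknown --fail-on value: {fail_on!r}")
    allowed = FAIL_ON[fail_on]
    for group in groups:
        if group["status"] != "active":
            continue  # mitigated and suppressed groups never fail the build
        if allowed is None or group["severity"] in allowed:
            return 2
    return 0


groups = [
    {"severity": "CRITICAL", "status": "suppressed"},
    {"severity": "HIGH", "status": "active"},
]
print(exit_code(groups, "critical"), exit_code(groups, "high"))
```

With the critical group suppressed, `--fail-on critical` passes while `--fail-on high` still fails on the remaining active group.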
CogniGraph has a small internal JSON trace format for runtime overlays. This is a preview feature, not a core MVP requirement. External trace systems should map into this format through adapters so the graph, rules, and report code can consume one stable representation.
The default adapter is internal-json, with compatibility aliases internal, cognigraph-trace-v1, and json.
{
"schema": "cognigraph.trace.v1",
"trace_id": "trace-001",
"session_id": "session-abc",
"source": {
"type": "internal-json",
"name": "CogniGraph example"
},
"events": [
{
"id": "event-001",
"timestamp": "2026-05-01T10:00:00Z",
"source_ref": {
"id": "external_webpage",
"type": "ContextSource",
"name": "External webpage"
},
"target_ref": {
"id": "research_planner",
"type": "Agent",
"name": "Research planner"
},
"edge_type": "PASSED_TO",
"status": "ok",
"duration_ms": 18,
"origin": {
"trace_id": "external-trace-id",
"span_id": "external-span-id",
"parent_span_id": "external-parent-span-id"
},
"evidence": {
"operation": "retrieval"
},
"attributes": {},
"metadata": {
"trigger": "rag_retrieval"
}
}
]
}

Events may also use legacy source_id and target_id fields directly. In Trace v1, source_ref.id and target_ref.id are the graph node IDs used by the overlay. Adapters may preserve external trace IDs and span IDs in origin, but they should not silently create graph nodes, capabilities, resources, trust levels, or security findings.
Trace adapter contract:
class TraceAdapter(Protocol):
    format_name: str

    def load(self, path: Path) -> TraceLog:
        ...

The core overlay only consumes TraceLog; source-specific parsing belongs in adapters under cognigraph.trace.adapters.
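To make the contract concrete, here is a toy adapter that satisfies it structurally. The `TraceLog` dataclass is a hypothetical stand-in (the real class lives in CogniGraph and its fields are not shown in this README), and `InternalJsonAdapter` is an illustrative name, not the project's actual adapter.

```python
import json
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Protocol, runtime_checkable


@dataclass
class TraceLog:  # hypothetical stand-in for CogniGraph's TraceLog
    trace_id: str
    events: list = field(default_factory=list)


@runtime_checkable
class TraceAdapter(Protocol):
    format_name: str

    def load(self, path: Path) -> TraceLog: ...


class InternalJsonAdapter:
    """Toy adapter: reads a Trace v1 style JSON file into the stand-in TraceLog."""

    format_name = "internal-json"

    def load(self, path: Path) -> TraceLog:
        data = json.loads(path.read_text(encoding="utf-8"))
        if data.get("schema") != "cognigraph.trace.v1":
            raise ValueError(f"unsupported schema: {data.get('schema')!r}")
        return TraceLog(trace_id=data["trace_id"], events=data.get("events", []))


with tempfile.TemporaryDirectory() as tmp:
    trace_file = Path(tmp) / "trace.json"
    trace_file.write_text(
        json.dumps({"schema": "cognigraph.trace.v1", "trace_id": "trace-001", "events": []}),
        encoding="utf-8",
    )
    adapter = InternalJsonAdapter()
    log = adapter.load(trace_file)

print(isinstance(adapter, TraceAdapter), log.trace_id)
```

Because the protocol is structural, the adapter does not inherit from `TraceAdapter`; it only needs a `format_name` attribute and a compatible `load` method.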
Supported runtime edge types:
| Runtime edge | Intended meaning |
|---|---|
| PASSED_TO | Context or data passed into an agent |
| RETRIEVED_FROM | Context retrieved from a source |
| INVOKED | Agent or tool invoked another tool |
| READ_FROM | Tool read from a resource |
| WROTE_TO | Tool wrote to a resource |
| EXECUTED_IN | Agent or tool executed in an environment |
Overlay semantics:
- Direct runtime events mark a static edge as observed only when the runtime edge type is compatible with that static edge.
- Tool-to-resource runtime events are projected onto declared capability paths when possible. For example, filesystem_tool READ_FROM ssh_private_key maps to filesystem_tool -> SecretRead -> ssh_private_key.
- Events that cannot match or project onto the static graph are reported as unexpected runtime-only edges.
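The three overlay outcomes above can be sketched on a toy edge set. The compatibility map (`INVOKED` matches `CAN_INVOKE`) and the result labels are assumptions for illustration; the README does not spell out the full compatibility matrix.

```python
# Static edges of a toy graph: (source, edge_type, target).
STATIC_EDGES = {
    ("research_planner", "CAN_INVOKE", "filesystem_tool"),
    ("filesystem_tool", "EXPOSES_CAPABILITY", "SecretRead"),
    ("SecretRead", "CAN_ACCESS_RESOURCE", "ssh_private_key"),
}
COMPATIBLE = {"INVOKED": {"CAN_INVOKE"}}   # assumed runtime -> static compatibility
RESOURCE_EVENTS = {"READ_FROM", "WROTE_TO"}


def overlay(source, runtime_type, target):
    """Classify one runtime event as observed, projected, or unexpected."""
    # 1. Direct match against a compatible static edge.
    direct = sorted(
        edge for edge in STATIC_EDGES
        if edge[0] == source and edge[2] == target
        and edge[1] in COMPATIBLE.get(runtime_type, set())
    )
    if direct:
        return "observed", direct
    # 2. Project a tool-to-resource event onto tool -> capability -> resource.
    if runtime_type in RESOURCE_EVENTS:
        for tool, kind, cap in sorted(STATIC_EDGES):
            if tool == source and kind == "EXPOSES_CAPABILITY":
                access = (cap, "CAN_ACCESS_RESOURCE", target)
                if access in STATIC_EDGES:
                    return "projected", [(tool, kind, cap), access]
    # 3. Nothing in the static graph explains the event.
    return "unexpected", []


print(overlay("filesystem_tool", "READ_FROM", "ssh_private_key"))
```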
--trace-format otlp-json reads OpenTelemetry-style JSON span exports and converts spans into CogniGraph trace events only when the span has all three explicit attributes:
| Attribute | Meaning |
|---|---|
| cognigraph.source_id | Existing CogniGraph source node ID |
| cognigraph.target_id | Existing CogniGraph target node ID |
| cognigraph.edge_type | Runtime edge type such as INVOKED, PASSED_TO, READ_FROM, or WROTE_TO |
Unannotated spans are ignored. This adapter does not infer security semantics from arbitrary telemetry; it only normalizes human- or instrumentation-supplied CogniGraph attributes.
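A minimal sketch of that all-or-nothing conversion is shown below. It assumes the usual OTLP JSON attribute encoding (a list of `{"key": ..., "value": {"stringValue": ...}}` entries) and an output event that uses the legacy `source_id`/`target_id` fields; the function names are hypothetical.

```python
REQUIRED = ("cognigraph.source_id", "cognigraph.target_id", "cognigraph.edge_type")


def span_attributes(span):
    """Flatten OTLP JSON string attributes into a plain dict."""
    flat = {}
    for attr in span.get("attributes", []):
        value = attr.get("value", {})
        if "stringValue" in value:
            flat[attr["key"]] = value["stringValue"]
    return flat


def span_to_event(span):
    """Return a trace event dict, or None when any required attribute is missing."""
    attrs = span_attributes(span)
    if not all(attrs.get(name) for name in REQUIRED):
        return None  # unannotated spans are ignored, never guessed at
    return {
        "source_id": attrs["cognigraph.source_id"],
        "target_id": attrs["cognigraph.target_id"],
        "edge_type": attrs["cognigraph.edge_type"],
        "origin": {"trace_id": span.get("traceId"), "span_id": span.get("spanId")},
    }


annotated = {
    "traceId": "t1", "spanId": "s1",
    "attributes": [
        {"key": "cognigraph.source_id", "value": {"stringValue": "research_planner"}},
        {"key": "cognigraph.target_id", "value": {"stringValue": "filesystem_tool"}},
        {"key": "cognigraph.edge_type", "value": {"stringValue": "INVOKED"}},
    ],
}
partial = {"traceId": "t1", "spanId": "s2", "attributes": annotated["attributes"][:2]}
print(span_to_event(annotated))
print(span_to_event(partial))
```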
--trace-format otlp-genai (alias genai) reads the same OTLP JSON span exports but requires no CogniGraph-specific attributes. It maps spans that follow the OpenTelemetry GenAI agent and MCP semantic conventions:
| Span signal | Mapped to |
|---|---|
| gen_ai.operation.name = execute_tool or mcp.method.name = tools/call (or span name tools/call <tool>) | An INVOKED runtime event |
| gen_ai.tool.name / mcp.tool.name / the tools/call <tool> span name | The invoked tool's node ID |
| gen_ai.agent.name (or gen_ai.agent.id) on the span or its nearest ancestor span | The invoking agent's node ID |
Names are normalized the same way cognigraph collect normalizes server names, so spans emitted by an instrumented MCP host (e.g. FastMCP's built-in OTel telemetry) resolve against collected fixtures without manual mapping. Tool-call spans without a resolvable agent are skipped, and span content (prompts, messages, arguments) is never copied into trace events. Unmatched events degrade to runtime-only edges, as with any adapter.
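The nearest-ancestor lookup for the invoking agent can be sketched as a walk up the `parentSpanId` chain. This simplified version assumes OTLP JSON string attributes, skips the span-name fallback and the name normalization mentioned above, and uses hypothetical function names.

```python
def attr(span, key):
    """Read one string attribute from an OTLP JSON span, or None."""
    for item in span.get("attributes", []):
        if item["key"] == key:
            return item.get("value", {}).get("stringValue")
    return None


def resolve_agent(span, spans_by_id):
    """Walk from the span up through its ancestors until an agent name or ID is found."""
    seen = set()
    current = span
    while current is not None and current["spanId"] not in seen:
        seen.add(current["spanId"])  # guards against malformed parent cycles
        agent = attr(current, "gen_ai.agent.name") or attr(current, "gen_ai.agent.id")
        if agent:
            return agent
        current = spans_by_id.get(current.get("parentSpanId"))
    return None


def tool_call_event(span, spans_by_id):
    """Map a tool-call span to an INVOKED event; skip it if no agent resolves."""
    is_call = (attr(span, "gen_ai.operation.name") == "execute_tool"
               or attr(span, "mcp.method.name") == "tools/call")
    tool = attr(span, "gen_ai.tool.name") or attr(span, "mcp.tool.name")
    agent = resolve_agent(span, spans_by_id)
    if not (is_call and tool and agent):
        return None
    return {"source_id": agent, "target_id": tool, "edge_type": "INVOKED"}


def s(span_id, parent, **attrs):
    """Build a tiny OTLP-style span for the example."""
    return {"spanId": span_id, "parentSpanId": parent,
            "attributes": [{"key": k.replace("__", "."), "value": {"stringValue": v}}
                           for k, v in attrs.items()]}


spans = [
    s("a", None, gen_ai__agent__name="research_planner"),
    s("b", "a", gen_ai__operation__name="execute_tool", gen_ai__tool__name="filesystem_tool"),
    s("c", None, mcp__method__name="tools/call", mcp__tool__name="github_tool"),
]
by_id = {span["spanId"]: span for span in spans}
print([tool_call_event(span, by_id) for span in spans])
```

Span `b` inherits its agent from parent `a`; span `c` is a tool call with no resolvable agent and is skipped.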
Both OTel convention families are still marked Development; the adapter pins the attribute names listed above and will be versioned deliberately if the conventions change.
Instead of hand-writing a fixture, cognigraph collect reads the MCP config your client already uses and emits a fixture skeleton:
# From a Claude Desktop / Claude Code / Cursor config (mcpServers key)
uv run cognigraph collect ~/path/to/claude_desktop_config.json -o my_system.yaml
# From a VS Code config (servers key, JSONC tolerated)
uv run cognigraph collect .vscode/mcp.json -o my_system.yaml
# Try it with the bundled sample config
uv run cognigraph collect fixtures/sample_mcp_config.json -o my_system.yaml

The collector emits exactly what the config proves, plus minimal scaffolding:
- one MCPServer node per configured server
- one stub tool per server (<server>_tool, empty capabilities) with the command or URL recorded in its description
- one host_agent (trust 2) that consumes user_input (trust 1) and can invoke every tool, modeling the real topology of an MCP client app
- the standard capability taxonomy, seeded so annotation files validate immediately (disable with --no-seed-capabilities)
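The skeleton described by this list can be approximated in a few lines. This is a sketch under assumptions: the `mcp_servers` fixture key, the exact description format, and the function name are guesses, and both server-name normalization and capability seeding are left out.

```python
def build_skeleton(config):
    """Derive a fixture skeleton from an MCP client config dict."""
    # Claude/Cursor style configs use "mcpServers"; VS Code uses "servers".
    servers = config.get("mcpServers") or config.get("servers") or {}
    server_nodes, tools = [], []
    for name, spec in servers.items():
        # Record how the server is launched or reached, for human review.
        launch = spec.get("url") or " ".join(
            [spec.get("command", ""), *spec.get("args", [])]
        ).strip()
        server_nodes.append({"id": name})
        tools.append({"id": f"{name}_tool", "mcp_server": name,
                      "description": launch, "capabilities": []})
    return {
        "context_sources": [
            {"id": "user_input", "source_type": "user_input", "trust_level": 1}
        ],
        "agents": [{"id": "host_agent", "trust_level": 2,
                    "consumes": ["user_input"],
                    "can_invoke": [tool["id"] for tool in tools]}],
        "mcp_servers": server_nodes,  # assumed key name
        "tools": tools,
    }


config = {"mcpServers": {
    "filesystem": {"command": "npx",
                   "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]},
    "docs": {"url": "https://example.invalid/mcp"},
}}
skeleton = build_skeleton(config)
print(skeleton["agents"][0]["can_invoke"])
```

Every stub tool starts with an empty capability list, which is what keeps capability semantics a human decision.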
Configs declare servers, not tools, so the default skeleton under-models servers that expose many tools. With --introspect, the collector connects to each configured server, calls tools/list, and emits one tool node per real tool (<server>_<tool>) with the server-provided name and description — which feed directly into --annotations and --infer-capabilities:
uv sync --extra introspect # or: pip install 'cognigraph[introspect]'
uv run cognigraph collect my_mcp_config.json --introspect -o my_system.yaml
uv run cognigraph my_system.yaml --infer-capabilities

Be deliberate about what you point this at: introspecting a stdio server spawns its configured command as a process, and introspecting a remote server contacts its URL. It is never the default, and it requires the optional mcp dependency (the introspect extra). A server that cannot be reached within --introspect-timeout seconds (default 10) degrades to its stub tool with a warning, so one dead server never sinks the collection. Introspection enumerates structure only (tool names and descriptions) and still never invents capability semantics.
The collector never invents capability semantics — that mapping stays human-reviewed. Complete the workflow with annotations:
# 1. Collect the skeleton
uv run cognigraph collect fixtures/sample_mcp_config.json -o my_system.yaml
# 2. Declare what each tool can actually do (or preview with --infer-capabilities)
cat > my_annotations.yaml <<'EOF'
tool_capability_annotations:
  filesystem_tool:
    capabilities:
      - SecretRead
      - FilesystemRead
  github_tool:
    capabilities:
      - GitHubPush
      - ExternalNetworkSend
EOF
# 3. Analyze
uv run cognigraph my_system.yaml --annotations my_annotations.yaml --html-report report.html

collect exits 0 on success and 1 on error; it never produces findings itself.
A fixture can be written by hand or collected from a real MCP client config (see the cognigraph collect workflow above). Here's a minimal hand-written example:
analysis:
  max_tool_invocation_depth: 5
  max_path_length: 8
context_sources:
  - id: user_input
    source_type: user_input
    trust_level: 1
agents:
  - id: assistant
    trust_level: 2
    consumes:
      - user_input
    can_invoke:
      - search_tool
tools:
  - id: search_tool
    capabilities:
      - ExternalNetworkSend
capabilities:
  - id: ExternalNetworkSend
    severity: 3

See fixtures/sample_fixture.yaml for a full example with MCP servers, resources, and capability bindings.
Tool capabilities can be declared inline in the fixture or supplied through a separate annotation file. The annotation workflow is useful when the base fixture comes from a tool or MCP inventory and security capability mapping should remain human-reviewed.
tool_capability_annotations:
  filesystem_tool:
    capabilities:
      - SecretRead
      - FilesystemRead
  github_tool:
    capabilities:
      - GitHubPush
      - ExternalNetworkSend

Run it with:
uv run cognigraph examples/rag_mcp_unannotated.yaml --annotations examples/manual_tool_annotations.yaml

Annotations are deterministic overlays. Unknown tools, unknown capabilities, and missing required resource bindings are rejected during fixture validation.
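The overlay-and-reject behavior can be sketched with plain dictionaries. This is an illustration, not the project's validator: `apply_annotations` is a hypothetical name, the fixture is modified in place for brevity, and the resource-binding check is omitted because the binding schema is not shown here.

```python
def apply_annotations(fixture, annotations):
    """Overlay tool capabilities onto a fixture dict, rejecting unknown names."""
    tools = {tool["id"]: tool for tool in fixture["tools"]}
    known_caps = {cap["id"] for cap in fixture["capabilities"]}
    for tool_id, spec in annotations["tool_capability_annotations"].items():
        if tool_id not in tools:
            raise ValueError(f"annotation references unknown tool {tool_id!r}")
        current = tools[tool_id].setdefault("capabilities", [])
        for cap in spec["capabilities"]:
            if cap not in known_caps:
                raise ValueError(f"annotation references unknown capability {cap!r}")
            if cap not in current:
                current.append(cap)  # applying the same file twice changes nothing
    return fixture


fixture = {
    "tools": [{"id": "filesystem_tool"}],
    "capabilities": [{"id": "SecretRead", "severity": 4},
                     {"id": "FilesystemRead", "severity": 2}],
}
annotations = {"tool_capability_annotations": {
    "filesystem_tool": {"capabilities": ["SecretRead", "FilesystemRead"]},
}}
print(apply_annotations(fixture, annotations)["tools"])
```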
--infer-capabilities applies simple keyword rules to tool IDs and optional tool descriptions:
tools:
  - id: filesystem_tool
    description: Read files and secrets from the local filesystem.

The mapper only adds capabilities already declared in the fixture. It does not create new capability definitions and does not use an LLM. Treat it as a convenience layer after manual annotations, not as a source of authority for the security graph.
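A deterministic keyword mapper of this kind could look like the sketch below. The keyword table is entirely hypothetical (the real rules are not listed in this README); the point is only that matching is plain substring lookup and that undeclared capabilities are never added.

```python
# Hypothetical keyword rules; CogniGraph's actual rule table may differ.
KEYWORD_RULES = {
    "SecretRead": ("secret", "credential", "vault"),
    "FilesystemRead": ("file", "filesystem"),
    "ExternalNetworkSend": ("http", "upload", "webhook"),
}


def infer_capabilities(tool, declared):
    """Return declared capabilities whose keywords appear in the tool ID or description."""
    text = f"{tool['id']} {tool.get('description', '')}".lower()
    return sorted(
        cap for cap, keywords in KEYWORD_RULES.items()
        if cap in declared and any(word in text for word in keywords)
    )


tool = {"id": "filesystem_tool",
        "description": "Read files and secrets from the local filesystem."}
print(infer_capabilities(tool, declared={"SecretRead", "FilesystemRead"}))
print(infer_capabilities(tool, declared={"FilesystemRead"}))
```

In the second call `SecretRead` matches the description but is dropped because the fixture does not declare it.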
| Type | Description | Key Attributes |
|---|---|---|
| ContextSource | Input entering an agent (user input, web content, RAG results) | trust_level (0-4), source_type |
| Agent | LLM planner or orchestrator | trust_level (0-4) |
| Tool | Action an agent can invoke | mcp_server (optional) |
| MCPServer | MCP server backing one or more tools | |
| Capability | Privileged action (shell exec, secret read, network send) | severity (1-4) |
| Resource | Target object (SSH key, database, repository) | sensitivity (1-4) |
| Policy | Approval/control boundary applied to agents, tools, or servers | effect (mitigate/downgrade) |
| Level | Label | Example |
|---|---|---|
| 0 | Untrusted | External webpage, unknown API response |
| 1 | Low | User input, RAG retrieval result |
| 2 | Medium | Internal agent, verified memory |
| 3 | High | Signed internal data |
| 4 | Privileged | Local filesystem, system config |
trust_level is optional on context sources. When omitted, it defaults by source_type: webpage and external_api → 0, retrieval and user_input → 1, memory → 2. An explicit value always wins.
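The defaulting rule is small enough to state as code. The mapping below is taken from the sentence above; the function name is hypothetical, and how the real loader treats an unlisted source_type is not documented here, so this sketch simply raises.

```python
DEFAULT_TRUST = {
    "webpage": 0,
    "external_api": 0,
    "retrieval": 1,
    "user_input": 1,
    "memory": 2,
}


def effective_trust(source):
    """Return the explicit trust_level if present, else the source_type default."""
    explicit = source.get("trust_level")
    if explicit is not None:  # an explicit 0 must win too, so test against None
        return explicit
    try:
        return DEFAULT_TRUST[source["source_type"]]
    except KeyError:
        # Assumption: an unknown type with no explicit level is an error.
        raise ValueError(f"no default trust for {source.get('source_type')!r}") from None


print(effective_trust({"id": "external_webpage", "source_type": "webpage"}))
print(effective_trust({"id": "vetted_page", "source_type": "webpage", "trust_level": 3}))
```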
Rule thresholds and the dangerous-pair list are configurable through an optional policy block (defaults shown — omitting the block keeps current behavior exactly):
policy:
  critical_severity: 3       # capability severity >= this is "critical" (R001, R004, R005)
  sensitive_sensitivity: 3   # resource sensitivity >= this is "sensitive" (R002)
  low_trust_max: 1           # trust_level <= this is "low trust" (R001, R002, R005)
  dangerous_pairs:           # capability pairs R003 looks for
    - [SecretRead, ExternalNetworkSend]
    - [FilesystemRead, EmailSend]
    - [ShellExecution, ExternalNetworkSend]
    - [GitHubRead, GitHubPush]
    - [BrowserAutomation, CredentialAccess]

Both the in-memory engine and the Neo4j Cypher rules honor the policy block.
Approval gates, human-in-the-loop reviews, and similar controls are modeled as Policy nodes applied to agents, tools, or MCP servers — so the analyzer can see its own recommended remediations:
policies:
  - id: fs_approval
    applies_to: [filesystem_tool]   # agents, tools, or MCP servers
    effect: mitigate                # mitigate (default) | downgrade
    description: Human approval required before filesystem access

- With effect: mitigate, findings whose risk is fully gated are reported under "Mitigated by Policy" and excluded from the exit code, like accepted risks.
- With effect: downgrade, findings stay active but drop one severity level (useful with --fail-on).
Mitigation is sound, not path-cosmetic: a capability only counts as gated when every reachable tool exposing it is covered by a policy (or the agent itself is gated). One unprotected alternative path keeps the finding group active. A policy on an MCP server extends to all tools it backs.
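The "every exposing tool must be covered" rule can be sketched as follows. This simplified model only considers tools the agent invokes directly (no tool-to-tool chains), only counts policies with the mitigate effect, and introduces a hypothetical `backup_tool` to show an unprotected alternative path.

```python
def is_gated(agent, capability, fixture):
    """True when every tool through which the agent reaches the capability is covered."""
    covered = set()
    for policy in fixture.get("policies", []):
        if policy.get("effect", "mitigate") == "mitigate":
            covered.update(policy["applies_to"])
    if agent["id"] in covered:
        return True  # the agent itself is gated
    tools = {tool["id"]: tool for tool in fixture["tools"]}
    exposing = [tools[tool_id] for tool_id in agent["can_invoke"]
                if capability in tools[tool_id].get("capabilities", [])]
    # A policy on an MCP server extends to every tool that server backs.
    return bool(exposing) and all(
        tool["id"] in covered or tool.get("mcp_server") in covered
        for tool in exposing
    )


agent = {"id": "research_planner", "can_invoke": ["filesystem_tool", "backup_tool"]}
fixture = {
    "tools": [
        {"id": "filesystem_tool", "capabilities": ["SecretRead"]},
        {"id": "backup_tool", "capabilities": ["SecretRead"], "mcp_server": "backup_server"},
    ],
    "policies": [{"id": "fs_approval", "applies_to": ["filesystem_tool"]}],
}
print(is_gated(agent, "SecretRead", fixture))  # backup_tool is still unprotected

fixture["policies"].append({"id": "backup_approval", "applies_to": ["backup_server"]})
print(is_gated(agent, "SecretRead", fixture))  # both routes are now covered
```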
Edges are derived from the fixture's consumes, can_invoke, capabilities, mcp_server, and capability_bindings fields:
ContextSource -[CONSUMED_BY]-> Agent
Agent -[CAN_INVOKE]-> Tool
Tool -[CAN_INVOKE]-> Tool
Tool -[EXPOSES_CAPABILITY]-> Capability
Capability -[CAN_ACCESS_RESOURCE]-> Resource
Tool -[USES_SERVER]-> MCPServer
Policy -[APPLIES_TO]-> Agent | Tool | MCPServer
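As an illustration of how these edges fall out of the fixture fields, the sketch below derives them from the minimal hand-written fixture shown earlier, expressed as a Python dict. It covers only `consumes`, `can_invoke`, `capabilities`, and `mcp_server`; tool-to-tool invocation, `capability_bindings`, and policies are omitted because their exact shapes are not shown in this README.

```python
def derive_edges(fixture):
    """Return (source, edge_type, target) triples implied by a fixture dict."""
    edges = []
    for agent in fixture.get("agents", []):
        for source in agent.get("consumes", []):
            edges.append((source, "CONSUMED_BY", agent["id"]))
        for tool_id in agent.get("can_invoke", []):
            edges.append((agent["id"], "CAN_INVOKE", tool_id))
    for tool in fixture.get("tools", []):
        for capability in tool.get("capabilities", []):
            edges.append((tool["id"], "EXPOSES_CAPABILITY", capability))
        if tool.get("mcp_server"):
            edges.append((tool["id"], "USES_SERVER", tool["mcp_server"]))
    return edges


minimal_fixture = {
    "context_sources": [{"id": "user_input", "source_type": "user_input", "trust_level": 1}],
    "agents": [{"id": "assistant", "trust_level": 2,
                "consumes": ["user_input"], "can_invoke": ["search_tool"]}],
    "tools": [{"id": "search_tool", "capabilities": ["ExternalNetworkSend"]}],
    "capabilities": [{"id": "ExternalNetworkSend", "severity": 3}],
}
for source, edge_type, target in derive_edges(minimal_fixture):
    print(f"{source} -[{edge_type}]-> {target}")
```

The three edges form the path user_input -> assistant -> search_tool -> ExternalNetworkSend, which is the kind of path a rule such as R001 would inspect, since the source has trust 1 and the capability has severity 3.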
| Rule | Triggers When |
|---|---|
| R001 | Low-trust context (trust <= 1) can reach a capability with severity >= 3 |
| R002 | Low-trust context can reach a resource with sensitivity >= 3 |
| R003 | A single agent can reach a dangerous capability pair (e.g. SecretRead + ExternalNetworkSend) |
| R004 | An MCP server backs critical tools invokable by more than N agents (default 3) |
| R005 | Low-trust context enters a higher-trust agent (trust >= 2) that can reach a critical capability |
- SecretRead + ExternalNetworkSend
- FilesystemRead + EmailSend
- ShellExecution + ExternalNetworkSend
- GitHubRead + GitHubPush
- BrowserAutomation + CredentialAccess
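Checking an agent against these pairs reduces to a subset test on the set of capabilities it can reach. The sketch below assumes that set has already been computed by a graph traversal; `dangerous_pairs_for` is a hypothetical helper name.

```python
DEFAULT_PAIRS = [
    ("SecretRead", "ExternalNetworkSend"),
    ("FilesystemRead", "EmailSend"),
    ("ShellExecution", "ExternalNetworkSend"),
    ("GitHubRead", "GitHubPush"),
    ("BrowserAutomation", "CredentialAccess"),
]


def dangerous_pairs_for(reachable_capabilities, pairs=DEFAULT_PAIRS):
    """Return every configured pair fully contained in the agent's reachable set."""
    reachable = set(reachable_capabilities)
    return [pair for pair in pairs if set(pair) <= reachable]


# Capabilities one agent can reach (assumed to come from a prior traversal).
reachable = {"SecretRead", "FilesystemRead", "ExternalNetworkSend", "GitHubPush"}
print(dangerous_pairs_for(reachable))
```

`GitHubPush` alone does not trigger the GitHub pair because `GitHubRead` is missing from the reachable set.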
For interactive graph exploration and Cypher queries:
# Start Neo4j
docker compose up -d
# Run Neo4j integration tests
uv run pytest -m neo4j
# Run all tests (in-memory + Neo4j)
uv run pytest

Neo4j is available at http://localhost:7474 (credentials: neo4j / cognigraph).
The Cypher query engine mirrors all five detection rules, so you get identical findings whether using the in-memory engine or Neo4j.
src/cognigraph/
  schemas/      # Pydantic models: nodes, edges, enums, findings
  fixture/      # YAML fixture loading and validation
  graph/        # In-memory graph builder (NetworkX)
  rules/        # Detection rules engine
  neo4j/        # Neo4j client and Cypher detection queries
  export.py     # DOT and JSON graph export
  report.py     # CLI finding report formatter
  cli.py        # CLI entry point
fixtures/
  sample_fixture.yaml
tests/
# Full suite (skips Neo4j tests if container is not running)
uv run pytest
# With coverage
uv run pytest --cov=cognigraph --cov-report=term-missing

The default coverage gate measures the core in-memory MVP path and omits the optional Neo4j adapter, whose tests require a running Neo4j container.
Copyright 2026 Naveen Prakaasham Vairaprakasam
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.