CogniGraph

Graph-native capability reachability analysis for agentic AI systems.

Agentic AI systems compose LLM planners, MCP servers, tools, retrieval systems, and external APIs into complex capability topologies. CogniGraph models these as a directed graph and answers a single question:

What dangerous capabilities can low-trust context reach?

It takes a YAML fixture describing your system's agents, tools, MCP servers, capabilities, and resources, builds a privilege graph, and runs detection rules to find dangerous paths, like untrusted web content reaching secret access, or a single agent having both secret-read and network-send capabilities.

Prerequisites

  • Python 3.11+
  • uv (Python package manager)
  • Docker (optional, for Neo4j integration)

Getting Started

# Clone and install
git clone <repo-url> && cd CogniGraph
uv sync

# Run the five-minute MVP demo
uv run cognigraph examples/rag_mcp_vulnerable.yaml --html-report report.html --findings-json findings.json

This loads a vulnerable RAG/MCP fixture, builds the graph, runs all detection rules, writes a static HTML report, and exports structured JSON findings. Open report.html to inspect the risky paths, node metadata, and recommended controls.

The CLI also prints a text report:

CogniGraph Analysis Report
==================================================
Total findings: 11
  CRITICAL: 7
  HIGH: 4
==================================================

--- Finding 1 ---
[R001] [CRITICAL] Low-trust context reaches critical capability
  Low-trust context 'external_webpage' can reach capability 'SecretRead' (severity 4)
  Path: external_webpage -> research_planner -> filesystem_tool -> SecretRead
  Recommended control: Restrict low-trust context from reaching this capability...
...

MVP Examples

The examples/ directory contains the core demo fixtures:

| Fixture | Purpose | Expected result |
| --- | --- | --- |
| examples/rag_mcp_vulnerable.yaml | Low-trust web/RAG context reaches secret, repository, and network capabilities | R001, R002, R003, and R005 findings |
| examples/rag_mcp_unannotated.yaml + examples/manual_tool_annotations.yaml | Same topology as the vulnerable demo, with tool capabilities supplied through a separate annotation file | Same findings as the inline vulnerable fixture |
| examples/least_privilege_safe.yaml | Low-trust input can only reach low-severity documentation search over low-sensitivity public docs | No findings |
| examples/overexposed_mcp.yaml | Three agents can invoke one MCP-backed critical capability with threshold set to 3 | R004 finding |

A safe fixture means no low-trust context can reach severity >= 3 capabilities, no low-trust context can reach sensitivity >= 3 resources, no agent can compose a configured dangerous capability pair, no MCP server exceeds the overexposure threshold, and no higher-trust agent consuming low-trust context has downstream critical capability access.

CLI Usage

# Print findings report to stdout
uv run cognigraph examples/rag_mcp_vulnerable.yaml

# Export graph as Graphviz DOT (finding paths highlighted in red)
uv run cognigraph examples/rag_mcp_vulnerable.yaml --export-dot graph.dot

# Export graph as JSON
uv run cognigraph examples/rag_mcp_vulnerable.yaml --export-json graph.json

# Export findings as structured JSON
uv run cognigraph examples/rag_mcp_vulnerable.yaml --findings-json findings.json

# Apply manual tool capability annotations to a fixture
uv run cognigraph examples/rag_mcp_unannotated.yaml --annotations examples/manual_tool_annotations.yaml

# Preview: infer tool capabilities from tool IDs/descriptions with deterministic keyword rules
uv run cognigraph examples/rag_mcp_unannotated.yaml --infer-capabilities

# Export a static HTML report with finding paths and node metadata
uv run cognigraph examples/rag_mcp_vulnerable.yaml --html-report report.html

# Preview: overlay a CogniGraph Trace v1 file on the static graph
uv run cognigraph examples/rag_mcp_vulnerable.yaml --trace examples/runtime_trace_example.json --trace-format internal-json --html-report report.html

# Preview: overlay explicit CogniGraph attributes from OTLP-style JSON spans
uv run cognigraph fixtures/sample_fixture.yaml --trace fixtures/sample_otlp_trace.json --trace-format otlp-json --html-report report.html

# Preview: overlay standard OTel GenAI/MCP semantic-convention spans (no custom attributes needed)
uv run cognigraph collect fixtures/sample_mcp_config.json -o collected.yaml
uv run cognigraph collected.yaml --trace fixtures/sample_genai_trace.json --trace-format otlp-genai --html-report report.html

# Suppress stdout report (useful when only exporting)
uv run cognigraph examples/rag_mcp_vulnerable.yaml --quiet --export-dot graph.dot

# Render the DOT file to PNG (requires Graphviz installed)
dot -Tpng graph.dot -o graph.png

Exit codes: 0 = no findings, 1 = error, 2 = findings detected.

Structured JSON findings include the rule ID, severity, path, entities, and deterministic recommended_control guidance.
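
A wrapper script can branch on the exit codes listed above. The following is a minimal sketch, not part of CogniGraph: the helper names describe_exit_code and run_cognigraph are hypothetical, and only the uv run cognigraph invocation, the --quiet flag, and the three exit codes come from this README.

```python
import subprocess

# Exit-code meanings as documented in this README.
EXIT_MEANINGS = {0: "no findings", 1: "error", 2: "findings detected"}


def describe_exit_code(code: int) -> str:
    """Translate a cognigraph exit code into a short status string."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_cognigraph(fixture: str) -> str:
    """Run the analyzer quietly and report what its exit code means."""
    # check=False because exit code 2 is a normal outcome, not a crash.
    result = subprocess.run(
        ["uv", "run", "cognigraph", fixture, "--quiet"], check=False
    )
    return describe_exit_code(result.returncode)
```

Treating 2 as a regular result rather than an exception keeps "findings detected" distinct from a genuine analyzer error (1).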

Finding Groups, Suppressions, and CI Gating

Findings are grouped by (rule, target) — e.g. all paths by which low-trust context reaches SecretRead form one group, with the individual paths kept as evidence. Groups are ranked for triage (highest severity, then shortest path, then lowest source trust) in both the text summary and the HTML report's Finding Groups panel, which badges each group as active, mitigated, or suppressed.
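
The grouping and triage order described above can be sketched as follows. This is an illustration under assumptions, not the analyzer's code: the finding field names (rule, target, severity, path, source_trust) and the http_tool node are hypothetical.

```python
from collections import defaultdict

# Hypothetical finding records; field names are illustrative only.
findings = [
    {"rule": "R001", "target": "ExternalNetworkSend", "severity": 3,
     "path": ["external_webpage", "research_planner", "http_tool",
              "ExternalNetworkSend"],
     "source_trust": 0},
    {"rule": "R001", "target": "SecretRead", "severity": 4,
     "path": ["external_webpage", "research_planner", "filesystem_tool",
              "SecretRead"],
     "source_trust": 0},
    {"rule": "R001", "target": "SecretRead", "severity": 4,
     "path": ["user_input", "research_planner", "filesystem_tool",
              "SecretRead"],
     "source_trust": 1},
]


def group_findings(items):
    """Group findings by (rule, target); each path stays as evidence."""
    groups = defaultdict(list)
    for finding in items:
        groups[(finding["rule"], finding["target"])].append(finding)
    return dict(groups)


def triage_key(group):
    """Highest severity first, then shortest path, then lowest source trust."""
    return (
        -max(f["severity"] for f in group),
        min(len(f["path"]) for f in group),
        min(f["source_trust"] for f in group),
    )


ranked = sorted(group_findings(findings).items(),
                key=lambda item: triage_key(item[1]))
for (rule, target), evidence in ranked:
    print(rule, target, f"{len(evidence)} path(s)")
```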

Reviewed, accepted risks can be suppressed:

# suppressions.yaml
suppressions:
  - rule_id: R001
    target: SecretRead
    reason: "Vault reads are gated by human approval (ticket SEC-142)"
    expires: 2026-12-31   # optional; an expired entry is an error

uv run cognigraph my_system.yaml --suppressions suppressions.yaml

Suppressed groups are reported under "Accepted Risks" and excluded from the exit-code decision. A suppression that matches no finding group, or has expired, fails with exit 1 — stale entries are treated as drift, not noise.
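
The drift checks above can be sketched as a small validation pass. This is a hedged illustration: apply_suppressions is a hypothetical helper, and it assumes an entry counts as expired once today's date is later than its expires value.

```python
from datetime import date


def apply_suppressions(group_keys, suppressions, today):
    """Return the suppressed (rule_id, target) keys, or raise on stale entries."""
    suppressed = set()
    for entry in suppressions:
        key = (entry["rule_id"], entry["target"])
        expires = entry.get("expires")  # optional field
        # Assumption: an entry is expired once today is past its expiry date.
        if expires is not None and expires < today:
            raise ValueError(f"suppression {key} expired on {expires}")
        # A suppression that matches no finding group is treated as drift.
        if key not in group_keys:
            raise ValueError(f"suppression {key} matches no finding group")
        suppressed.add(key)
    return suppressed
```

A caller would map either ValueError to exit code 1, matching the behavior described above.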

--fail-on {critical,high,any} (default any) sets the minimum active group severity that causes exit 2, which makes the analyzer usable as a CI gate:

# Fail the build only on critical, unsuppressed findings
uv run cognigraph my_system.yaml --suppressions suppressions.yaml --fail-on critical
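
The gating decision can be pictured roughly like this. It is a sketch under assumptions: gate_exit_code and the group fields are hypothetical, the LOW and MEDIUM labels are guesses (only HIGH and CRITICAL appear in the sample report), and "any" is assumed to mean any active group regardless of severity.

```python
# Assumed severity ordering; only HIGH and CRITICAL are confirmed by the README.
SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}

# Minimum rank that trips the gate for each --fail-on value.
FAIL_ON_THRESHOLD = {"any": 0, "high": 3, "critical": 4}


def gate_exit_code(groups, fail_on="any"):
    """Return 2 if any active group meets the --fail-on threshold, else 0."""
    threshold = FAIL_ON_THRESHOLD[fail_on]
    for group in groups:
        # Suppressed and mitigated groups never affect the exit code.
        if group["status"] != "active":
            continue
        if SEVERITY_RANK.get(group["severity"], 1) >= threshold:
            return 2
    return 0
```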

This repo's own CI workflow doubles as a template: it keeps a guarded fixture free of dangerous paths and verifies the vulnerable demo still detects.

Runtime Trace Overlay Preview

CogniGraph has a small internal JSON trace format for runtime overlays. This is a preview feature, not a core MVP requirement. External trace systems should map into this format through adapters so the graph, rules, and report code can consume one stable representation.

The default adapter is internal-json, with compatibility aliases internal, cognigraph-trace-v1, and json.

{
  "schema": "cognigraph.trace.v1",
  "trace_id": "trace-001",
  "session_id": "session-abc",
  "source": {
    "type": "internal-json",
    "name": "CogniGraph example"
  },
  "events": [
    {
      "id": "event-001",
      "timestamp": "2026-05-01T10:00:00Z",
      "source_ref": {
        "id": "external_webpage",
        "type": "ContextSource",
        "name": "External webpage"
      },
      "target_ref": {
        "id": "research_planner",
        "type": "Agent",
        "name": "Research planner"
      },
      "edge_type": "PASSED_TO",
      "status": "ok",
      "duration_ms": 18,
      "origin": {
        "trace_id": "external-trace-id",
        "span_id": "external-span-id",
        "parent_span_id": "external-parent-span-id"
      },
      "evidence": {
        "operation": "retrieval"
      },
      "attributes": {},
      "metadata": {
        "trigger": "rag_retrieval"
      }
    }
  ]
}

Events may also use legacy source_id and target_id fields directly. In Trace v1, source_ref.id and target_ref.id are the graph node IDs used by the overlay. Adapters may preserve external trace IDs and span IDs in origin, but they should not silently create graph nodes, capabilities, resources, trust levels, or security findings.

Trace adapter contract:

class TraceAdapter(Protocol):
    format_name: str

    def load(self, path: Path) -> TraceLog:
        ...

The core overlay only consumes TraceLog; source-specific parsing belongs in adapters under cognigraph.trace.adapters.
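
As an illustration of the contract, here is a minimal adapter sketch. The TraceLog dataclass is a stand-in defined only for this example (the real class may have different fields), and InternalJsonAdapter is a hypothetical name, not the shipped implementation.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Protocol


@dataclass
class TraceLog:
    """Stand-in for CogniGraph's TraceLog; the real fields may differ."""
    trace_id: str
    events: list[dict] = field(default_factory=list)


class TraceAdapter(Protocol):
    format_name: str

    def load(self, path: Path) -> TraceLog: ...


class InternalJsonAdapter:
    """Hypothetical adapter that reads the Trace v1 JSON shown above."""
    format_name = "internal-json"

    def load(self, path: Path) -> TraceLog:
        doc = json.loads(path.read_text(encoding="utf-8"))
        if doc.get("schema") != "cognigraph.trace.v1":
            raise ValueError(f"unsupported trace schema: {doc.get('schema')!r}")
        # Events are passed through untouched; no graph nodes are created here.
        return TraceLog(trace_id=doc["trace_id"],
                        events=list(doc.get("events", [])))
```

Because TraceAdapter is a Protocol, an adapter only has to match its shape (a format_name and a load method); it does not need to inherit from anything.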

Supported runtime edge types:

| Runtime edge | Intended meaning |
| --- | --- |
| PASSED_TO | Context or data passed into an agent |
| RETRIEVED_FROM | Context retrieved from a source |
| INVOKED | Agent or tool invoked another tool |
| READ_FROM | Tool read from a resource |
| WROTE_TO | Tool wrote to a resource |
| EXECUTED_IN | Agent or tool executed in an environment |

Overlay semantics:

  • Direct runtime events mark a static edge as observed only when the runtime edge type is compatible with that static edge.
  • Tool-to-resource runtime events are projected onto declared capability paths when possible. For example, filesystem_tool READ_FROM ssh_private_key maps to filesystem_tool -> SecretRead -> ssh_private_key.
  • Events that cannot match or project onto the static graph are reported as unexpected runtime-only edges.
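
The three overlay outcomes above can be sketched as a single classification function. This is illustrative only: the static edge table is a hypothetical slice of a fixture, the COMPATIBLE map is an assumed subset of the real compatibility rules, and events use the legacy source_id / target_id fields for brevity.

```python
# Static edges from a hypothetical fixture: (source, target) -> static edge type.
STATIC_EDGES = {
    ("external_webpage", "research_planner"): "CONSUMED_BY",
    ("research_planner", "filesystem_tool"): "CAN_INVOKE",
    ("filesystem_tool", "SecretRead"): "EXPOSES_CAPABILITY",
    ("SecretRead", "ssh_private_key"): "CAN_ACCESS_RESOURCE",
}

# Assumed runtime-to-static compatibility; the real mapping may be broader.
COMPATIBLE = {"PASSED_TO": "CONSUMED_BY", "INVOKED": "CAN_INVOKE"}


def classify_event(event):
    """Classify one runtime event as observed, projected, or unexpected."""
    src, dst, kind = event["source_id"], event["target_id"], event["edge_type"]

    # 1. Direct match against a compatible static edge.
    static_type = STATIC_EDGES.get((src, dst))
    if static_type is not None and COMPATIBLE.get(kind) == static_type:
        return "observed", [(src, dst)]

    # 2. Project tool -> resource onto tool -> capability -> resource.
    if kind in ("READ_FROM", "WROTE_TO"):
        for (tool, cap), edge_type in STATIC_EDGES.items():
            if (tool == src and edge_type == "EXPOSES_CAPABILITY"
                    and STATIC_EDGES.get((cap, dst)) == "CAN_ACCESS_RESOURCE"):
                return "projected", [(src, cap), (cap, dst)]

    # 3. Nothing in the static graph explains this event.
    return "unexpected", [(src, dst)]
```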

OTLP-Style JSON Adapter

--trace-format otlp-json reads OpenTelemetry-style JSON span exports and converts spans into CogniGraph trace events only when the span has all three explicit attributes:

| Attribute | Meaning |
| --- | --- |
| cognigraph.source_id | Existing CogniGraph source node ID |
| cognigraph.target_id | Existing CogniGraph target node ID |
| cognigraph.edge_type | Runtime edge type such as INVOKED, PASSED_TO, READ_FROM, or WROTE_TO |

Unannotated spans are ignored. This adapter does not infer security semantics from arbitrary telemetry; it only normalizes human- or instrumentation-supplied CogniGraph attributes.
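
A rough sketch of that filtering step, assuming the standard OTLP JSON layout (resourceSpans, scopeSpans, spans, with attributes as a list of key/value objects). The function names and the shape of the emitted event dicts are hypothetical.

```python
REQUIRED = ("cognigraph.source_id", "cognigraph.target_id", "cognigraph.edge_type")


def span_attributes(span):
    """Flatten OTLP JSON attributes ([{key, value: {stringValue}}]) into a dict."""
    return {
        attr["key"]: attr.get("value", {}).get("stringValue")
        for attr in span.get("attributes", [])
    }


def spans_to_events(otlp_doc):
    """Convert only fully annotated spans into trace events."""
    events = []
    for resource_spans in otlp_doc.get("resourceSpans", []):
        for scope_spans in resource_spans.get("scopeSpans", []):
            for span in scope_spans.get("spans", []):
                attrs = span_attributes(span)
                if not all(attrs.get(name) for name in REQUIRED):
                    continue  # unannotated spans are ignored
                events.append({
                    "source_id": attrs["cognigraph.source_id"],
                    "target_id": attrs["cognigraph.target_id"],
                    "edge_type": attrs["cognigraph.edge_type"],
                    # External IDs are preserved for traceability only.
                    "origin": {"trace_id": span.get("traceId"),
                               "span_id": span.get("spanId")},
                })
    return events
```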

OTel GenAI / MCP Semantic-Convention Adapter

--trace-format otlp-genai (alias genai) reads the same OTLP JSON span exports but requires no CogniGraph-specific attributes. It maps spans that follow the OpenTelemetry GenAI agent and MCP semantic conventions:

| Span signal | Mapped to |
| --- | --- |
| gen_ai.operation.name = execute_tool or mcp.method.name = tools/call (or span name tools/call <tool>) | An INVOKED runtime event |
| gen_ai.tool.name / mcp.tool.name / the tools/call <tool> span name | The invoked tool's node ID |
| gen_ai.agent.name (or gen_ai.agent.id) on the span or its nearest ancestor span | The invoking agent's node ID |

Names are normalized the same way cognigraph collect normalizes server names, so spans emitted by an instrumented MCP host (e.g. FastMCP's built-in OTel telemetry) resolve against collected fixtures without manual mapping. Tool-call spans without a resolvable agent are skipped, and span content (prompts, messages, arguments) is never copied into trace events. Unmatched events degrade to runtime-only edges, as with any adapter.

Both OTel convention families are still marked Development; the adapter pins the attribute names listed above and will be versioned deliberately if the conventions change.
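
The span mapping for this adapter, including the nearest-ancestor agent lookup, can be sketched as below. This is an approximation under assumptions: spans are represented as plain dicts with already-flattened attributes and span_id / parent_span_id keys, the function names are hypothetical, and name normalization is left out because its rules are not described here.

```python
def nearest_agent(span_id, spans_by_id):
    """Walk parent links until a span carries an agent name or ID."""
    seen = set()
    while span_id and span_id not in seen:
        seen.add(span_id)  # guard against cyclic parent links
        span = spans_by_id.get(span_id)
        if span is None:
            return None
        attrs = span["attributes"]
        agent = attrs.get("gen_ai.agent.name") or attrs.get("gen_ai.agent.id")
        if agent:
            return agent
        span_id = span.get("parent_span_id")
    return None


def to_invoked_event(span, spans_by_id):
    """Map a tool-call span to an INVOKED event, or None if it does not qualify."""
    attrs = span["attributes"]
    prefix = "tools/call "
    named_call = span["name"].startswith(prefix)
    is_tool_call = (attrs.get("gen_ai.operation.name") == "execute_tool"
                    or attrs.get("mcp.method.name") == "tools/call"
                    or named_call)
    if not is_tool_call:
        return None
    tool = attrs.get("gen_ai.tool.name") or attrs.get("mcp.tool.name")
    if not tool and named_call:
        tool = span["name"][len(prefix):]
    agent = nearest_agent(span["span_id"], spans_by_id)
    if not tool or not agent:
        return None  # tool calls without a resolvable agent are skipped
    # Only IDs are emitted; prompts, messages, and arguments are never copied.
    return {"source_id": agent, "target_id": tool, "edge_type": "INVOKED"}
```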

Collecting a Fixture from an MCP Config

Instead of hand-writing a fixture, cognigraph collect reads the MCP config your client already uses and emits a fixture skeleton:

# From a Claude Desktop / Claude Code / Cursor config (mcpServers key)
uv run cognigraph collect ~/path/to/claude_desktop_config.json -o my_system.yaml

# From a VS Code config (servers key, JSONC tolerated)
uv run cognigraph collect .vscode/mcp.json -o my_system.yaml

# Try it with the bundled sample config
uv run cognigraph collect fixtures/sample_mcp_config.json -o my_system.yaml

The collector emits exactly what the config proves, plus minimal scaffolding:

  • one MCPServer node per configured server
  • one stub tool per server (<server>_tool, empty capabilities) with the command or URL recorded in its description
  • one host_agent (trust 2) that consumes user_input (trust 1) and can invoke every tool — modeling the real topology of an MCP client app
  • the standard capability taxonomy, seeded so annotation files validate immediately (disable with --no-seed-capabilities)
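
To make the skeleton concrete, here is a rough sketch of the same shape built from a parsed config dict. It is not the collector's code: skeleton_from_config is hypothetical, the mcp_servers fixture key is an assumption, server names are assumed to already be valid node IDs (normalization is skipped), and the seeded capability taxonomy is omitted.

```python
def skeleton_from_config(config):
    """Build a fixture-like dict from an MCP client config."""
    # Claude/Cursor configs use "mcpServers"; VS Code configs use "servers".
    servers = config.get("mcpServers") or config.get("servers") or {}

    tools = []
    for name, spec in servers.items():
        # Record the launch command or URL in the stub tool's description.
        command = " ".join([spec.get("command", ""), *spec.get("args", [])]).strip()
        tools.append({
            "id": f"{name}_tool",
            "mcp_server": name,
            "description": spec.get("url") or command,
            "capabilities": [],  # never inferred by the collector
        })

    return {
        "context_sources": [
            {"id": "user_input", "source_type": "user_input", "trust_level": 1},
        ],
        "mcp_servers": [{"id": name} for name in servers],  # assumed key name
        "agents": [{
            "id": "host_agent",
            "trust_level": 2,
            "consumes": ["user_input"],
            "can_invoke": [tool["id"] for tool in tools],
        }],
        "tools": tools,
    }
```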

Live Introspection (--introspect)

Configs declare servers, not tools, so the default skeleton under-models servers that expose many tools. With --introspect, the collector connects to each configured server, calls tools/list, and emits one tool node per real tool (<server>_<tool>) with the server-provided name and description — which feed directly into --annotations and --infer-capabilities:

uv sync --extra introspect   # or: pip install 'cognigraph[introspect]'
uv run cognigraph collect my_mcp_config.json --introspect -o my_system.yaml
uv run cognigraph my_system.yaml --infer-capabilities

Be deliberate about what you point this at: introspecting a stdio server spawns its configured command as a process, and introspecting a remote server contacts its URL. It is never the default, and it requires the optional mcp dependency (the introspect extra). A server that cannot be reached within --introspect-timeout seconds (default 10) degrades to its stub tool with a warning, so one dead server never sinks the collection. Introspection enumerates structure only — tool names and descriptions — and still never invents capability semantics.

The collector never invents capability semantics — that mapping stays human-reviewed. Complete the workflow with annotations:

# 1. Collect the skeleton
uv run cognigraph collect fixtures/sample_mcp_config.json -o my_system.yaml

# 2. Declare what each tool can actually do (or preview with --infer-capabilities)
cat > my_annotations.yaml <<'EOF'
tool_capability_annotations:
  filesystem_tool:
    capabilities:
      - SecretRead
      - FilesystemRead
  github_tool:
    capabilities:
      - GitHubPush
      - ExternalNetworkSend
EOF

# 3. Analyze
uv run cognigraph my_system.yaml --annotations my_annotations.yaml --html-report report.html

collect exits 0 on success and 1 on error; it never produces findings itself.

Writing a Fixture

A fixture can be written by hand or collected from a real MCP client config (see the cognigraph collect workflow above). Here's a minimal hand-written example:

analysis:
  max_tool_invocation_depth: 5
  max_path_length: 8

context_sources:
  - id: user_input
    source_type: user_input
    trust_level: 1

agents:
  - id: assistant
    trust_level: 2
    consumes:
      - user_input
    can_invoke:
      - search_tool

tools:
  - id: search_tool
    capabilities:
      - ExternalNetworkSend

capabilities:
  - id: ExternalNetworkSend
    severity: 3

See fixtures/sample_fixture.yaml for a full example with MCP servers, resources, and capability bindings.

Manual Tool Annotations

Tool capabilities can be declared inline in the fixture or supplied through a separate annotation file. The annotation workflow is useful when the base fixture comes from a tool or MCP inventory and security capability mapping should remain human-reviewed.

tool_capability_annotations:
  filesystem_tool:
    capabilities:
      - SecretRead
      - FilesystemRead

  github_tool:
    capabilities:
      - GitHubPush
      - ExternalNetworkSend

Run it with:

uv run cognigraph examples/rag_mcp_unannotated.yaml --annotations examples/manual_tool_annotations.yaml

Annotations are deterministic overlays. Unknown tools, unknown capabilities, and missing required resource bindings are rejected during fixture validation.
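
The overlay-and-reject behavior can be sketched like this. It is an illustration under assumptions: apply_annotations is a hypothetical helper, annotations are assumed to merge with (not replace) inline capabilities, and the resource-binding check is left out because its format is not shown here.

```python
def apply_annotations(fixture, annotations):
    """Merge annotated capabilities into fixture tools, rejecting unknown names."""
    tools = {tool["id"]: tool for tool in fixture["tools"]}
    declared = {cap["id"] for cap in fixture["capabilities"]}

    for tool_id, spec in annotations["tool_capability_annotations"].items():
        if tool_id not in tools:
            raise ValueError(f"unknown tool: {tool_id}")
        unknown = [cap for cap in spec["capabilities"] if cap not in declared]
        if unknown:
            raise ValueError(f"unknown capabilities for {tool_id}: {unknown}")
        # Assumption: annotations add to any inline capabilities.
        existing = tools[tool_id].setdefault("capabilities", [])
        for cap in spec["capabilities"]:
            if cap not in existing:
                existing.append(cap)
    return fixture  # note: the fixture dict is modified in place
```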

Heuristic Capability Mapping Preview

--infer-capabilities applies simple keyword rules to tool IDs and optional tool descriptions:

tools:
  - id: filesystem_tool
    description: Read files and secrets from the local filesystem.

The mapper only adds capabilities already declared in the fixture. It does not create new capability definitions and does not use an LLM. Treat it as a convenience layer after manual annotations, not as a source of authority for the security graph.
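
A keyword mapper of this kind might look roughly as follows. The keyword table is invented for illustration (the actual rules are not listed in this README), and infer_capabilities is a hypothetical name; the one property taken from the text above is that only capabilities already declared in the fixture are ever returned.

```python
# Illustrative keyword rules; the real rule set is not documented here.
KEYWORD_RULES = {
    "SecretRead": ("secret", "credential"),
    "FilesystemRead": ("file",),
    "ExternalNetworkSend": ("http", "upload", "webhook"),
}


def infer_capabilities(tool, declared):
    """Return declared capabilities whose keywords appear in the tool's ID or description."""
    text = f"{tool['id']} {tool.get('description', '')}".lower()
    matches = {
        capability
        for capability, keywords in KEYWORD_RULES.items()
        if capability in declared and any(word in text for word in keywords)
    }
    return sorted(matches)  # sorted so the output is deterministic
```

With the filesystem_tool example above, both "file" and "secret" match, so the result depends only on which of the two capabilities the fixture declares.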

Node Types

| Type | Description | Key Attributes |
| --- | --- | --- |
| ContextSource | Input entering an agent (user input, web content, RAG results) | trust_level (0-4), source_type |
| Agent | LLM planner or orchestrator | trust_level (0-4) |
| Tool | Action an agent can invoke | mcp_server (optional) |
| MCPServer | MCP server backing one or more tools | |
| Capability | Privileged action (shell exec, secret read, network send) | severity (1-4) |
| Resource | Target object (SSH key, database, repository) | sensitivity (1-4) |
| Policy | Approval/control boundary applied to agents, tools, or servers | effect (mitigate/downgrade) |

Trust Levels

| Level | Label | Example |
| --- | --- | --- |
| 0 | Untrusted | External webpage, unknown API response |
| 1 | Low | User input, RAG retrieval result |
| 2 | Medium | Internal agent, verified memory |
| 3 | High | Signed internal data |
| 4 | Privileged | Local filesystem, system config |

trust_level is optional on context sources. When omitted, it defaults by source_type: webpage and external_api → 0, retrieval and user_input → 1, memory → 2. An explicit value always wins.
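
The defaulting rule can be written out as a small lookup. A minimal sketch: resolve_trust is a hypothetical helper, and raising on an unknown source_type with no explicit trust_level is an assumption, since that case is not described above.

```python
# Defaults by source_type, as listed above.
DEFAULT_TRUST = {
    "webpage": 0,
    "external_api": 0,
    "retrieval": 1,
    "user_input": 1,
    "memory": 2,
}


def resolve_trust(source):
    """Return the explicit trust_level if present, else the source_type default."""
    explicit = source.get("trust_level")
    if explicit is not None:
        return explicit  # an explicit value always wins, including 0
    source_type = source.get("source_type")
    if source_type not in DEFAULT_TRUST:
        # Assumption: no default exists, so treat it as a validation error.
        raise ValueError(f"no default trust level for source_type {source_type!r}")
    return DEFAULT_TRUST[source_type]
```

Checking for None rather than truthiness matters here: an explicit trust_level of 0 must not fall through to the default.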

Policy Configuration

Rule thresholds and the dangerous-pair list are configurable through an optional policy block (defaults shown — omitting the block keeps current behavior exactly):

policy:
  critical_severity: 3        # capability severity >= this is "critical" (R001, R004, R005)
  sensitive_sensitivity: 3    # resource sensitivity >= this is "sensitive" (R002)
  low_trust_max: 1            # trust_level <= this is "low trust" (R001, R002, R005)
  dangerous_pairs:            # capability pairs R003 looks for
    - [SecretRead, ExternalNetworkSend]
    - [FilesystemRead, EmailSend]
    - [ShellExecution, ExternalNetworkSend]
    - [GitHubRead, GitHubPush]
    - [BrowserAutomation, CredentialAccess]

Both the in-memory engine and the Neo4j Cypher rules honor the policy block.

Approval Boundaries (Policies)

Approval gates, human-in-the-loop reviews, and similar controls are modeled as Policy nodes applied to agents, tools, or MCP servers — so the analyzer can see its own recommended remediations:

policies:
  - id: fs_approval
    applies_to: [filesystem_tool]      # agents, tools, or MCP servers
    effect: mitigate                   # mitigate (default) | downgrade
    description: Human approval required before filesystem access

  • With effect: mitigate, findings whose risk is fully gated are reported under "Mitigated by Policy" and excluded from the exit code, like accepted risks.
  • With effect: downgrade, findings stay active but drop one severity level (useful with --fail-on).

Mitigation is sound, not path-cosmetic: a capability only counts as gated when every reachable tool exposing it is covered by a policy (or the agent itself is gated). One unprotected alternative path keeps the finding group active. A policy on an MCP server extends to all tools it backs.
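
The "every path must be gated" rule can be sketched as a coverage check. This is a simplification under assumptions: is_gated is a hypothetical helper, it only considers tools the agent invokes directly (tool-to-tool chains are ignored), and it treats all policies alike regardless of effect.

```python
def is_gated(agent, capability, fixture):
    """True only if every tool giving the agent this capability is policy-covered."""
    covered = set()
    for policy in fixture["policies"]:
        covered.update(policy["applies_to"])

    # A policy on the agent itself gates everything it can reach.
    if agent["id"] in covered:
        return True

    tools = {tool["id"]: tool for tool in fixture["tools"]}
    exposing = [
        tools[tool_id] for tool_id in agent["can_invoke"]
        if capability in tools[tool_id].get("capabilities", [])
    ]
    # A policy on an MCP server extends to all tools it backs.
    return bool(exposing) and all(
        tool["id"] in covered or tool.get("mcp_server") in covered
        for tool in exposing
    )
```

For example, if an agent can reach SecretRead through both filesystem_tool and a second, unprotected tool, a policy on filesystem_tool alone leaves the finding group active.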

Relationships

Edges are derived from the fixture's consumes, can_invoke, capabilities, mcp_server, and capability_bindings fields:

ContextSource -[CONSUMED_BY]-> Agent
Agent         -[CAN_INVOKE]->  Tool
Tool          -[CAN_INVOKE]->  Tool
Tool          -[EXPOSES_CAPABILITY]-> Capability
Capability    -[CAN_ACCESS_RESOURCE]-> Resource
Tool          -[USES_SERVER]-> MCPServer
Policy        -[APPLIES_TO]-> Agent | Tool | MCPServer
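
Deriving these edges from a parsed fixture can be sketched as follows. This is a hedged illustration, not the graph builder itself: build_edges is a hypothetical function, tool-level can_invoke is assumed to use the same key as on agents, and capability_bindings is omitted because its format is not shown in this README.

```python
def build_edges(fixture):
    """Derive (source, edge_type, target) triples from a parsed fixture dict."""
    edges = []
    for agent in fixture.get("agents", []):
        for source in agent.get("consumes", []):
            edges.append((source, "CONSUMED_BY", agent["id"]))
        for tool_id in agent.get("can_invoke", []):
            edges.append((agent["id"], "CAN_INVOKE", tool_id))
    for tool in fixture.get("tools", []):
        for callee in tool.get("can_invoke", []):  # assumed key for tool chains
            edges.append((tool["id"], "CAN_INVOKE", callee))
        for capability in tool.get("capabilities", []):
            edges.append((tool["id"], "EXPOSES_CAPABILITY", capability))
        if tool.get("mcp_server"):
            edges.append((tool["id"], "USES_SERVER", tool["mcp_server"]))
    for policy in fixture.get("policies", []):
        for target in policy.get("applies_to", []):
            edges.append((policy["id"], "APPLIES_TO", target))
    return edges


# The minimal hand-written fixture from "Writing a Fixture", as a dict.
minimal = {
    "context_sources": [{"id": "user_input", "source_type": "user_input",
                         "trust_level": 1}],
    "agents": [{"id": "assistant", "trust_level": 2,
                "consumes": ["user_input"], "can_invoke": ["search_tool"]}],
    "tools": [{"id": "search_tool", "capabilities": ["ExternalNetworkSend"]}],
    "capabilities": [{"id": "ExternalNetworkSend", "severity": 3}],
}
for edge in build_edges(minimal):
    print(edge)
```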

Detection Rules

| Rule | Triggers When |
| --- | --- |
| R001 | Low-trust context (trust <= 1) can reach a capability with severity >= 3 |
| R002 | Low-trust context can reach a resource with sensitivity >= 3 |
| R003 | A single agent can reach a dangerous capability pair (e.g. SecretRead + ExternalNetworkSend) |
| R004 | An MCP server backs critical tools invokable by more than N agents (default 3) |
| R005 | Low-trust context enters a higher-trust agent (trust >= 2) that can reach a critical capability |
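
R001 is essentially a reachability query, which a breadth-first search can illustrate. This is a sketch under assumptions, not the rules engine: low_trust_paths is a hypothetical function, it ignores max_path_length and max_tool_invocation_depth, it follows only consumes, agent can_invoke, and tool capabilities, and it expects an explicit trust_level on every context source.

```python
from collections import deque


def low_trust_paths(fixture, low_trust_max=1, critical_severity=3):
    """Return (source, capability, path) for low-trust sources reaching critical capabilities."""
    adjacency = {}
    for agent in fixture["agents"]:
        for source in agent.get("consumes", []):
            adjacency.setdefault(source, []).append(agent["id"])
        adjacency.setdefault(agent["id"], []).extend(agent.get("can_invoke", []))
    for tool in fixture["tools"]:
        adjacency.setdefault(tool["id"], []).extend(tool.get("capabilities", []))
    critical = {cap["id"] for cap in fixture["capabilities"]
                if cap["severity"] >= critical_severity}

    results = []
    for source in fixture["context_sources"]:
        if source["trust_level"] > low_trust_max:
            continue
        # BFS records one shortest path to every reachable node.
        parents = {source["id"]: None}
        queue = deque([source["id"]])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency.get(node, []):
                if neighbor not in parents:
                    parents[neighbor] = node
                    queue.append(neighbor)
        for capability in sorted(critical.intersection(parents)):
            path, node = [], capability
            while node is not None:
                path.append(node)
                node = parents[node]
            results.append((source["id"], capability, path[::-1]))
    return results
```

The two keyword defaults mirror the policy block's low_trust_max and critical_severity defaults.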

Dangerous Capability Pairs (R003)

  • SecretRead + ExternalNetworkSend
  • FilesystemRead + EmailSend
  • ShellExecution + ExternalNetworkSend
  • GitHubRead + GitHubPush
  • BrowserAutomation + CredentialAccess
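
Checking one agent against this list amounts to a subset test over the capabilities it can reach. A minimal sketch: dangerous_pairs_for is a hypothetical helper, and it only looks at tools the agent invokes directly.

```python
# Default pairs from the list above.
DANGEROUS_PAIRS = [
    ("SecretRead", "ExternalNetworkSend"),
    ("FilesystemRead", "EmailSend"),
    ("ShellExecution", "ExternalNetworkSend"),
    ("GitHubRead", "GitHubPush"),
    ("BrowserAutomation", "CredentialAccess"),
]


def dangerous_pairs_for(agent, tools_by_id, pairs=DANGEROUS_PAIRS):
    """Return every configured pair whose two capabilities the agent can reach."""
    reachable = set()
    for tool_id in agent.get("can_invoke", []):
        reachable.update(tools_by_id[tool_id].get("capabilities", []))
    return [pair for pair in pairs if set(pair) <= reachable]
```

The two capabilities do not need to come from the same tool; the risk is that a single agent can chain them.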

Neo4j Integration

For interactive graph exploration and Cypher queries:

# Start Neo4j
docker compose up -d

# Run Neo4j integration tests
uv run pytest -m neo4j

# Run all tests (in-memory + Neo4j)
uv run pytest

Neo4j is available at http://localhost:7474 (credentials: neo4j / cognigraph).

The Cypher query engine mirrors all five detection rules, so you get identical findings whether using the in-memory engine or Neo4j.

Project Structure

src/cognigraph/
  schemas/        # Pydantic models: nodes, edges, enums, findings
  fixture/        # YAML fixture loading and validation
  graph/          # In-memory graph builder (NetworkX)
  rules/          # Detection rules engine
  neo4j/          # Neo4j client and Cypher detection queries
  export.py       # DOT and JSON graph export
  report.py       # CLI finding report formatter
  cli.py          # CLI entry point
fixtures/
  sample_fixture.yaml
tests/

Running Tests

# Full suite (skips Neo4j tests if container is not running)
uv run pytest

# With coverage
uv run pytest --cov=cognigraph --cov-report=term-missing

The default coverage gate measures the core in-memory MVP path and omits the optional Neo4j adapter, whose tests require a running Neo4j container.

License

Copyright 2026 Naveen Prakaasham Vairaprakasam

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
