HuntMCP

Deterministic-first SOC Investigation Platform • MCP-Compatible Blue-Team Toolkit

HuntMCP is a portfolio-grade, deterministic-first SOC investigation prototype. It parses exported security logs, normalizes events, applies rule-based hunting logic, extracts IOCs, enriches them with CTI sources, and produces an analyst-style investigation report.

What HuntMCP Is

Deterministic Detection Engine: Rule-based, auditable detection with Sigma-inspired logic
MCP-Compatible: Designed for Model Context Protocol integration with security agents
LLM-Assisted Triage Only: The LLM is an analyst assistant, never the detection engine
Offline-First: Mock/local defaults for CTI and LLM enable safe, offline demos
Security-Hardened: Path validation, secret redaction, upload limits, no shell execution

What It Does

HuntMCP can:

Parse: CSV, JSON, and JSONL log exports (Windows Security, Sysmon, DNS, Proxy/Web, generic CSV)
Detect: Deterministic rule engine with ~20 detection rules covering common attack patterns
Enrich: IOC extraction with mock/local CTI and optional external CTI connectors (URLhaus, AbuseIPDB, OTX, VirusTotal)
Triage: OpenAI-compatible LLM for analyst assistance with deterministic mock fallback
Report: Markdown and HTML investigation reports with timeline, IOC tables, and MITRE ATT&CK mapping
Persist: SQLite case storage for investigation review and API access
Serve: FastAPI backend with MCP server tools for parse, detect, enrich, triage, report, timeline, entity graph, and case management

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                         HuntMCP Architecture                            │
└─────────────────────────────────────────────────────────────────────────┘

┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│   Parsers    │───▶│  Detection   │───▶│  IOC Engine  │───▶│ Enrichment   │
│              │    │   Engine     │    │              │    │              │
│ • Windows    │    │ • Rules      │    │ • Extract    │    │ • Mock CTI   │
│ • Sysmon     │    │ • Correlation│    │ • Normalize  │    │ • URLhaus    │
│ • DNS        │    │ • Thresholds │    │ • Deduplicate│    │ • AbuseIPDB  │
│ • Proxy/Web  │    │              │    │              │    │ • OTX        │
│ • Generic    │    │              │    │              │    │ • VirusTotal │
└──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘
       │                   │                   │                   │
       ▼                   ▼                   ▼                   ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         Core Engine Layer                                │
├──────────────┬──────────────┬──────────────┬──────────────┬──────────────┤
│   Timeline   │ Entity Graph │   Storage    │   Security   │   Reporting  │
│              │              │              │              │              │
│ • Chronological│ • Build     │ • SQLite     │ • Path       │ • Markdown   │
│ • Events      │ • Pivot      │ • Cases      │   Validation │ • HTML       │
│ • Findings    │ • Relations  │ • Runs       │ • Secret     │ • JSON       │
│ • IOCs        │ • Centrality │ • Events     │   Redaction  │ • Evidence   │
└──────────────┴──────────────┴──────────────┴──────────────┴──────────────┘
       │                   │                   │                   │
       ▼                   ▼                   ▼                   ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                        Interface Layer                                   │
├──────────────┬──────────────┬──────────────┬──────────────┬──────────────┤
│     CLI      │    FastAPI   │   MCP Server │   Optional   │              │
│              │              │              │   AI Layer   │              │
│ • huntmcp    │ • /health    │ • parse      │ • LLM Triage │              │
│ • parse      │ • /investigate│ • detect     │ • Mock Fallback│            │
│ • detect     │ • /cases     │ • enrich     │ • Redaction  │              │
│ • enrich     │ • /findings  │ • triage     │ • Safety     │              │
│ • triage     │ • /iocs      │ • report     │   Prompts    │              │
│ • report     │ • /timeline  │ • timeline   │              │              │
│ • coverage   │ • /entity-graph│ • entity    │              │              │
└──────────────┴──────────────┴──────────────┴──────────────┴──────────────┘

Quick Start

Installation

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -e .

Run Self-Test

huntmcp self-test

Expected output:

{
  "ok": true,
  "event_count": 5,
  "finding_count": 5,
  "enriched_count": 5,
  "triage_count": 5
}

Demo Workflow

# Parse demo logs
huntmcp parse --input data/sample_logs/demo_attack.csv --type auto

# Run detection
huntmcp detect

# Enrich with mock CTI (offline-safe)
huntmcp enrich --cti mock

# Generate report
huntmcp report

Run Tests

python -m pytest -q
python -m ruff check huntmcp huntmcp.py tests
python -m ruff format --check huntmcp huntmcp.py tests

Demo Attack Dataset

The repository includes a demo attack dataset at data/sample_logs/demo_attack.csv containing a realistic attack chain:

Multiple failed logons (credential spraying)
Successful logon after failures (brute force success)
Suspicious PowerShell encoded command (obfuscation)
LOLBin execution (certutil downloading payload)
DNS beacon-like events (C2 communication)
Suspicious proxy/web URL (malware delivery)
Possible data exfiltration (large POST upload)

This dataset triggers multiple detection rules and produces a realistic investigation report for portfolio demonstrations.

MCP Usage

Start MCP Server

python -m huntmcp.mcp_server.server

Key MCP Tools

huntmcp_parse_logs: Parse security logs into normalized events
huntmcp_detect: Run deterministic detection rules
huntmcp_extract_and_enrich_iocs: Extract and enrich IOCs
huntmcp_triage_findings: LLM-assisted triage (mock fallback)
huntmcp_generate_report: Generate investigation report
huntmcp_run_pipeline: Full investigation pipeline
huntmcp_generate_timeline: Generate event timeline
huntmcp_build_entity_graph: Build entity relationship graph
huntmcp_get_cases: List all cases
huntmcp_get_case_details: Get case details
huntmcp_get_case_timeline: Get case timeline

Example MCP Workflow

# Parse sample logs
parse_result = huntmcp_parse_logs(
    input_path="data/sample_logs/demo_attack.csv",
    log_type="auto"
)

# Run detection
detect_result = huntmcp_detect(events=parse_result["events"])

# Generate timeline
timeline_result = huntmcp_get_case_timeline(
    case_id=detect_result["case_id"]
)

# Get report
report_result = huntmcp_get_case_report(
    case_id=detect_result["case_id"]
)

Security Model

HuntMCP is designed with security as a core principle:

Offline-First Defaults

CTI enrichment uses mock/local data by default
LLM triage uses deterministic mock fallback when no API key is configured
All tests run offline without external dependencies

Path Validation

Input paths are validated to prevent directory traversal
/investigations path restricted to workspace
Blocked paths: .env, .git, config files, sensitive directories

Secret Redaction

API keys are never logged or sent to LLM
Redaction supports: usernames, internal IPs, hostnames, emails, tokens, cookies
Secret patterns are stripped from all outputs

Upload Validation

Maximum file size: 10MB (configurable via HUNTMCP_MAX_UPLOAD_SIZE_BYTES)
Blocked extensions: .exe, .dll, .bat, .cmd, .ps1, .vbs, .js, .jar, .sh
Content validation for malicious payloads

LLM Safety

LLM is assistant-only, never the detection engine
Prompt templates warn LLM not to follow instructions from log content
Log content treated as untrusted input
Only selected suspicious context sent to LLM

MCP Safety

No arbitrary shell execution via MCP tools
File operations restricted to safe paths
No exposure of sensitive system information
Tool schemas match actual implementations

Example Outputs

Detection Findings

{
  "rule_id": "multiple_failed_logons",
  "title": "Multiple Failed Logons",
  "severity": "high",
  "summary": "12 failed logons within 600s for user=jsmith host=WORKSTATION-01",
  "matched_event_count": 12,
  "first_seen": "2024-01-15T10:23:45Z",
  "last_seen": "2024-01-15T10:33:12Z",
  "mitre_attack": ["T1110.003"],
  "iocs": ["192.168.1.100", "jsmith"]
}

Timeline

2024-01-15 10:23:45 - Failed logon (user: jsmith, source: 192.168.1.100)
2024-01-15 10:24:12 - Failed logon (user: jsmith, source: 192.168.1.100)
2024-01-15 10:25:01 - Failed logon (user: jsmith, source: 192.168.1.100)
2024-01-15 10:33:12 - Successful logon (user: jsmith, source: 192.168.1.100)
2024-01-15 10:35:22 - PowerShell encoded command execution
2024-01-15 10:36:45 - DNS query to suspicious domain (evil.example.com)
2024-01-15 10:38:01 - Large POST upload to external endpoint

IOC Enrichment

{
  "ioc_value": "evil.example.com",
  "ioc_type": "domain",
  "cti_sources": {
    "urlhaus": {"status": "malicious", "threat": "malware_download"},
    "abuseipdb": {"abuse_confidence_score": 100, "last_reported_at": "2024-01-14"},
    "otx": {"pulses": 3, "sections": ["malware_domains", "c2"]}
  }
}

Report Formats

Markdown: Structured report with executive summary, findings, timeline, IOC tables
HTML: Styled HTML report with severity highlighting and interactive tables
JSON: Machine-readable output for integration

Project Status

This repository is intended as a cybersecurity portfolio project and advanced MVP, not as a production SIEM or detection replacement.

Current validated state:

CLI pipeline works end to end
FastAPI backend supports investigation jobs and persisted case summaries
SQLite persistence stores cases, runs, events, findings, IOCs, enrichments, triage results, reports
Deterministic tests run offline even when a local .env exists
Current validation baseline: 300+ passed, ruff check clean, ruff format clean
MCP server with 20+ tools for security agent integration

What It Is Not

HuntMCP is not a SIEM
HuntMCP is not a production detection replacement
HuntMCP does not perform actor attribution
HuntMCP does not automatically prove malicious activity
Generated reports require analyst review
Public CTI sources can be noisy, stale, incomplete, or unavailable

Supported Log Types

Windows Security
Sysmon
DNS
Proxy/Web
Generic CSV

Detection Rules

The default rule set includes ~20 detection rules covering:

Windows Security/Sysmon:

Multiple failed logons (password spraying)
Failed logon followed by successful logon (brute force success)
Suspicious PowerShell encoded command
New local admin user creation
PowerShell download cradle
LOLBin execution (certutil, rundll32, regsvr32, mshta, bitsadmin)
Suspicious parent-child process
Possible credential dumping / LSASS access
Remote service creation
Scheduled task creation
Registry persistence

DNS:

DNS beaconing candidate
High entropy/random-looking domain
Long domain name
Repeated beacon-like DNS queries
NXDOMAIN spike

Proxy/Web:

Known suspicious URL/domain access
Suspicious user-agent
Large POST/upload
Suspicious TLD access
Repeated C2-like callback pattern

The engine is deterministic and auditable. All rules include MITRE ATT&CK tactic/technique mapping where applicable.

Configuration

Create a local .env from the example:

Copy-Item .env.example .env

Example variables:

OPENAI_API_KEY=
LLM_MODEL=mimo-v2.5
OPENAI_BASE_URL=
URLHAUS_AUTH_KEY=
ABUSEIPDB_API_KEY=
OTX_API_KEY=
VIRUSTOTAL_API_KEY=
LLM_TIMEOUT_SECONDS=30
HUNTMCP_MAX_UPLOAD_SIZE_BYTES=10485760
HUNTMCP_MAX_LLM_FINDINGS=50
HUNTMCP_CTI_LOOKUP_LIMIT=250
HUNTMCP_CORS_ALLOW_ORIGINS=

CLI Commands

After installing with pip install -e ., you can use the huntmcp command directly:

# Parse logs
huntmcp parse --input data/sample_logs/demo_attack.csv --type auto

# Run detection
huntmcp detect

# Enrich findings
huntmcp enrich --cti mock

# Run LLM/mock triage
huntmcp triage --limit 5

# Generate report
huntmcp report

# Run persisted investigation workflow
huntmcp init-db
huntmcp investigate --input data/sample_logs/demo_attack.csv --type auto --cti mock --case-id demo-attack --case-name "Demo Attack Investigation"
huntmcp case-summary --case-id demo-attack

# Generate MITRE ATT&CK coverage report
huntmcp coverage

# Validate configuration
huntmcp validate-config

# List detection rules
huntmcp rules list
huntmcp rules show --rule-id multiple_failed_logons

API Demo

Start the API:

uvicorn huntmcp.api:app --reload

Set explicit CORS origins if needed:

$env:HUNTMCP_CORS_ALLOW_ORIGINS="http://localhost:3000,http://localhost:5173"
uvicorn huntmcp.api:app --reload

Useful endpoints:

GET  /health
POST /investigate
POST /investigations
GET  /jobs/{job_id}
GET  /cases
GET  /cases/{case_id}/summary
GET  /cases/{case_id}/timeline
GET  /cases/{case_id}/entity-graph
GET  /findings/{case_id}
GET  /iocs/{case_id}
GET  /reports/{case_id}
GET  /rules
GET  /rules/{rule_id}
GET  /coverage
GET  /admin/migration/status
POST /admin/migrate
GET  /admin/audit-logs
GET  /dashboard

Testing

Run the full validation suite:

python -m pytest -q
python -m ruff check huntmcp huntmcp.py tests
python -m ruff format --check huntmcp huntmcp.py tests

Current local validation:

300+ passed
All ruff checks passed
All files formatted

Repository Hygiene

The repository intentionally excludes:

.env and .env.example (contains sensitive configuration)
Python caches (__pycache__, .pytest_cache, .ruff_cache)
SQLite databases (*.sqlite, *.sqlite3)
Generated normalized events (data/normalized/*.json)
Generated findings (data/findings/*.json)
Generated enrichment output (data/enriched/*.json)
Generated reports (reports/*.md, reports/*.json)
Downloaded public CTI datasets (data/cache/, data/uploads/)
Build artifacts (dist/, build/, *.egg-info/)

Only source code, tests, docs, configs, and sample logs should be pushed.

Limitations

This is a portfolio-grade prototype, not a production SOC platform
The current detection set is intentionally small (~20 rules)
SQLite is suitable for local/single-user workflows, not multi-tenant production
Very large datasets need a true stateful streaming detection engine
External CTI quality depends on third-party availability and rate limits
LLM output can be wrong and must be reviewed by an analyst
API authentication, RBAC, audit logging, and deployment hardening are future work

Future Work

More Sigma-compatible rule loading and rule metadata
Richer analyst UI for timeline, IOC pivoting, and finding review
API authentication and role-based access control
Job queue backend for longer investigations
More CTI connector response normalization and rate-limit handling
Docker deployment profiles and production secret management
True stateful streaming detection for very large datasets
Enhanced MCP server wrappers for all modules

License

See LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
.github/workflows		.github/workflows
config		config
data		data
docs		docs
huntmcp		huntmcp
reports		reports
scripts		scripts
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
Dockerfile		Dockerfile
IMPLEMENTATION_SUMMARY.md		IMPLEMENTATION_SUMMARY.md
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
huntmcp.py		huntmcp.py
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

HuntMCP

What HuntMCP Is

What It Does

Architecture

Quick Start

Installation

Run Self-Test

Demo Workflow

Run Tests

Demo Attack Dataset

MCP Usage

Start MCP Server

Key MCP Tools

Example MCP Workflow

Security Model

Offline-First Defaults

Path Validation

Secret Redaction

Upload Validation

LLM Safety

MCP Safety

Example Outputs

Detection Findings

Timeline

IOC Enrichment

Report Formats

Project Status

What It Is Not

Supported Log Types

Detection Rules

Configuration

CLI Commands

API Demo

Testing

Repository Hygiene

Limitations

Future Work

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages