chumdump is a defensive CLI for inspecting AI-readable environments,
detecting chum-like content, generating controlled chumbait, and
monitoring whether AI crawlers, agents, or retrieval systems consume it.
Use it to create controlled bait for crawlers, agents, models, and RAG systems, then watch whether the bait is accessed, retrieved, echoed, obeyed, or leaked.
- Generates canaries, crawler bait, prompt traps, RAG bait, fake harmless secrets, watermarks, lore seeds, and decoy documents.
- Builds deployable chumdump bundles with manifests and index pages.
- Deploys bundles into owned websites, repositories, docs, or test corpora.
- Scans paths for known bait markers.
- Parses access logs and records bite events.
- Generates Markdown, JSON, HTML, or SARIF-style reports.
- Cleans up deployed bait files from a marked destination.
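The marker-scanning idea in the list above can be sketched in a few lines of Python. This is an illustration only, not chumdump's implementation: the `MARKER_RE` pattern and the `scan_for_markers` helper are hypothetical, and the marker shape is merely guessed from the bait IDs printed in the quickstart.

```python
import re
from pathlib import Path

# Hypothetical marker shape, modeled on IDs such as "bait-canary-violet-harbor-...".
# The markers chumdump actually writes may look different.
MARKER_RE = re.compile(r"\bbait-[a-z]+(?:-[a-z0-9]+)+")


def scan_for_markers(root: Path) -> dict[str, list[str]]:
    """Map each file under root (POSIX-style relative path) to the markers found in it."""
    hits: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Ignore undecodable bytes so binary files do not abort the scan.
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = sorted(set(MARKER_RE.findall(text)))
        if found:
            hits[path.relative_to(root).as_posix()] = found
    return hits
```

Calling `scan_for_markers(Path("./public"))` on a deployed directory would return only the files that contain a matching marker, which is the minimum a cleanup or audit step needs.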
Current release: v0.1.0-alpha.
chumdump is an early alpha. The core local workflow is usable, but
the command surface and report schema may change before a stable
release.
Install the current alpha from GitHub:
python3 -m pip install \
  "git+https://github.com/m3lixir/chumdump.git@v0.1.0-alpha"

Install from a local checkout for development:

python3 -m pip install -e ".[dev]"

Then check the CLI:

chumdump --help

You can also run the CLI directly from a checkout:

PYTHONPATH=src python3 -m chumdump --help

The quickstart below creates a local project, generates bait, deploys a small dump, simulates one crawler-style access log entry, records the bite, and prints a report.
chumdump init ai-crawler-test
cd ai-crawler-test
chumdump bait create --type canary --name violet-harbor
chumdump dump build --profile website --count 1
chumdump deploy ./public
bait_file=$(basename "$(find public/bait -name '*.md' | head -n 1)")
cat > access.log <<EOF
203.0.113.10 - - [18/Jun/2026:12:00:00 +0000] "GET /bait/${bait_file} HTTP/1.1" 200 123 "-" "GPTBot/1.0"
EOF
chumdump watch --logs ./access.log
chumdump bites
chumdump report --format markdown --stdout

The GPTBot/1.0 user agent above is a local fixture. No real crawler
visits the quickstart project.

Expected bite summary:
Detected bites: 1
- accessed bait-canary-violet-harbor-... via /bait/bait-canary-violet-harbor-....md (GPTBot)
TYPE TIME ACTOR BAIT
accessed 18/Jun/2026:12:00:00 GPTBot bait-canary-violet-harbor-...
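To show how a log line like the fixture above can become an `accessed` bite, here is a minimal Python sketch. It assumes the combined log format used in the quickstart and a `/bait/` path prefix. The `LOG_RE` pattern, the `parse_bite` helper, and the returned field names are hypothetical and are not taken from chumdump's code.

```python
import re
from typing import Optional

# Combined log format: host, identity, user, [time], "request", status, size,
# "referer", "user agent".
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def parse_bite(line: str, bait_prefix: str = "/bait/") -> Optional[dict]:
    """Return an 'accessed' event for a request to a bait path, else None."""
    match = LOG_RE.match(line)
    if match is None or not match["path"].startswith(bait_prefix):
        return None
    return {
        "type": "accessed",
        "time": match["time"],
        "actor": match["agent"],
        "path": match["path"],
        "source_ip": match["host"],
        "status": int(match["status"]),
    }
```

Keeping the raw path, timestamp, and source address alongside the user agent matters because a user agent string alone is easy to spoof and is weak evidence by itself.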
chumdump should not merely generate bait. It should preserve evidence
of the bite.
Every bite should help answer:
- What bait was touched?
- Where was it placed?
- When was it accessed?
- What accessed it?
- What evidence supports that?
- Was it accessed, retrieved, echoed, obeyed, leaked, or unclear?
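One way to picture these questions is as the fields of a single evidence record. The sketch below is hypothetical: the `BiteRecord` class and its field names are assumptions made for illustration and do not describe chumdump's actual report schema, which may change during the alpha.

```python
from dataclasses import asdict, dataclass

# Bite type names as listed in this README.
BITE_TYPES = ("accessed", "retrieved", "echoed", "obeyed", "leaked", "unknown")


@dataclass(frozen=True)
class BiteRecord:
    bait_id: str      # What bait was touched?
    placement: str    # Where was it placed?
    observed_at: str  # When was it accessed?
    actor: str        # What accessed it?
    evidence: str     # What evidence supports that? (for example, a raw log line)
    bite_type: str = "unknown"  # How did the actor interact with the bait?

    def __post_init__(self) -> None:
        # Reject bite types outside the documented set.
        if self.bite_type not in BITE_TYPES:
            raise ValueError(f"unknown bite type: {self.bite_type!r}")
```

A record built this way can be passed to `asdict` and serialized, so each reported bite carries its own supporting evidence instead of pointing at a log that may later rotate away.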
Bite types form the escalation ladder:
| Bite type | Meaning |
|---|---|
| `accessed` | Something requested the bait. |
| `retrieved` | A RAG or search system surfaced the bait. |
| `echoed` | A model or summarizer repeated the bait. |
| `obeyed` | An agent followed bait instructions. |
| `leaked` | Bait appeared somewhere unexpected. |
| `unknown` | Evidence exists, but the behavior is unclear. |
The typical escalation path is accessed < retrieved < echoed <
obeyed.
The bite model is the heart of the tool. Without evidence, chumdump is only a bait generator. With evidence, it becomes a small forensic record for AI ingestion and agent-behavior testing.
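The escalation ordering described above maps naturally onto an ordered enum. The following Python sketch is an assumption about how one might rank bites, not a description of chumdump internals. `Escalation` and `highest_escalation` are hypothetical names, and `leaked` and `unknown` are left off the ladder because the stated path does not rank them.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Escalation(IntEnum):
    """Ladder order: accessed < retrieved < echoed < obeyed."""
    ACCESSED = 1
    RETRIEVED = 2
    ECHOED = 3
    OBEYED = 4


def highest_escalation(bite_types: Iterable[str]) -> Optional[str]:
    """Return the highest ladder rung among the given bite types, or None."""
    ranked = [
        Escalation[name.upper()]
        for name in bite_types
        if name.upper() in Escalation.__members__  # skip leaked, unknown, typos
    ]
    return max(ranked).name.lower() if ranked else None
```

Summarizing a piece of bait by its highest rung makes it easy to tell apart bait that was only fetched from bait that changed an agent's behavior.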
flowchart TD
init["init<br/>create project"]
bait["bait create/list<br/>make controlled artifacts"]
dump["dump build/create<br/>bundle a corpus"]
deploy["deploy<br/>place in owned environment"]
observe["AI-readable surface<br/>website, docs, RAG, lab agent"]
watch["watch<br/>parse logs and telemetry"]
bites["bites<br/>review evidence events"]
report["report<br/>produce Markdown, JSON, HTML, or SARIF"]
scan["scan<br/>inspect existing corpus"]
cleanup["cleanup<br/>remove deployed bait"]
init --> bait --> dump --> deploy --> observe --> watch --> bites --> report
scan --> bait
deploy --> cleanup
observe -. "crawler, retrieval, echo, or action" .-> watch
Create a project:
chumdump init ai-crawler-test
cd ai-crawler-test

Create canary and prompt-trap bait:

chumdump bait create --type canary --name violet-harbor
chumdump bait create --type prompt-trap --target agent

Build and deploy a website-oriented dump:

chumdump dump build --profile website --count 8
chumdump deploy ./public

Watch access logs for bites:

chumdump watch --logs ./access.log
chumdump bites

Generate a report:

chumdump report --format markdown

The core command loop is:
- chumdump init
- chumdump bait create
- chumdump bait list
- chumdump dump build
- chumdump deploy
- chumdump scan
- chumdump watch
- chumdump bites
- chumdump report
- chumdump cleanup
For command details, see docs/commands.md.
Use chumdump only in environments you own or are authorized to test.
Appropriate uses include:
- Testing your own website.
- Testing your own documentation.
- Testing an internal RAG corpus.
- Testing a lab agent.
- Deploying canaries to owned infrastructure.
- Monitoring your own logs.
Inappropriate uses include:
- Deploying bait on third-party systems without permission.
- Trying to poison public models.
- Tricking agents into unsafe actions.
- Collecting real credentials.
- Generating deceptive content farms.
- Bypassing access controls.
chumdump is a defensive research tool. Keep the bait clean.