Merged
30 commits
b1d41cf
feat: implement Bun lockfile parser, add threat catalog dynamic loadi…
miccy Apr 21, 2026
1fbcf8b
refactor: restructure threat data schema to support detailed IOC meta…
miccy Apr 21, 2026
8dd43cf
chore: update dependencies and clean up Playwright test artifacts
miccy Apr 21, 2026
8c5adb7
Potential fix for pull request finding 'CodeQL / Bad HTML filtering r…
miccy Apr 21, 2026
76e8fd2
chore: update playwright report base64 data in index.html
miccy Apr 21, 2026
6eac7e1
Update CHANGELOG.md
miccy Apr 22, 2026
34e0065
Update packages/scanner/src/parsers/bun.ts
miccy Apr 22, 2026
c29af09
Update CHANGELOG.md
miccy Apr 22, 2026
4fd988f
Update README.md
miccy Apr 22, 2026
2772865
Update cs/README.md
miccy Apr 22, 2026
84741f8
Update packages/engine/src/types.ts
miccy Apr 22, 2026
1a8cd0b
Update packages/ioc/index.js
miccy Apr 22, 2026
4aa1181
Update package.json
miccy Apr 22, 2026
acc5b12
Update packages/engine/src/validate.ts
miccy Apr 22, 2026
d9a6064
feat(threats): add axios-2026, shai-hulud-2025, teampcp-2026 + phanto…
miccy Apr 22, 2026
d8edb72
feat(docs): make search cancel button layout flexible
miccy Apr 28, 2026
e80f2c7
🧹 [remove debug logs from test-loader]
miccy Apr 28, 2026
86e9f48
🔒 [security] replace Date.now() with randomUUID() for STIX bundle IDs
miccy Apr 28, 2026
d84750e
test: add unit tests for osvToThreatProfile
miccy Apr 28, 2026
d4c0826
fix(remediation): implement unused dryRun in safeSuspend
miccy May 4, 2026
c1078d9
Merge PR #16: remove debug logs from test-loader
miccy May 4, 2026
b805c55
Merge PR #23: fix unused dryRun parameter in safe-suspend
miccy May 4, 2026
4f50b4b
Merge PR #17: replace Date.now() with randomUUID() for STIX bundle IDs
miccy May 4, 2026
001710d
Merge PR #18: add tests for osvToThreatProfile
miccy May 4, 2026
0d7eb62
Merge PR #15: make search cancel button layout flexible
miccy May 4, 2026
0989ec4
feat(scanner): add path traversal validation (from PR #22)
miccy May 4, 2026
449660a
feat: add path traversal and absolute path validation to path utility…
miccy May 4, 2026
79c4fb6
feat!: resolve all CodeRabbit review issues and bump to v2.0.0
miccy May 4, 2026
8255201
chore: update all dependencies and rename dont-be-shy-hulud → wormsCTRL
miccy May 4, 2026
3316176
fix: correct script paths in docs and expand loopback SSRF check
miccy May 4, 2026
6 changes: 5 additions & 1 deletion .gitignore
@@ -1,7 +1,9 @@
# ===================
# ownspace
# ===================
_*
_ref
_knowledge
_skeletons

# ===================
# OS
@@ -71,6 +73,8 @@ credentials.json
# Tests
# ===================
test-results/
playwright-report/
apps/docs/playwright-report/

# ===================
# Agents
2 changes: 1 addition & 1 deletion AGENTS.md
@@ -266,7 +266,7 @@ See `.agents/README.md` for usage instructions.

## Version Information

- **Repository**: https://github.com/miccy/dont-be-shy-hulud
- **Repository**: https://github.com/miccy/wormsCTRL
- **License**: MIT
- **Maintainer**: @miccy
- **Status**: Active development (public release, seeking contributors)
73 changes: 73 additions & 0 deletions CHANGELOG.md
@@ -1,10 +1,82 @@
# Changelog
<!-- markdownlint-disable MD024 -->

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [2.0.0] - 2026-05-04

### ⚠️ Breaking Changes

- **Stricter threat schema validation** — `sha256` fields in `file_artifacts` now require a valid 64-character hex hash or `null` (empty strings are rejected). IOC and remediation string arrays now reject empty strings.
- **pnpm parser output** — `resolved` and `integrity` are now separate fields (previously `integrity` was conflated into `resolved`).
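
The stricter `sha256` rule above can be illustrated with a small standalone check. This is only a sketch: the project validates with Zod in `packages/engine/src/validate.ts`, whereas the helper names here (`isValidSha256Field`, `isNonEmptyStringArray`) are hypothetical, and accepting upper-case hex digits is an assumption.

```typescript
// Hypothetical helpers illustrating the 2.0.0 rules; not the project's Zod schema.
const SHA256_HEX = /^[0-9a-fA-F]{64}$/; // assumption: upper-case hex is tolerated

// A sha256 field must be a 64-character hex digest or an explicit null.
function isValidSha256Field(value: unknown): value is string | null {
  if (value === null) return true;
  return typeof value === "string" && SHA256_HEX.test(value);
}

// IOC and remediation arrays may not contain empty strings.
function isNonEmptyStringArray(value: unknown): value is string[] {
  return (
    Array.isArray(value) &&
    value.every((item) => typeof item === "string" && item.length > 0)
  );
}
```

Under these rules a profile that previously shipped `sha256: ""` has to switch to `sha256: null`, which matches the threat-profile fixes listed under Fixed.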

### Added

- **SSRF protection** — `ingest.ts` now blocks fetches to private/internal networks, adds fetch timeouts (15s), and enforces response size limits (2 MB).
- **Alias descriptor parsing** — Bun and pnpm parsers now correctly handle npm alias descriptors (`alias@npm:real@1.2.3`) by extracting the alias name.
- **Per-entry threat loading** — Both `packages/ioc/index.js` and `packages/scanner/src/threats.ts` now load threat files individually, so one malformed JSON file doesn't drop the entire catalog.
- **Threat shape validation** — `readThreatObject()` in the scanner now validates the parsed JSON shape before indexing, preventing crashes on malformed threat files.
- **Dynamic test discovery** — `threats.validate.test.ts` now validates all JSON files in the threats directory automatically instead of a hardcoded list.
- **Path traversal security** — Added `validatePath()` utility to scanner with null byte and empty path rejection.
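
As a rough illustration of the path checks listed above, the sketch below rejects empty paths, null bytes, absolute paths, and `..` segments. The name `checkRelativePath` and its return convention are hypothetical; the scanner's actual `validatePath()` may have a different signature and a different rule set.

```typescript
// Hypothetical helper: returns a rejection reason, or null when the path looks safe.
// The real validatePath() in the scanner may use a different signature and rules.
function checkRelativePath(input: string): string | null {
  if (input.length === 0) return "empty path";
  if (input.indexOf("\0") !== -1) return "null byte";
  // POSIX root, UNC/backslash root, or a Windows drive prefix such as C:\
  if (/^[\\/]/.test(input) || /^[A-Za-z]:[\\/]/.test(input)) return "absolute path";
  // Split on both separator styles so "a\..\b" is caught as well as "a/../b".
  const segments = input.split(/[\\/]+/);
  if (segments.indexOf("..") !== -1) return "path traversal";
  return null;
}
```

Returning a reason string rather than a boolean keeps the sketch easy to surface in CLI error messages; that choice is illustrative, not taken from the PR.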

### Fixed

- **Bun lockfile deduplication** — Scanner now prefers `bun.lock` over `bun.lockb` when both exist, preventing duplicate findings.
- **npm parser** — Skips workspace/link entries and root package (`""` key) in v3 lockfiles.
- **Threat loader resilience** — `readdirSync` wrapped in try/catch to handle TOCTOU race after `existsSync`.
- **`toFindingSeverity` switch** — Added default case to prevent `undefined` return for unexpected severity values.
- **CLI output error handling** — `writeFileSync` failures now return a user-friendly error message instead of an unhandled exception.
- **Empty hash indicators** — Replaced `sha256: ""` with `sha256: null` in `event-stream-2018`, `ctx-2022`, and `xz-utils-2024` threat profiles.
- **Node ESM compatibility** — `packages/ioc/index.js` now uses `fs.readFileSync` for JSON loading instead of bare import assertions.
- **Playwright config** — Added `stdout`/`stderr` pipe configuration for better CI debugging.
- **Redundant wrapper** — Removed `parseNpmLock()` wrapper in `scan.ts` (calls `parseNpmLockfile()` directly).
- **Top-level await** — Added documentation comment explaining intentional eager threat database initialization.
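
The Bun lockfile deduplication fix can be pictured as a filter over the lockfile names found in one project directory. This is a minimal sketch under that assumption; `selectLockfiles` is a hypothetical name and the scanner's real dispatch logic is not shown in this diff.

```typescript
// Hypothetical sketch: given the lockfile names found in a single directory,
// drop the binary bun.lockb whenever the text-based bun.lock is also present.
function selectLockfiles(found: string[]): string[] {
  const hasTextLock = found.indexOf("bun.lock") !== -1;
  return found.filter((name) => !(hasTextLock && name === "bun.lockb"));
}
```

When only `bun.lockb` exists it is kept, which is consistent with the `bun.lockb` fallback mentioned elsewhere in this changelog.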

### Changed

- **Version bump** — `1.5.2` → `2.0.0`

### Added

- **Grant-ready threat catalog** — Added structured threat objects for `event-stream`, `node-ipc`, `ua-parser-js`, `ctx`, and `xz-utils` under `packages/ioc/threats/`.
- **Threat catalog expansion** — Added `axios-2026`, `shai-hulud-2025`, and `teampcp-2026` threat objects plus fixture data for npm and PyPI compromise scenarios.
- **AI ingestion skeleton** — Added `packages/engine/src/ingest.ts`, `prompt.ts`, and `validate.ts` for JSON-mode threat extraction with graceful OpenAI fallback behavior and Zod validation.
- **Scanner validation flow** — Added scanner and engine validation plumbing for npm lock parsing, injection findings, and schema-based threat extraction support.
- **Threat regression tests** — Added fixture-based `bun:test` coverage for malicious version matching, phantom dependency detection, threat schema validation, and clean baseline scans.
- **Grant demo artifact** — Added a SARIF demo output for the axios compromise fixture under `examples/axios-compromise.sarif`.
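
Because `ingest.ts` can fetch advisory URLs, the SSRF protection noted in this release has to decide which hosts are off limits. The sketch below shows one way such a hostname filter could look. Everything in it is an assumption rather than the project's code: the function name `isBlockedHost`, the exact ranges covered, and the constant names. It only inspects literal hostnames, so a DNS name that resolves to a private address would still need a resolution-time check, and IPv4-mapped IPv6 literals are not handled.

```typescript
// Hypothetical hostname filter; names and ranges are assumptions, not the project's code.
function isBlockedHost(hostname: string): boolean {
  // URL.hostname wraps IPv6 literals in brackets, so strip them first.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "");

  if (host === "localhost" || /\.localhost$/.test(host)) return true;

  if (host.indexOf(":") !== -1) {
    // IPv6: loopback, unspecified, link-local (fe80::/10), unique local (fc00::/7).
    return host === "::1" || host === "::" || /^fe[89ab]/.test(host) || /^f[cd]/.test(host);
  }

  const m = /^(\d{1,3})\.(\d{1,3})\.\d{1,3}\.\d{1,3}$/.exec(host);
  if (m === null) return false; // not an IPv4 literal
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || a === 10 || a === 127 ||  // "this" network, private, loopback
    (a === 169 && b === 254) ||          // link-local
    (a === 172 && b >= 16 && b <= 31) || // private
    (a === 192 && b === 168)             // private
  );
}

// Limits quoted in the changelog; the constant names are illustrative.
const FETCH_TIMEOUT_MS = 15000;             // 15 s fetch timeout
const MAX_RESPONSE_BYTES = 2 * 1024 * 1024; // 2 MB, assuming binary megabytes
```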

### Security

- **Report Sanitization** — Implemented a centralized `redactSensitiveData` function in the Playwright HTML report to strip emails, tokens, secrets, and local paths from AI prompts.
- **Payload Removal** — Removed the embedded base64 ZIP payload from the HTML report to prevent secret leakage and encourage CI artifact usage.
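
To make the report sanitization item more concrete, here is a minimal sketch of rule-based redaction. It is not the `redactSensitiveData` implementation from the PR: the function name `redactSketch`, the placeholder labels, and every pattern (token prefixes, home-directory roots) are assumptions chosen for illustration.

```typescript
// Hypothetical redaction rules; patterns and labels are illustrative assumptions.
const REDACTION_RULES: Array<[RegExp, string]> = [
  // Email addresses
  [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[REDACTED_EMAIL]"],
  // Token-like strings with a few well-known prefixes
  [/\b(?:ghp|gho|ghs|npm)_[A-Za-z0-9]{20,}\b/g, "[REDACTED_TOKEN]"],
  // Per-user home directories on macOS and Linux
  [/(?:\/Users|\/home)\/[^\/\s"']+/g, "[REDACTED_PATH]"],
];

// Apply every rule in order and return the sanitized text.
function redactSketch(text: string): string {
  return REDACTION_RULES.reduce(
    (acc, [pattern, label]) => acc.replace(pattern, label),
    text,
  );
}
```

Keeping the rules in one table means new secret formats can be added without touching the call sites, which is the main benefit of centralizing redaction.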

### Fixed

- **E2E Stability** — Corrected invalid Playwright assertions in `apps/docs` to use `toBeVisible()` and removed silent failures in table rendering tests.
- **Git Hygiene** — Added `playwright-report/` to `.gitignore` to prevent committing generated test artifacts.

### Changed

- **Lockfile coverage** — Completed pnpm and Bun parser support and wired both into scanner dispatch.
- **Threat-aware detection** — Scanner now cross-references `packages/ioc/threats/*.json` for known malicious versions and only emits injection findings for known phantom dependency IOCs.
- **Python package coverage** — Added basic `requirements.txt` parsing so PyPI threat entries can be matched by the scanner.
- **CLI flow** — Reworked the main CLI entry point to support `--format`, `--output`, and `--threats`, and to run through the scanner package end-to-end.
- **Documentation** — Rewrote the root README and added `cs/README.md` with grant-focused positioning, AI architecture, quick start, and threat database coverage.
- **Threat schema** — Upgraded `ThreatObject` to use structured IOC and reference objects, expanded ecosystem coverage for Linux/system incidents, and migrated the bundled threat catalog to the new format.
- **CLI packaging** — Added a self-contained CLI bundle build with bundled threat catalog assets so published installs no longer depend on monorepo-only source paths.

- **E2E Test Robustness** — Updated documentation tests to assert element visibility and existence unconditionally, preventing silent regressions in table rendering and page navigation.
- **Threat Database Accuracy** — Added missing `oneday-test` malicious version and updated empty SHA256 fields to `null` with explanatory notes in `node-ipc-2022` and `ua-parser-js-2021` threat profiles.
- **IOC helper consistency** — Made archived threat/profile helper lookups return synchronously instead of mixing plain objects with dynamic-import promises.
- **Parser resilience** — Fixed scoped `pnpm` key parsing, restored `bun.lockb` fallback after text-lock read failures, and added safer `bun pm ls` timeout/error handling.
- **Scanner output** — Fixed the `low` severity summary line formatting, removed duplicate verbose location output, and rejected empty `--output=` CLI values with a clear error.
- **Regression coverage** — Added tests for structured threat validation, publish-safe CLI argument parsing, pnpm/Bun parser edge cases, text formatter output, and synchronous IOC helper returns.
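
The alias handling mentioned for the Bun and pnpm parsers (`alias@npm:real@1.2.3`) comes down to splitting a descriptor at the `@npm:` marker. The following sketch assumes descriptors arrive as plain strings in that form; `parseDescriptor` and `ParsedDescriptor` are hypothetical names, and the real parsers may represent entries differently.

```typescript
// Hypothetical descriptor parser; the real Bun/pnpm parsers may differ.
interface ParsedDescriptor {
  name: string;          // name as it appears in the dependency tree
  target: string | null; // aliased "real@version" part, or null for a plain descriptor
}

function parseDescriptor(descriptor: string): ParsedDescriptor {
  const marker = "@npm:";
  // Start at index 1 so the leading "@" of a scoped name is never treated as the marker.
  const aliasAt = descriptor.indexOf(marker, 1);
  if (aliasAt > 0) {
    return {
      name: descriptor.slice(0, aliasAt),
      target: descriptor.slice(aliasAt + marker.length),
    };
  }
  // Plain descriptor: the version follows the last "@", unless that "@" is the scope prefix.
  const versionAt = descriptor.lastIndexOf("@");
  return {
    name: versionAt > 0 ? descriptor.slice(0, versionAt) : descriptor,
    target: null,
  };
}
```

For `alias@npm:real@1.2.3` this yields the alias name `alias` with `real@1.2.3` as the target, so findings can be reported under the name that actually appears in the lockfile.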

## [1.5.2] - 2026-04-21

### Changed
@@ -21,6 +93,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **Scanner Output** — Symmetrized naming of formatters (`formatJson`, `formatText`, `formatSarif`) and fixed broken exports in `@worms-ctrl/scanner`.
- **KB Engine** — Fixed asynchronous generator logic in `chunker.ts` and removed dead exports from `@worms-ctrl/kb`.
- **Process Suspension Safety** — Fixed remediation playbook phase mapping and type definitions for safe malware containment.
- **`safe-suspend` Parameter Fix** — Implemented the previously unused dry-run parameter (`_dryRun`) in `packages/remediation/src/scripts/safe-suspend.ts`.

### Added

108 changes: 64 additions & 44 deletions README.md
@@ -1,70 +1,90 @@
# 🪱 worms-ctrl
# wormsCTRL

![worms-ctrl Banner](packages/assets/banner.png)

> **Universal Supply Chain Audit Tool & Threat Knowledge Base**
> Defending against registry-native worms, malicious packages, and CI/CD compromises.
> Universal Supply Chain Audit Tool & Threat Knowledge Base for npm and adjacent open-source ecosystems.

[![npm version](https://img.shields.io/npm/v/@worms-ctrl/core?color=cb3837&logo=npm)](https://www.npmjs.com/package/@worms-ctrl/core)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md)
[![Open Source](https://img.shields.io/badge/open%20source-public%20benefit-blue)](https://github.com/miccy/wormsCTRL)
[![Bun](https://img.shields.io/badge/runtime-Bun-black)](https://bun.sh)

## 🛡️ The Future of Supply Chain Defense
Supply-chain attacks are growing because modern applications inherit trust from thousands of transitive packages, maintainer accounts, CI runners, and release pipelines they do not directly control. A single compromised publisher, malicious install script, or poisoned upstream artifact can now reach thousands of downstream builds in hours, which leaves solo developers and small teams with enterprise-grade risk but without enterprise-grade detection coverage. `wormsCTRL` exists to close that gap with auditable lockfile scanning, structured threat intelligence, and an AI-assisted ingestion pipeline that turns public advisories into machine-readable defensive knowledge.

The evolution of software supply chain attacks has reached a critical point. Incidents like the [Shai-Hulud 2.0 npm worm](packages/ioc/archived/shai-hulud/) or the TeamPCP CI/CD compromise demonstrate the rise of self-propagating malware that leverages the trust of package registries to exfiltrate sensitive data.
## Architecture

**worms-ctrl** is an automated auditing tool and comprehensive Knowledge Base designed to:
1. **Act as an Incident Response Tool:** Providing immediate audit capabilities and remediation steps when under attack.
2. **Serve as a Threat Database:** Documenting historical threats (like Shai-Hulud) with exact Indicators of Compromise (IoCs).
3. **Be AI-Agent Ready:** Exposing machine-readable Threat Models (JSON/YAML) that Autonomous Security Agents can consume to update defense mechanisms in real-time.
```mermaid
flowchart LR
A["Scanner<br/>packages/scanner"] --> B["Engine<br/>packages/engine"]
B --> C["Threat KB<br/>packages/ioc + packages/kb"]
C --> D["CLI<br/>apps/cli"]
```

---
- `Scanner` parses lockfiles from npm, Yarn, pnpm, and Bun to surface suspicious packages.
- `Engine` converts advisories and blog posts into structured threat objects.
- `Threat KB` stores documented incidents as JSON and exposes them for automation and RAG.
- `CLI` makes the workflow usable in local dev, CI, and incident-response contexts.

## Quick Start
## Quick Start

```bash
# Install the universal scanner globally
npm install -g @worms-ctrl/cli
bun install
npx worms-ctrl scan .
```

# Run an audit against your current project using all known threat definitions
npx worms-ctrl scan
Optional formats:

# View the Knowledge Base of tracked threats
npx worms-ctrl threats
```bash
npx worms-ctrl scan . --format json
npx worms-ctrl scan . --format sarif --output wormsctrl.sarif
npx worms-ctrl scan . --threats
```

## 🧠 Threat Knowledge Base Architecture
## AI Integration

worms-ctrl treats threats as structured data. Inside `packages/ioc/`, you will find JSON representations of known supply chain attacks.
`wormsCTRL` treats AI as a defensive extraction layer, not as an opaque security oracle.

### Example: The Shai-Hulud Threat Object
When a new threat is detected via our intelligence feeds (e.g., Socket.dev, OSV, Phylum), an AI Agent can automatically generate a Threat Profile:
- `packages/engine/src/ingest.ts` accepts either raw advisory text or a blog/advisory URL.
- The engine sends the material to the OpenAI API using a constrained JSON-mode extraction prompt.
- The extracted payload is validated with Zod in `packages/engine/src/validate.ts`.
- Only schema-valid threat objects are returned for downstream storage or review.
- If `OPENAI_API_KEY` is not set, ingestion fails safely and returns `null` instead of throwing.
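
The flow in the list above (check for a key, extract JSON, validate, return `null` on any failure) can be sketched as follows. This is a simplified, synchronous illustration, not the code in `packages/engine/src/ingest.ts`: the `Extractor` callback stands in for the OpenAI call, the three-field `ThreatCandidate` shape is a placeholder for the real Zod schema, and all names are hypothetical.

```typescript
// Hypothetical minimal shape; the real schema lives in packages/engine/src/validate.ts.
interface ThreatCandidate {
  id: string;
  ecosystem: string;
  severity: string;
}

type Env = Record<string, string | undefined>;
// Stand-in for the model call: takes advisory text and a key, returns raw JSON text.
type Extractor = (advisoryText: string, apiKey: string) => string;

function isThreatCandidate(value: unknown): value is ThreatCandidate {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return ["id", "ecosystem", "severity"].every((key) => {
    const field = record[key];
    return typeof field === "string" && field.length > 0;
  });
}

function ingestAdvisory(text: string, env: Env, extract: Extractor): ThreatCandidate | null {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) return null; // fail safely instead of throwing when no key is configured

  let parsed: unknown;
  try {
    parsed = JSON.parse(extract(text, apiKey));
  } catch {
    return null; // extraction failed or returned non-JSON output
  }
  // Only schema-valid objects are handed on for storage or review.
  return isThreatCandidate(parsed) ? parsed : null;
}
```

Injecting the extractor keeps the validation gate testable without network access; the real pipeline would be asynchronous.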

```json
{
"id": "shai-hulud-2.0",
"name": "Shai-Hulud 2.0",
"ecosystem": "npm",
"severity": "CRITICAL",
"status": "ARCHIVED",
"description": "A destructive npm supply-chain worm targeting developers and CI/CD pipelines."
}
Example environment:

```bash
export OPENAI_API_KEY=your_key_here
export OPENAI_MODEL=gpt-4o-mini
```

## 🗺️ Roadmap & Integration
- [x] Refactor core architecture from static scripts to dynamic JSON Threat Object ingestion.
- [x] Archive Shai-Hulud 2.0 as the first documented threat.
- [ ] Integrate real-time webhook ingestion from Socket.dev & Phylum APIs.
- [ ] Launch the `wormsCTRL` public Knowledge Base web portal.
- [ ] Implement AI Agent workflow for automatic Threat Object generation from Twitter/Mastodon threat intel.
## Threat Database

Current documented entries include:

| Threat ID | Ecosystem | Severity | Attack Vector | Summary |
| --- | --- | --- | --- | --- |
| `event-stream-2018` | npm | ![HIGH](https://img.shields.io/badge/severity-HIGH-orange) | maintainer compromise | Introduced `flatmap-stream` to target Copay wallet builds. |
| `node-ipc-2022` | npm | ![HIGH](https://img.shields.io/badge/severity-HIGH-orange) | protestware | Overwrote files and dropped `WITH-LOVE-FROM-AMERICA.txt`. |
| `ua-parser-js-2021` | npm | ![CRITICAL](https://img.shields.io/badge/severity-CRITICAL-red) | maintainer account hijack | Delivered credential theft and crypto-mining payloads. |
| `ctx-2022` | pypi | ![HIGH](https://img.shields.io/badge/severity-HIGH-orange) | account takeover | Exfiltrated environment variables to a Heroku endpoint. |
| `xz-utils-2024` | linux | ![CRITICAL](https://img.shields.io/badge/severity-CRITICAL-red) | upstream release compromise | Backdoored `liblzma` via malicious upstream tarballs. |
| `shai-hulud-2025` | npm | ![CRITICAL](https://img.shields.io/badge/severity-CRITICAL-red) | self-replicating registry worm | Injected `bundle.js`, scanned for credentials, and republished infected packages. |
| `axios-2026` | npm | ![CRITICAL](https://img.shields.io/badge/severity-CRITICAL-red) | maintainer account compromise + phantom dependency | Published malicious axios releases plus `plain-crypto-js` RAT delivery. |
| `teampcp-2026` | pypi | ![HIGH](https://img.shields.io/badge/severity-HIGH-orange) | stolen OIDC Trusted Publisher token + direct registry push | Pushed malicious `litellm` and `telnyx` releases via compromised publishing identity. |

Threat records live in [`packages/ioc/threats`](packages/ioc/threats) and are designed to be both human-readable and automation-friendly.

## What Ships Today

---
- Lockfile parsing for `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`, `bun.lock`, `bun.lockb`, and basic `requirements.txt` pins.
- Injection detection with text, JSON, and SARIF output modes.
- Structured incident entries for major real-world supply-chain attacks.
- AI ingestion skeleton for converting advisories into reusable threat objects.
- Implemented parser logic, injection finding generation, and schema validation for threat objects.

## 🤝 Contributing
We welcome contributions from security researchers! If you've analyzed a new malicious package campaign, please submit a PR adding a new Threat Object to our `packages/ioc` directory.
## Grant Context

See [CONTRIBUTING.md](CONTRIBUTING.md) for details.
Built as part of an OpenAI Cybersecurity Grant Program submission. Goal: democratize supply chain defense for solo developers and small teams.

---
## License

*Formerly known as dont-be-shy-hulud.*
MIT. This repository is intentionally kept permissive and public for defensive reuse, research, and community contribution.
2 changes: 1 addition & 1 deletion SECURITY.md
@@ -17,7 +17,7 @@ Instead, please report them via one of the following methods:

### Private Security Advisory (Preferred)

1. Go to the [Security Advisories page](https://github.com/miccy/dont-be-shy-hulud/security/advisories)
1. Go to the [Security Advisories page](https://github.com/miccy/wormsCTRL/security/advisories)
2. Click "New draft security advisory"
3. Fill in the details
