
Add WAF detection to identify bot-protection blocks#2

Merged
stevenkop-g merged 1 commit into main from claude/validate-scan-failure-4s1E2
May 3, 2026


@stevenkop-g
Contributor

Summary

This PR adds WAF (Web Application Firewall) detection capabilities to identify when HTTP responses are blocked by bot-protection or WAF services. The scanner now detects common WAF vendors and reports whether results may be unreliable due to blocking.

Key Changes

  • New WAF detector module (src/waf-detector.ts): Implements heuristic detection for major WAF vendors including Akamai, Cloudflare, AWS WAF, Sucuri, Imperva, and Fastly. Uses a conservative approach that only flags responses when the status code is in a typical blocking range (401, 403, 406, 429, 503) AND a recognizable WAF signature is present in headers or server identification.

  • Extended ScanResult type (src/types.ts): Added wafBlocked (boolean) and wafVendor (string | null) fields to track WAF detection results.

  • Integrated WAF detection into checker (src/checker.ts): Calls detectWaf() during header analysis and includes results in the returned ScanResult.

  • Updated formatters (src/formatters/table.ts and src/formatters/text.ts): Display WAF detection warnings when a block is detected, noting that results may be unreliable.

  • Test coverage (tests/waf-detector.test.ts): 12 test cases covering successful detection of each WAF vendor, negative cases where a non-blocking status code must not be flagged, and edge cases.
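
To make the type and formatter changes above concrete, here is a minimal sketch of how the two new fields could drive a warning line. Only `wafBlocked` and `wafVendor` come from this PR; the `url` field, the `wafWarning` helper name, and the warning wording are assumptions for illustration, not the actual code in src/formatters.

```typescript
// Reduced, hypothetical ScanResult: only wafBlocked and wafVendor are
// taken from the PR description; `url` is an assumed field.
interface ScanResult {
  url: string;
  wafBlocked: boolean;
  wafVendor: string | null;
}

// Assumed helper: returns a warning line for a blocked scan, or null
// when no WAF block was detected and nothing should be printed.
function wafWarning(result: ScanResult): string | null {
  if (!result.wafBlocked) {
    return null;
  }
  // wafVendor is null when a block was detected without a vendor signature.
  const vendor = result.wafVendor ?? "unknown vendor";
  return `Warning: ${result.url} appears to be blocked by a WAF (${vendor}); results may be unreliable.`;
}
```

A formatter would emit the returned line next to the normal output and skip it when the helper returns `null`.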

Implementation Details

  • Detection uses a priority-based approach, checking for vendor-specific headers and server identifiers
  • Cloudflare detection is status-code aware (cf-ray only triggers on 403/429/503, not 200)
  • Generic 403/429 responses without vendor signatures are still flagged as blocked (wafBlocked: true), but with wafVendor: null
  • The implementation is conservative to avoid false positives on misconfigured legitimate sites
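
The rules in this list can be sketched as an ordered rule table. This is an illustration under assumptions, not the contents of src/waf-detector.ts: the `detectWaf(status, headers)` signature, the `HeaderMap` and `VendorRule` types, and the choice of signatures are hypothetical. Only the `cf-ray` and `x-blocked-by-waf` headers and the status-code sets are taken from the description, the Akamai `server` check is an assumed extra signature, and just two vendors are shown.

```typescript
type HeaderMap = Record<string, string>;

interface WafDetection {
  wafBlocked: boolean;
  wafVendor: string | null;
}

interface VendorRule {
  vendor: string;
  statuses: ReadonlySet<number>;      // statuses on which this rule may fire
  matches: (headers: HeaderMap) => boolean;
}

const BLOCK_STATUSES: ReadonlySet<number> = new Set([401, 403, 406, 429, 503]);
const CLOUDFLARE_STATUSES: ReadonlySet<number> = new Set([403, 429, 503]);
const GENERIC_STATUSES: ReadonlySet<number> = new Set([403, 429]);

// Rules are checked in array order, so earlier entries take priority.
// The signatures below are illustrative assumptions.
const RULES: readonly VendorRule[] = [
  {
    vendor: "Akamai",
    statuses: BLOCK_STATUSES,
    matches: (h) =>
      "x-blocked-by-waf" in h || (h["server"] ?? "").toLowerCase().includes("akamai"),
  },
  {
    // cf-ray is also present on normal 200 responses, hence the narrower status set.
    vendor: "Cloudflare",
    statuses: CLOUDFLARE_STATUSES,
    matches: (h) => "cf-ray" in h,
  },
];

function detectWaf(status: number, rawHeaders: HeaderMap): WafDetection {
  // Header names are case-insensitive, so normalize them before matching.
  const headers: HeaderMap = {};
  for (const [name, value] of Object.entries(rawHeaders)) {
    headers[name.toLowerCase()] = value;
  }
  for (const rule of RULES) {
    if (rule.statuses.has(status) && rule.matches(headers)) {
      return { wafBlocked: true, wafVendor: rule.vendor };
    }
  }
  // Generic fallback: a 403/429 without a known signature is flagged with no vendor.
  if (GENERIC_STATUSES.has(status)) {
    return { wafBlocked: true, wafVendor: null };
  }
  return { wafBlocked: false, wafVendor: null };
}
```

With this shape, supporting another vendor means adding one entry to `RULES`, and the position of that entry decides which vendor wins when a response carries more than one signature.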

https://claude.ai/code/session_01KN9E9hQ4krRP16te1f6CSY

Sites like nu.nl actively block server-originated requests at the WAF
layer (Akamai returns HTTP 403 with x-blocked-by-waf). The library
previously graded those block pages as normal scans, leaving downstream
consumers no structured way to distinguish "site has bad headers" from
"scanner was blocked".

Adds a conservative detectWaf() heuristic (status-code + WAF signature
header check) for Akamai, Cloudflare, AWS WAF, Sucuri, Imperva and
Fastly. Two new fields on ScanResult (wafBlocked, wafVendor) flow
through the table and text formatters; JSON/CSV pick them up via
existing serialization. No User-Agent spoofing.
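
As a rough illustration of the "structured way to distinguish" mentioned above, a downstream consumer could split results on the new flag before trusting any grade. The `GradedResult` shape, its `url` and `grade` fields, and the `partitionByWafBlock` helper are assumptions for this sketch; only `wafBlocked` and `wafVendor` are defined by this change.

```typescript
// Assumed, reduced result shape: `url` and `grade` are illustrative;
// wafBlocked and wafVendor are the fields added by this PR.
interface GradedResult {
  url: string;
  grade: string;
  wafBlocked: boolean;
  wafVendor: string | null;
}

interface Partitioned {
  trusted: GradedResult[];   // scans that reached the real site
  blocked: GradedResult[];   // scans that graded a WAF block page
}

// Hypothetical consumer-side helper: separate blocked scans so their
// grades are not reported as findings about the site itself.
function partitionByWafBlock(results: readonly GradedResult[]): Partitioned {
  const out: Partitioned = { trusted: [], blocked: [] };
  for (const result of results) {
    (result.wafBlocked ? out.blocked : out.trusted).push(result);
  }
  return out;
}

// Made-up sample data, for illustration only.
const sampleResults: GradedResult[] = [
  { url: "https://example.com", grade: "B", wafBlocked: false, wafVendor: null },
  { url: "https://example.org", grade: "F", wafBlocked: true, wafVendor: "Akamai" },
];

const partitioned = partitionByWafBlock(sampleResults);
console.log(`${partitioned.trusted.length} trusted, ${partitioned.blocked.length} blocked`);
```
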

stevenkop-g merged commit f3257d4 into main on May 3, 2026
3 checks passed
