Add WAF detection to identify bot-protection blocks #2
Merged
Sites like nu.nl actively block server-originated requests at the WAF layer (Akamai returns HTTP 403 with x-blocked-by-waf). The library previously graded those block pages as normal scans, leaving downstream consumers no structured way to distinguish "site has bad headers" from "scanner was blocked". Adds a conservative detectWaf() heuristic (status-code + WAF signature header check) for Akamai, Cloudflare, AWS WAF, Sucuri, Imperva and Fastly. Two new fields on ScanResult (wafBlocked, wafVendor) flow through the table and text formatters; JSON/CSV pick them up via existing serialization. No User-Agent spoofing.
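As a rough illustration of how the two new fields let a text formatter separate "scanner was blocked" from "site has bad headers", here is a minimal sketch. Only `wafBlocked` and `wafVendor` come from this PR; the `url` and `grade` fields, the `ScanResultSketch` type, the `formatTextSketch` function, and the warning wording are hypothetical and not the library's actual code.

```typescript
// Hypothetical subset of ScanResult. Only wafBlocked and wafVendor are
// named in the PR; url and grade are assumed here for illustration.
interface ScanResultSketch {
  url: string;
  grade: string;
  wafBlocked: boolean;
  wafVendor: string | null;
}

// Assumed text formatter: appends a warning line when a block was detected.
function formatTextSketch(result: ScanResultSketch): string {
  const lines: string[] = [`${result.url}: grade ${result.grade}`];
  if (result.wafBlocked) {
    // Fall back to a generic label if no vendor was recorded.
    const vendor = result.wafVendor ?? "unknown vendor";
    lines.push(
      `Warning: request appears blocked by a WAF (${vendor}); results may be unreliable.`
    );
  }
  return lines.join("\n");
}

const blockedScan: ScanResultSketch = {
  url: "https://example.com",
  grade: "F",
  wafBlocked: true,
  wafVendor: "Akamai",
};
console.log(formatTextSketch(blockedScan));
```

A consumer of the JSON or CSV output could branch on the same `wafBlocked` flag instead of parsing the warning text.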
Summary
This PR adds WAF (Web Application Firewall) detection capabilities to identify when HTTP responses are blocked by bot-protection or WAF services. The scanner now detects common WAF vendors and reports whether results may be unreliable due to blocking.
Key Changes
- New WAF detector module (src/waf-detector.ts): Implements heuristic detection for major WAF vendors including Akamai, Cloudflare, AWS WAF, Sucuri, Imperva, and Fastly. Uses a conservative approach that only flags a response when its status code is one of the typical blocking codes (401, 403, 406, 429, 503) AND a recognizable WAF signature is present in headers or server identification.
- Extended ScanResult type (src/types.ts): Added wafBlocked (boolean) and wafVendor (string | null) fields to track WAF detection results.
- Integrated WAF detection into checker (src/checker.ts): Calls detectWaf() during header analysis and includes results in the returned ScanResult.
- Updated formatters (src/formatters/table.ts and src/formatters/text.ts): Display WAF detection warnings when a block is detected, noting that results may be unreliable.
- Comprehensive test coverage (tests/waf-detector.test.ts): 12 test cases covering successful detection of each WAF vendor, negative results for non-blocking status codes, and edge cases.
Implementation Details
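The conservative rule described in the key changes (a blocking status code AND a vendor signature) could be sketched roughly as follows. This is not the code in src/waf-detector.ts: the function name `detectWafSketch`, the `{ blocked, vendor }` return shape, and the signature table are assumptions. Apart from the `x-blocked-by-waf` header, which the PR description attributes to Akamai, the listed headers are illustrative examples, and only three of the six vendors are shown.

```typescript
// Hypothetical sketch; names, return shape and signatures are assumptions,
// not the PR's actual implementation.
type HeaderMap = Record<string, string>;

interface WafDetectionSketch {
  blocked: boolean;
  vendor: string | null;
}

// Status codes the PR summary lists as typical for blocks.
const BLOCK_STATUSES = new Set<number>([401, 403, 406, 429, 503]);

// Illustrative signatures only; the real module's list is not shown here.
const SIGNATURES: Array<{ vendor: string; matches: (h: HeaderMap) => boolean }> = [
  {
    vendor: "Akamai",
    matches: (h) => "x-blocked-by-waf" in h || /akamai/i.test(h["server"] ?? ""),
  },
  {
    vendor: "Cloudflare",
    matches: (h) => "cf-ray" in h || /cloudflare/i.test(h["server"] ?? ""),
  },
  {
    vendor: "Sucuri",
    matches: (h) => "x-sucuri-id" in h || /sucuri/i.test(h["server"] ?? ""),
  },
];

function detectWafSketch(status: number, headers: HeaderMap): WafDetectionSketch {
  // Lower-case header names so lookups are case-insensitive.
  const h: HeaderMap = {};
  for (const name of Object.keys(headers)) {
    h[name.toLowerCase()] = headers[name];
  }
  // Conservative rule, part 1: the status must be a typical blocking code.
  if (!BLOCK_STATUSES.has(status)) {
    return { blocked: false, vendor: null };
  }
  // Conservative rule, part 2: a known vendor signature must also be present.
  const hit = SIGNATURES.find((s) => s.matches(h));
  return hit
    ? { blocked: true, vendor: hit.vendor }
    : { blocked: false, vendor: null };
}
```

Requiring both conditions means a plain 403 from an origin server, or a 200 served through a CDN that happens to set vendor headers, is not reported as a block.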
vendor: null
https://claude.ai/code/session_01KN9E9hQ4krRP16te1f6CSY