
Distinguish redirect loop and redirect chain timeout errors #114

Draft
heydemoura wants to merge 1 commit into v2 from add/long-chained-redirect-detection

@heydemoura
Contributor

Summary

Add detection for two distinct redirect failure modes that were previously conflated:

  1. Redirect Loop (ErrorRedirectLoop = 9): Detects when a redirect chain visits the same URL twice via URL deduplication. Maps to "redirect" status type.

  2. Redirect Chain Timeout (ErrorRedirectTimeout = 10): Detects when following a redirect chain takes longer than the site's timeout limit (the same limit that applies to a regular check). Maps to "intermittent" status type.

Rationale

Long-running redirect chains present an exploitable attack surface for Jetmon's veriflier:

  • CPU exhaustion: Each veriflier goroutine following a redirect chain can consume significant CPU time
  • Resource starvation: A malicious site serving slow or infinite redirects can stall veriflier's check queue
  • Observability: Previously, redirect-induced timeouts were indistinguishable from regular network timeouts, making it harder to diagnose redirect-specific issues

Implementation Details

New Error Constants

  • ErrorRedirectLoop = 9 — redirect chain cycle detected
  • ErrorRedirectTimeout = 10 — redirect chain exceeded timeout
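
As a rough illustration of how the two codes relate to the status types named in the Summary, here is a minimal sketch. The helper name `statusTypeFor` and the `"unknown"` fallback are assumptions made for this example; only the two constant values and the `"redirect"` / `"intermittent"` mappings come from this PR.

```go
package main

import "fmt"

// Values as stated in this PR.
const (
	ErrorRedirectLoop    = 9
	ErrorRedirectTimeout = 10
)

// statusTypeFor is a hypothetical helper (not the checker's real API) that
// maps the two new error codes to the status types named in the Summary.
func statusTypeFor(code int) string {
	switch code {
	case ErrorRedirectLoop:
		return "redirect"
	case ErrorRedirectTimeout:
		return "intermittent"
	default:
		// Placeholder: the mappings for other codes are outside this sketch.
		return "unknown"
	}
}

func main() {
	fmt.Println(statusTypeFor(ErrorRedirectLoop), statusTypeFor(ErrorRedirectTimeout))
}
```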

Detection Mechanism

  • Maintains a seenURLs map in the CheckRedirect hook
  • Tracks elapsed time since check start (checkStart := time.Now())
  • Returns sentinel errors (errRedirectLoop, errRedirectChainTimeout) that survive *url.Error wrapping

Error Classification

Checks for sentinel errors via errors.Is() before generic checks:

  1. Loop detection (specific)
  2. Chain timeout (specific)
  3. Context timeout (generic)
  4. Substring heuristics (fallback)

Time Limit

Uses the same timeout as regular checks: NET_COMMS_TIMEOUT or the per-site TimeoutSeconds. No new config key required.

Test Coverage

  • TestCheckRedirectLoop: Verifies A→B→A cycle detection
  • TestCheckRedirectChainTimeout: Verifies slow redirects timeout correctly
  • TestResultStatusType: Updated to test both new error codes

Metrics

New StatsD counter: scheduler.page.check.redirect_timeout.count

Both loop and timeout cases are tracked separately in the orchestrator for debugging.

Files Changed

  • internal/checker/checker.go — error constants, sentinel errors, detection logic, error classification
  • internal/orchestrator/orchestrator.go — tracking, metrics, logging
  • internal/checker/checker_test.go — new test cases

🤖 Generated with Claude Code

Implement detection for two distinct redirect failure modes:
- ErrorRedirectLoop (9): when a redirect chain visits the same URL twice
- ErrorRedirectTimeout (10): when following redirects exceeds the site timeout

Both use the same timeout limit as regular checks (NET_COMMS_TIMEOUT or
per-site TimeoutSeconds) to prevent veriflier from getting stuck on
exploitable long-redirect-chain attacks that would consume excessive CPU time.

The redirect chain timeout is identified separately from ErrorTimeout to
distinguish between regular connection timeouts and redirect-induced timeouts,
improving observability and incident diagnosis.

New metrics: scheduler.page.check.redirect_timeout.count

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@heydemoura heydemoura changed the base branch from master to v2 May 14, 2026 18:16
@heydemoura heydemoura self-assigned this May 14, 2026