Integrate NetworkConsistencyChecker into CrawlerService post-crawl pipeline #2167

@bokelley

Description
Context

The NetworkConsistencyChecker in @adcp/client (adcontextprotocol/adcp-client#539) provides pure validation logic that detects five failure modes for managed publisher networks: orphaned pointers, stale pointers, missing pointers, schema errors, and unreachable agent endpoints.

CrawlerService (server/src/crawler.ts) already follows an event-driven post-crawl pattern: crawl → snapshot → diff → emit events → notify. The consistency checker should slot into this pipeline so that network health issues are detected reactively during scheduled crawls, not via polling.

Related: #2140 (managed network deployment guide), adcontextprotocol/adcp-client#536 (consistency checker issue), adcontextprotocol/adcp-client#539 (consistency checker PR)

Proposed changes

1. Run consistency checks as a post-crawl step

In CrawlerService.crawlAllAgents(), after populateFederatedIndex() and produceEventsFromDiff():

  • Identify authoritative URLs seen during the crawl (from authoritative_location fields)
  • For each authoritative URL, run NetworkConsistencyChecker.check() with the domains that reference it
  • Diff the report against the previous stored report for that authoritative URL
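
The grouping step above can be sketched as follows. This is a hypothetical sketch, not the @adcp/client API: the `CrawledAgent` shape and the `groupByAuthoritativeUrl` helper are stand-ins, assuming crawl results expose a `domain` and an optional `authoritative_location` field.

```typescript
// Minimal stand-in for the crawled-agent shape (assumption, not the real type).
interface CrawledAgent {
  domain: string;
  authoritative_location?: string;
}

// Group crawled domains by the authoritative URL they point at, so each
// managed network gets exactly one NetworkConsistencyChecker.check() call
// per crawl rather than one per domain.
function groupByAuthoritativeUrl(agents: CrawledAgent[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const agent of agents) {
    if (!agent.authoritative_location) continue;
    const domains = groups.get(agent.authoritative_location) ?? [];
    domains.push(agent.domain);
    groups.set(agent.authoritative_location, domains);
  }
  return groups;
}
```

Each map entry then feeds one `check()` invocation with the referencing domains.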

2. Persist reports

Store consistency check reports in a new table (e.g. network_consistency_reports) with:

  • authoritative_url — the network's canonical file URL
  • report — the full JSON report from the checker
  • checked_at — timestamp
  • coverage — coverage percentage for quick queries
  • issue_count — total issues for quick queries
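
A rough sketch of how a report could be flattened into that row shape. The table name and columns come from the list above; the `ConsistencyReport` shape is an assumption about the checker's output, and the DDL is illustrative rather than a final migration.

```typescript
// Illustrative DDL for the proposed table (assumption: SQLite-style types).
const CREATE_REPORTS_TABLE = `
  CREATE TABLE IF NOT EXISTS network_consistency_reports (
    authoritative_url TEXT    NOT NULL,  -- the network's canonical file URL
    report            TEXT    NOT NULL,  -- full JSON report from the checker
    checked_at        TEXT    NOT NULL,  -- ISO-8601 timestamp
    coverage          REAL    NOT NULL,  -- denormalized for quick queries
    issue_count       INTEGER NOT NULL   -- denormalized for quick queries
  )`;

// Assumed report shape; the real type lives in @adcp/client.
interface ConsistencyReport {
  coverage: number;
  issues: { domain: string; type: string }[];
}

// Denormalize coverage and issue_count so dashboard queries never parse JSON.
function toRow(authoritativeUrl: string, report: ConsistencyReport, now: Date) {
  return {
    authoritative_url: authoritativeUrl,
    report: JSON.stringify(report),
    checked_at: now.toISOString(),
    coverage: report.coverage,
    issue_count: report.issues.length,
  };
}
```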

3. Emit events on state changes

Using the existing CatalogEventsDatabase pattern, emit events when:

  • A new failure is detected (orphaned pointer, stale pointer, missing pointer, schema error, agent down)
  • A previously failing domain recovers
  • Coverage drops below a threshold

Event types: network.orphaned_pointer, network.stale_pointer, network.missing_pointer, network.schema_error, network.agent_unreachable, network.domain_recovered, network.coverage_drop
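
The state-change diff might look like the sketch below. Shapes, the `diffReports` name, and the default coverage threshold are assumptions; the event type strings mirror the list above. The key property is that re-crawling a steady-state network emits nothing.

```typescript
// Issue types mirror the proposed event names (minus the network. prefix).
type IssueType =
  | "orphaned_pointer" | "stale_pointer" | "missing_pointer"
  | "schema_error" | "agent_unreachable";

interface Issue { domain: string; type: IssueType }
interface Report { coverage: number; issues: Issue[] }
interface NetworkEvent { type: string; domain?: string }

const issueKey = (i: Issue) => `${i.domain}:${i.type}`;

// Emit events only on transitions: new failures, recoveries, and the first
// crawl where coverage crosses below the threshold.
function diffReports(prev: Report | null, next: Report, coverageThreshold = 0.9): NetworkEvent[] {
  const events: NetworkEvent[] = [];
  const prevKeys = new Set((prev?.issues ?? []).map(issueKey));

  // New failures: present now, absent in the previous report.
  for (const issue of next.issues) {
    if (!prevKeys.has(issueKey(issue))) {
      events.push({ type: `network.${issue.type}`, domain: issue.domain });
    }
  }

  // Recoveries: a domain that was failing has no remaining issues.
  const prevDomains = new Set((prev?.issues ?? []).map((i) => i.domain));
  const nextDomains = new Set(next.issues.map((i) => i.domain));
  for (const domain of prevDomains) {
    if (!nextDomains.has(domain)) {
      events.push({ type: "network.domain_recovered", domain });
    }
  }

  // Coverage drop fires once, when coverage first falls below the threshold.
  if (next.coverage < coverageThreshold && (prev === null || prev.coverage >= coverageThreshold)) {
    events.push({ type: "network.coverage_drop" });
  }
  return events;
}
```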

4. Notify via existing infrastructure

Extend server/src/notifications/registry.ts to handle network consistency events:

  • Slack notifications on new failures (using existing REGISTRY_EDITS_CHANNEL_ID or a new channel)
  • Group notifications by authoritative URL (one message per network, not per domain)
  • Include actionable context: which domains, what failed, how to fix
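
The grouping requirement can be sketched as a pure formatting step in front of the Slack sender. The event shape and message wording are assumptions; the point is one message per authoritative URL.

```typescript
// Assumed event shape after enrichment with the owning network's URL.
interface NetworkEvent {
  authoritativeUrl: string;
  type: string;
  domain: string;
}

// Collapse a batch of events into one Slack message per network, so a
// network-wide outage produces a single grouped alert, not N per-domain pings.
function formatNetworkMessages(events: NetworkEvent[]): Map<string, string> {
  const byNetwork = new Map<string, NetworkEvent[]>();
  for (const e of events) {
    const bucket = byNetwork.get(e.authoritativeUrl) ?? [];
    bucket.push(e);
    byNetwork.set(e.authoritativeUrl, bucket);
  }
  const messages = new Map<string, string>();
  for (const [url, evts] of byNetwork) {
    const lines = evts.map((e) => `- ${e.domain}: ${e.type}`);
    messages.set(url, `Network ${url} has ${evts.length} new issue(s):\n${lines.join("\n")}`);
  }
  return messages;
}
```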

5. Expose via API

Add endpoints for the admin dashboard (future):

  • GET /api/network-health/:authoritativeUrl — latest report for a network
  • GET /api/network-health/:authoritativeUrl/history — report history for trends
  • GET /api/network-health — summary across all tracked networks
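
The core of the cross-network summary endpoint could be a pure function over the latest stored row per network, which keeps the route handler trivial. The `StoredReport` shape matches the proposed table columns; the summary field names are assumptions.

```typescript
// One row per network: the latest entry from network_consistency_reports.
interface StoredReport {
  authoritative_url: string;
  coverage: number;
  issue_count: number;
  checked_at: string;
}

// Aggregate the latest reports into the payload GET /api/network-health
// might return (field names are illustrative).
function summarize(latest: StoredReport[]) {
  return {
    networks: latest.length,
    healthy: latest.filter((r) => r.issue_count === 0).length,
    totalIssues: latest.reduce((n, r) => n + r.issue_count, 0),
    worstCoverage: latest.length ? Math.min(...latest.map((r) => r.coverage)) : 1,
  };
}
```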

Architecture boundary

The NetworkConsistencyChecker in @adcp/client returns a typed report. It does not persist, emit events, or notify. All of that belongs in CrawlerService and the server layer — same pattern as PropertyCrawler today.

PropertyCrawler.crawlAgents()           ← @adcp/client (returns data)
NetworkConsistencyChecker.check()       ← @adcp/client (returns report)
  → diff against previous report        ← CrawlerService (our code)
  → persist to DB                       ← CrawlerService
  → emit events on new failures         ← CrawlerService
  → notify via Slack                    ← notifications/registry.ts

Who this helps

  • Network operators get notified of deployment issues without checking a dashboard
  • The registry detects trust gaps (orphaned/stale pointers) automatically
  • Foundation for the network health dashboard (separate issue)
