feat: measure first-seen vs duplicate-arrival lag per source pair (dup_lag) by McSim85 · Pull Request #8 · quiknode-labs/shredstream-proxy

McSim85 · 2026-06-11T01:00:16Z

Is this change upstreamable?

Yes — broadly useful (generic per-source duplicate-lag measurement, no QN-specific naming in the proxy; same spirit as the receiver_stats labels offered upstream as Add listen_port and device tags to receiver_stats metric jito-labs/shredstream-proxy#70). After merge into main-qn, cherry-pick onto a branch off master for an upstream PR.

Summary

Phase 1 of the shred-provider freshness program — the missing number from the DoubleZero fleet analysis (open question #1): receiver_stats shows which source wins each dedup race; this measures by how much, continuously, per (winner, loser) source pair.

DupLagTracker: first-seen map = two time-rotated DashMap shards keyed by ahash of the raw packet bytes (same bytes the deduper hashes → winner/duplicate agree with dedup by construction; no new deps). Rotation bounds retention to [ttl, 2·ttl) with no sweeper thread — single-digit MB at the fleet's measured ≤35k pps with the 1s default TTL.
Lifetime-cumulative histogram per (winner_addr, winner_port, loser_addr, loser_port): 13 bands 50µs…500ms (the analysis' stated unknown range) + +Inf/count/sum.
Emitted each metrics interval as a Prometheus-shaped histogram: measurement shredstream_proxy-dup_lag_seconds with le tag (fractional seconds) + cumulative bucket field, plus count/sum → VM shredstream_proxy_dup_lag_seconds_bucket{le,…}/_count/_sum after Telegraf's -→_ sanitization. Dash-prefix kept deliberately: the QN Telegraf provider-map and Anza namedrop both key on shredstream_proxy-*.
Hook placement note (differs from the original sketch): the observe pass runs before dedup_packets_and_count_discards, because Packet::data() returns None once a packet is marked discard — and the duplicates are exactly what must be observed. The tracker decides winner/duplicate from its own map; dedup behavior and forwarding are untouched.
Hash-based sampling (--dup-lag-sample-rate, power-of-two, default 1 = all) so both arrivals of a sampled shred are observed. Off by default (--measure-dup-lag); disabled cost = one Option check per batch.

Validation

cargo test --all-features: 21/21 incl. 4 new tests (pair attribution + exact bucket placement, +Inf overflow, rotation eviction, seeded-hash sampling consistency).
cargo clippy --all-features --all-targets --tests -- -D warnings: clean.
Memory math: ≤35k pps × 2×1s ≈ ≤70k entries ≈ single-digit MB; histogram keys are sparse real pairs only.

Rollout (phase-gated, after merge + release-please tag)

infra-k8s-apps: fix the VM-output namepass entry shredstream_proxy-dup_lag → shredstream_proxy-dup_lag_seconds (the provider-map/namedrop globs already match).
role_solana: add --measure-dup-lag to the unit template (canary NRT first).
Verify in VM: histogram_quantile(0.95, sum by (le, winner_provider, loser_provider) (rate(shredstream_proxy_dup_lag_seconds_bucket[$__rate_interval]))) in plausible µs–ms range; _count rate ≈ connection_metrics.duplicate rate.

Note

Medium Risk
Adds per-packet work and bounded in-memory state on the forwarder hot path when enabled; default-off limits production impact.

Overview
Adds optional duplicate-arrival lag measurement so operators can see how much earlier one shred source wins vs another, not just who wins dedup races.

When --measure-dup-lag is on, a new DupLagTracker hashes raw packet bytes, keeps a TTL-bounded two-shard first-seen map, and records lag into lifetime-cumulative histograms keyed by (winner_addr, winner_port, loser_addr, loser_port). Observations run before dedup_packets_and_count_discards so duplicate payloads are still visible via Packet::data(). Forwarding and dedup behavior are unchanged; off by default the hot path only checks metrics.dup_lag.

CLI adds --measure-dup-lag, --dup-lag-ttl-ms (default 1s), and --dup-lag-sample-rate (power-of-two, hash-based 1-in-N). Each metrics interval emits Prometheus-style shredstream_proxy-dup_lag_seconds datapoints (cumulative buckets, count, sum) with winner/loser device tags; dup-lag histograms are not cleared on ShredMetrics::reset(). Seven unit tests cover pair attribution, overflow, rotation/eviction, idle gaps, and sampling.

^{Reviewed by Cursor Bugbot for commit d9e75a6. Configure here.}

…p_lag) Implements the missing latency dimension of the shred-provider value analysis ("DoubleZero Multicast Value", open question #1): receiver_stats already shows WHICH source wins each dedup race; this measures BY HOW MUCH, continuously, per (winner, loser) source pair. Design (proxy side; provider naming happens in the QN Telegraf pipeline): - DupLagTracker: first-seen map as two time-rotated DashMap shards keyed by ahash of the raw packet bytes (the same bytes the deduper hashes, so winner/duplicate agree with dedup by construction; ahash is already a dependency). Value = (Instant, source addr, listen port). Rotation drops the older shard, bounding retention to [ttl, 2*ttl) with no sweeper thread; at the fleet's measured <=35k pps and the 1s default TTL that is single-digit MB. - Lag histogram: lifetime-cumulative atomics per (winner_addr, winner_port, loser_addr, loser_port): 13 bands 50us..500ms + overflow + count + sum_micros. Bounds follow the fleet analysis' stated unknown range ("could be 50us, could be 500ms"). - Emission each metrics interval as Prometheus-histogram shape: measurement shredstream_proxy-dup_lag_seconds with le tag (fractional seconds, "+Inf") and cumulative `bucket` field, plus count/sum (seconds) fields -> VM series shredstream_proxy_dup_lag_seconds_bucket/_count/_sum after Telegraf's "-"->"_" sanitization. Cumulative counters (never reset; excluded from reset()) so rate()/histogram_quantile() aggregate across hosts and time. - Hook runs BEFORE dedup_packets_and_count_discards: Packet::data() returns None once a packet is marked discard, and the duplicates are exactly what must be observed. The tracker decides winner/duplicate from its own map, so dedup behavior and forwarding are untouched. - Hash-based sampling (--dup-lag-sample-rate, power of two, default 1): hash & mask selection keeps both arrivals of a sampled shred observed, which random sampling would break. - Off by default (--measure-dup-lag); disabled cost is one Option check per batch. New args (CommonArgs): --measure-dup-lag, --dup-lag-ttl-ms (default 1000), --dup-lag-sample-rate (validated power-of-two at startup). Tests: winner/loser pair attribution + exact bucket placement (2.4ms -> (1ms, 2.5ms] band), +Inf overflow, rotation eviction (expired first-seen records nothing), and seeded-hash sampling consistency (sampled shreds record exactly once, unsampled never). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Bugbot review follow-ups: 1. Idle-gap staleness (real): maybe_rotate performed one shard clear per call while advancing the deadline by the whole elapsed gap, so after a pause > 2*ttl the untouched shard kept entries far older than the documented [ttl, 2*ttl) window; a late re-arrival could then record a multi-second "lag" into sum/+Inf. The rotation now also clears the current shard when the gap overshot a full extra period. Regression test: test_dup_lag_idle_gap_drops_both_shards. 2. Concurrent first-seen TOCTOU (the real bug behind the "same batch inverts winner" report): two threads observing the same shred could both miss the lookup and double-insert, dropping the race record and crowning the later arrival. The current-shard check-or-insert now uses DashMap's atomic entry(); the older shard only ever loses entries (inserts target cur), so its plain read stays race-free. The literal in-batch claim does not apply: recvmmsg batches preserve kernel arrival order, so the first-iterated packet is the true first arrival. cargo test 22/22; clippy --all-targets -D warnings clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

McSim85 · 2026-06-11T01:31:38Z

bugbot run

Bugbot round 2: with --dup-lag-sample-rate > 1, observe() returned before maybe_rotate for unsampled packets, making rotation cadence depend on sampled arrivals. The reported consequence (multi-second stale lag) was already prevented by the round-1 overshoot fix — the call that could match a stale entry rotates before its lookup, and a > ttl overshoot clears both shards — but the retention invariant should not be conditional on the sampling rate. maybe_rotate now runs before the sample check (one relaxed atomic load per packet when not due). New test: test_dup_lag_rotation_runs_on_unsampled_traffic (an unsampled arrival after a 5-period gap clears both shards). cargo test 23/23; clippy -D warnings clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

McSim85 · 2026-06-11T01:36:49Z

bugbot run

Bugbot round 3 (valid): a stale `cur` read straddling a rotation could insert the same payload hash into both shards — one observer targets the just-demoted shard, another the new one — forking first-seen. The atomic entry() from round 1 only serializes same-shard inserts. Practical impact was small (the racers are near-simultaneous arrivals whose true order is ambiguous, and the lag error is bounded by their inter-arrival gap), but the invariant is now structural instead of probabilistic: - rotation_lock: RwLock<()>. Observers take a read pin for the lookup+insert of sampled packets (~tens of ns uncontended); the CAS winner takes write for the flip+clear, once per ttl. A flip can no longer interleave with a pinned lookup+insert, so exactly one shard can own a given hash. No nesting (maybe_rotate's write always precedes the read pin on a thread; record_lag touches only the separate hist map) — no deadlock cycle. cargo test 23/23; clippy -D warnings clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

McSim85 · 2026-06-11T01:42:52Z

bugbot run

cursor

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

^{Reviewed by Cursor Bugbot for commit d9e75a6. Configure here.}

McSim85 · 2026-06-11T01:49:52Z

bugbot run

cursor

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

^{Reviewed by Cursor Bugbot for commit d9e75a6. Configure here.}

McSim85 · 2026-06-12T12:04:16Z

bugbot run verbose=true

cursor · 2026-06-12T12:04:18Z

Bugbot request id: serverGenReqId_9a088428-af80-4675-a900-cc726991cff8

cursor

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

^{Reviewed by Cursor Bugbot for commit d9e75a6. Configure here.}

cursor Bot reviewed Jun 11, 2026

View reviewed changes

Comment thread proxy/src/forwarder.rs

Comment thread proxy/src/forwarder.rs

cursor Bot reviewed Jun 11, 2026

View reviewed changes

Comment thread proxy/src/forwarder.rs

cursor Bot reviewed Jun 11, 2026

View reviewed changes

Comment thread proxy/src/forwarder.rs

cursor Bot reviewed Jun 11, 2026

View reviewed changes

cursor Bot reviewed Jun 12, 2026

View reviewed changes

cp0k approved these changes Jun 12, 2026

View reviewed changes

McSim85 requested a review from cp0k June 12, 2026 13:06

cp0k approved these changes Jun 12, 2026

View reviewed changes

McSim85 requested a review from cp0k June 12, 2026 13:09

McSim85 merged commit eda6120 into main-qn Jun 12, 2026
3 checks passed

McSim85 deleted the feature/dup-lag-metric branch June 12, 2026 13:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: measure first-seen vs duplicate-arrival lag per source pair (dup_lag)#8

feat: measure first-seen vs duplicate-arrival lag per source pair (dup_lag)#8
McSim85 merged 4 commits into
main-qnfrom
feature/dup-lag-metric

McSim85 commented Jun 11, 2026 •

edited by cursor Bot

Loading

Uh oh!

Uh oh!

Uh oh!

McSim85 commented Jun 11, 2026

Uh oh!

Uh oh!

McSim85 commented Jun 11, 2026

Uh oh!

Uh oh!

McSim85 commented Jun 11, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

McSim85 commented Jun 11, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

McSim85 commented Jun 12, 2026

Uh oh!

cursor Bot commented Jun 12, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

McSim85 commented Jun 11, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Is this change upstreamable?

Summary

Validation

Rollout (phase-gated, after merge + release-please tag)

Uh oh!

Uh oh!

Uh oh!

McSim85 commented Jun 11, 2026

Uh oh!

Uh oh!

McSim85 commented Jun 11, 2026

Uh oh!

Uh oh!

McSim85 commented Jun 11, 2026

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

McSim85 commented Jun 11, 2026

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

McSim85 commented Jun 12, 2026

Uh oh!

cursor Bot commented Jun 12, 2026

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

McSim85 commented Jun 11, 2026 •

edited by cursor Bot

Loading