metrics: emit goodput track as a point attribute so the bandit evaluator can slice it#675
Merged
Merged
Conversation
…tor can slice it proxy.session.goodput tagged track only via the OTEL resource (semconv.ProxyTrackKey = "proxy.track"). SigNoz's metrics pipeline doesn't expose resource attributes as queryable labels, so the bandit experiment evaluator — which groups/filters goodput by the point attribute "track" (matching lantern-box) — saw zero samples for every http-proxy track: a busy http-proxy incumbent returned 0 for track='<name>' while a sing-box track returned data. Every http-proxy challenger/control arm therefore read 0 goodput, so no http-proxy experiment could ever reach a verdict. Emit track as a point attribute keyed "track" on the goodput histogram (threaded through NewDefault from the proxy's -track flag). The resource keeps proxy.track for everything else. cloud.region is intentionally left a resource attr: the evaluator's strata are (track, country) and a challenger is pinned to one DC, so region would add cardinality without decision value.
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthrough
ChangesTrack attribute in goodput instrumentation
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Possibly related PRs
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
proxy.session.goodputnow carriestrackas a point (datapoint) attribute keyedtrack, in addition to the existingproxy.trackresource attribute.Why
The goodput histogram (#674) tagged
trackonly via the OTEL resource (semconv.ProxyTrackKey = "proxy.track"). SigNoz's metrics pipeline doesn't expose resource attributes as queryable metric labels, and the bandit experiment evaluator (lantern-cloudcmd/api/experiment/signoz.go) groups and filters goodput by the point attribute literally namedtrack— the way lantern-box emits it.Result: the evaluator's
track IN [challenger, control]query matched sing-box/lantern-box tracks but never http-proxy tracks. Verified on prod:track = 'hysteria2-oci-free-vps'(sing-box) → goodput datatrack = 'blastoise-vps'(busy http-proxy incumbent, 21 routes, 24h) → 0; all http-proxy goodput sat in an untaggedtrack=""bucket.So every http-proxy challenger and http-proxy control arm read 0 goodput, and no http-proxy-backed bandit experiment could ever reach a verdict (it would hold until the 14-day max-gathering cutoff and retire inconclusive). This blocks the eng#3651 market-pivot experiments, whose challengers are http-proxy (
tls_*) tracks.Change
instrument.SessionGoodputemitsattribute.String("track", ins.track)as a point attribute alongsidegeo.country.iso_codeandnetwork.io.direction.trackis threaded intodefaultInstrumentviaNewDefaultfrom the proxy's existing-trackflag (p.Track).proxy.trackfor all other signals — unchanged.cloud.regionis intentionally left a resource attribute: the evaluator's strata are(track, country)and a challenger is pinned to one DC, so region as a point attr would add cardinality with no decision value.Tests
go build ./instrument/...+go vetclean;go test ./instrument/ -run TestSessionGoodputpasses. ExtendedTestSessionGoodputto assert thetrackpoint attribute is present and equals the configured track. (Repo-widego build ./...fails only on the pre-existinggo-libutpCGO/compiler incompatibility, unrelated to this change.)Rollout note
After merge: rebuild the http-proxy image and bump
bandit_vps_http_proxy_default_short_tagin lantern-cloud so new VPS provisions emit track-tagged goodput. Existing experiment challengers were provisioned on the prior image; they'll pick up the new image on VPS autoreplace/recreate (or just let the current 4 retire and let post-rollout experiments be the measurable ones).Summary by CodeRabbit
New Features
Tests