
fix(telemetry): three delivery bugs + Realtime model in feature_used #167

Merged
FrancescoRosciano merged 1 commit into main from fix/telemetry-hardening
Jun 11, 2026

@FrancescoRosciano

Collaborator

Summary

E2E testing of the published 0.6.6 packages against a real local collector found that several telemetry events are recorded but never delivered. This PR lands three root-cause fixes (both SDKs, per sdk-parity) and closes one small capture gap:

Fixed

  1. GC loss: TelemetryClient(...).record(...) without a held reference (the CLI's cli_command pattern) was garbage-collected along with its buffered events before the atexit/beforeExit flush ran. The registry is a WeakSet by design; clients now hold a strong module-level reference (_PENDING_FLUSH/pendingFlush) from the first buffered event until the buffer drains.
  2. In-flight close loss: aclose()/close() flushed an already-empty buffer while the POST started by record() was still in flight, so a prompt shutdown killed it mid-air. Close now awaits the in-flight flush first.
  3. In-flight stranding: events recorded while a flush POST was in flight sat in the buffer with no flush scheduled (the constructor's first_run POST shadowed sdk_initialized and all agent-time events). A completed flush now chains another when the buffer is non-empty.

Added

  • Realtime model in feature_used: Realtime agents now report llm_model: "openai-gpt-realtime-2" etc. via the existing sanitized llm_model dimension (no schema bump; custom names still collapse to openai-other). The dedupe key includes the model. Docs table updated.
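
A small sketch of how the sanitizing and dedupe behavior above might fit together. It is an illustration under assumptions: the allowlist contents, the function names, and the "realtime" feature label are hypothetical, and the real allowlist lives in the SDK. Only the llm_model dimension and the openai-gpt-realtime-2 and openai-other values come from this PR.

```python
# Illustrative one-entry allowlist (assumption); the SDK's real list is longer.
_ALLOWED_LLM_MODELS = frozenset({"openai-gpt-realtime-2"})
_FALLBACK = "openai-other"

_seen = set()   # dedupe keys already recorded
recorded = []   # stand-in for the telemetry buffer


def sanitize_llm_model(name: str) -> str:
    """Pass allowlisted names through; collapse anything else."""
    return name if name in _ALLOWED_LLM_MODELS else _FALLBACK


def record_feature_used(feature: str, llm_model: str) -> bool:
    """Record at most once per (feature, sanitized model) pair."""
    model = sanitize_llm_model(llm_model)
    key = (feature, model)  # the model is part of the dedupe key
    if key in _seen:
        return False
    _seen.add(key)
    recorded.append({"event": "feature_used", "feature": feature, "llm_model": model})
    return True


record_feature_used("realtime", "openai-gpt-realtime-2")  # recorded
record_feature_used("realtime", "openai-gpt-realtime-2")  # deduped
record_feature_used("realtime", "my-custom-model")        # recorded as openai-other
```

Because the key is built from the sanitized name, two different custom model names dedupe against each other, while two agents on distinct allowlisted models both record.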

Test plan

  • TDD: each bug reproduced by a failing test first (RED→GREEN), in both suites
  • scripts/pr-validate.sh: Python tests (full suite) + security tests + TS lint/tests/build — all green
  • Python 51 passed, TS 39 passed (telemetry suites)

Notes

  • getpatter hermes / getpatter openclaw CLI commands never emitted cli_command (only dashboard/eval/other do) — left as-is, worth a follow-up.
  • Collector deploy (telemetry.getpatter.com) tracked separately — DNS still unattached.

Delivery fixes (both SDKs), each found by E2E-testing the published 0.6.6
artifacts against a real local collector:

1. GC loss: a TelemetryClient constructed fire-and-forget (the CLI's
   cli_command pattern) was garbage-collected with its buffered events
   before the exit flush ran — the registry is a WeakSet by design. Clients
   now hold a strong module-level reference (_PENDING_FLUSH / pendingFlush)
   from first buffered event until the buffer drains.

2. In-flight close loss: aclose()/close() flushed an already-empty buffer
   while the POST started by record() was still in flight, killing it
   mid-air on prompt shutdown. Close now awaits the in-flight flush first.

3. In-flight stranding: events recorded while a flush POST was in flight
   stayed buffered with no flush scheduled (record() saw the live task and
   skipped). A completed flush now chains another when the buffer is
   non-empty — constructor events can no longer shadow agent-time events.
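
Fixes 2 and 3 can be sketched together with asyncio. This is a simplified illustration, not the SDK implementation: the constructor signature, the _post callable, and the attribute names are hypothetical, and chaining is modeled here as a loop inside one flush task rather than a newly scheduled flush.

```python
import asyncio


class TelemetryClient:
    def __init__(self, post):
        self._post = post        # hypothetical async callable taking a list of events
        self._buffer = []
        self._flush_task = None

    def record(self, event):
        self._buffer.append(event)
        # A live flush task means a POST is in flight, so no new flush is scheduled.
        if self._flush_task is None or self._flush_task.done():
            self._flush_task = asyncio.create_task(self._flush())

    async def _flush(self):
        # Fix 3: keep going while the buffer is non-empty, so events recorded
        # during an in-flight POST are sent in a follow-up POST.
        while self._buffer:
            batch, self._buffer = self._buffer, []
            await self._post(batch)

    async def aclose(self):
        # Fix 2: await the in-flight flush before the final drain.
        if self._flush_task is not None:
            await self._flush_task
        await self._flush()


async def main():
    delivered = []

    async def slow_post(batch):
        await asyncio.sleep(0.01)  # simulate network latency
        delivered.append(list(batch))

    client = TelemetryClient(slow_post)
    client.record("first_run")        # starts the first POST
    await asyncio.sleep(0)            # let the flush task begin its POST
    client.record("sdk_initialized")  # arrives while that POST is in flight
    await client.aclose()             # prompt shutdown
    return delivered


delivered = asyncio.run(main())
print(delivered)  # [['first_run'], ['sdk_initialized']]
```

If _flush sent a single batch and returned, sdk_initialized would sit in the buffer until some later record() call; if aclose() skipped the await on the task, the first POST could be abandoned when the event loop shuts down.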

Added: Realtime agents now report their model variant in feature_used via
the existing sanitized llm_model dimension (openai-gpt-realtime-2, ...;
custom names still collapse to openai-other). The dedupe key includes the
model so two agents on different Realtime models both record. No schema
change — llm_model was already an allowlisted string dimension.

Tests: regression tests for all three delivery bugs + realtime-model
capture, in both suites (Python 51 passed, TS 39 passed).
@mintlify

mintlify Bot commented Jun 11, 2026


Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
patter-06b046ce 🟢 Ready View Preview Jun 11, 2026, 1:42 AM


@FrancescoRosciano FrancescoRosciano merged commit b94f105 into main Jun 11, 2026
10 checks passed
@FrancescoRosciano FrancescoRosciano mentioned this pull request Jun 11, 2026
3 tasks
FrancescoRosciano added a commit that referenced this pull request Jun 11, 2026
Bump getpatter to 0.6.7 across all four version files and roll the CHANGELOG
"## Unreleased" block into "## 0.6.7 (2026-06-10)".

Release contents (this PR): the telemetry delivery fix wave (#167) — three
root-cause fixes so fire-and-forget events actually arrive (WeakSet GC loss
of unreferenced clients, close() killing the in-flight POST, in-flight flush
stranding subsequent events) — plus the Realtime model variant in
feature_used (llm_model: openai-gpt-realtime-2 / -mini / ...).
@FrancescoRosciano FrancescoRosciano deleted the fix/telemetry-hardening branch June 11, 2026 02:06