Skip to content

Proj/analytics system tool mode#726

Open
bao-byterover wants to merge 148 commits into
mainfrom
proj/analytics-system-tool-mode
Open

Proj/analytics system tool mode#726
bao-byterover wants to merge 148 commits into
mainfrom
proj/analytics-system-tool-mode

Conversation

@bao-byterover
Copy link
Copy Markdown
Collaborator

Summary

  • Problem: ByteRover CLI had no first-party analytics pipeline. Daemon
    lifecycle, CLI invocations, MCP sessions, curate/query funnels,
    HITL review, settings, version-control, and onboarding signals were
    all invisible to product / on-call.
  • Why it matters: without a funnel for usage and a failure-mode signal
    for the daemon, we cannot prioritise the next round of polish,
    catch silent regressions in agent flows, or measure feature
    adoption per release.
  • What changed: shipped the full analytics-system project, M1 through
    M15.8. New brv analytics enable | disable | status commands, a
    WebUI Configuration > Analytics panel, durable JSONL queue, batched
    HTTP sender with exponential backoff and identity-transition
    flushes, a 47-event catalogue with strict Zod schemas, a privacy
    fixture that statically rejects forbidden field names, daemon-side
    lifecycle hooks for tasks / curate / query, and producer-side
    wiring for daemon_start / cli_invocation / mcp_session_start /
    mcp_session_ended / mcp_tool_called.
  • What did NOT change (scope boundary): no analytics emit code lands
    under src/agent/ (layering rule preserved). No producer for
    onboarding_auto_setup_started / onboarding_completed yet,
    pending an agent-team signal (spec at
    agent-tmp/m15-8-onboarding-signal-spec.md). No space_switched
    producer (waits on the snapshot-removal project). The cli_metadata
    outgoing helper currently covers only 6 of around 80 oclif
    commands; the daemon middleware emits for all that participate.

Type of change

  • Bug fix
  • New feature
  • Refactor (no behavior change)
  • Documentation
  • Test
  • Chore (build, dependencies, CI)

Scope (select all touched areas)

  • TUI / REPL
  • Agent / Tools
  • LLM Providers
  • Server / Daemon
  • Shared (constants, types, transport events)
  • CLI Commands (oclif)
  • Hub / Connectors
  • Cloud Sync
  • CI/CD / Infra

Also touched: WebUI Configuration > Analytics panel + analytics API
(not in the template list).

Linked issues

  • Closes ENG-3001 (M15.8 deferred-events wiring)
  • Closes 38 prior sub-tickets across M1 through M15.7 of the
    analytics-system project. Representative anchors: ENG-2611 (analytics
    GlobalConfig field), ENG-2628 (analytics:track transport event),
    ENG-2686 (per-event schema catalogue), ENG-2649 (e2e harness),
    ENG-2772 (lifecycle-hook frontmatter harvest), ENG-2807 (M13.3
    cli_metadata sweep batches 1+2), ENG-2961 (M15.1 foundation),
    ENG-2967 (M15.4 HITL + settings + reset), ENG-2963 (M15.5 WebUI
    session lifecycle), ENG-2964 (M15.6 tool-mode analytics).
  • Related: agent-team follow-up for onboarding_auto_setup_started /
    onboarding_completed signal, spec at
    agent-tmp/m15-8-onboarding-signal-spec.md. Not filed yet.

Root cause (bug fixes only, otherwise write N/A)

  • Root cause: N/A. This is a feature project, not a bug fix.
  • Why this was not caught earlier: N/A.

Test plan

  • Coverage added:
    • Unit test
    • Integration test
    • Manual verification only (the e2e harness runs against
      dev-beta; manual smoke validates real-backend send + sent
      status in the JSONL queue)
  • Test file(s): 88 test files added or modified under
    test/unit/server/infra/analytics/,
    test/unit/shared/analytics/,
    test/unit/infra/process/,
    test/unit/infra/client/,
    test/integration/infra/process/analytics-hook-lifecycle-wiring.test.ts,
    test/e2e/analytics/dev-beta.e2e.ts, plus webui +
    oclif-command tests.
  • Key scenario(s) covered: JSONL queue durability with crash mid-write,
    bounded-queue drop-oldest semantics, HTTP sender 2xx + 5xx + 400
    • offline, exponential backoff with reachability counter,
      auth-transition force-flush, batched flush (30s tick + 20-event
      threshold + shutdown drain), wire-side per-event Zod validation,
      privacy fixture (no forbidden field name across any event),
      daemon lifecycle hook drains in-flight per-op processing
      before terminal emit, WebUI session start + end with reconnect
      orphan-end, MCP session start + end with handshake-timing race,
      mcp_tool_called for success + error + cancellation,
      cli_invocation middleware fires on cli_metadata-bearing requests
      and bails on malformed metadata, idempotent middleware attach.

User-visible changes

  • New commands: brv analytics enable, brv analytics disable,
    brv analytics status (with --format json for scripting).
  • New WebUI panel under Configuration > General > Analytics:
    enable / disable toggle with first-run disclosure dialogue,
    status line surfacing queue depth + last successful flush +
    backoff state.
  • First-run disclosure: clear opt-in prompt. Default is OFF; no data
    is collected unless the user accepts.
  • brv login and 5 other oclif commands now send cli_metadata so
    the daemon can emit cli_invocation per command. No prompt
    difference to the user.
  • No change to existing commands' output, exit codes, or behavior
    when analytics is disabled.

Evidence

  • Failing test before, passing after: see per-milestone commits;
    M15.5 added 75 ClientManager session tests; M15.6 added the
    lifecycle-wiring integration test; M15.8 added 31 new cases
    (client-manager-mcp-analytics + analytics-hook-mcp-tool-called +
    task-router-client-identity-snapshot + cli-invocation-middleware).
  • npm test on the merge head: 9726 passing, 16 pending, 0 failing (run during M15.8 verification).
  • git grep "from.*shared/analytics/emit" src/oclif src/agent src/server/infra/mcp returns empty (daemon-only instrumentation
    rule holds).
  • Privacy fixture: green. Every event schema in ALL_EVENT_SCHEMAS
    rejects every name on FORBIDDEN_FIELD_NAMES.

Checklist

  • Tests added or updated and passing (npm test): 9726 / 16
    pending / 0 failing as of the M15.8 verification step.
  • Lint passes (npm run lint): 0 errors, 279 warnings
    (warnings are pre-existing repo noise, none introduced).
  • Type check passes (npm run typecheck): root TS check passes;
    the WebUI sub-project check (tsc -p src/webui/tsconfig.json)
    reports 6 errors in 4 files about TopicViewerProps missing
    breadcrumb / index / related. These pre-date this branch
    (confirmed by git stash rerun on base) and originate from a
    separate adopt-callback-based-topic-viewer commit waiting on a
    @campfirein/byterover-packages bump. Flagged under Risks
    below; not a regression introduced by this PR.
  • Build succeeds (npm run build): not re-verified in this PR
    session because build:ui is gated on the same WebUI
    typecheck above. Verify after the topic-viewer package bump
    lands.
  • Commits follow Conventional Commits format: every commit on the
    branch uses feat: [ENG-XXXX] ... or fix: [ENG-XXXX] ....
  • Documentation updated (if applicable): CLAUDE.md was kept in
    sync as milestones landed (analytics commands, env vars,
    e2e instructions, settings keys, billing-line). Spot-check
    before merging.
  • No breaking changes (or clearly documented above): opt-in,
    default-off feature. No public command shape changed. The
    IAnalyticsClient interface is daemon-internal.
  • Branch is up to date with main: last merge from main was
    commit 0ef90884f Merge branch 'main' into proj/analytics-system-tool-mode. Verify zero diff vs.
    origin/main before pressing merge.

Risks and mitigations

  • Risk: WebUI typecheck errors on TopicViewerProps block the husky
    pre-push hook and any CI step that runs the full npm run typecheck. Six errors across context-detail-panel.tsx,
    root-index-detail.tsx, curate-tool-mode-sections.tsx,
    task-detail-sections.tsx.

    • Mitigation: not introduced by this PR. Resolves once
      @campfirein/byterover-packages is bumped to the version that
      exports the callback-based TopicViewer API (or build with
      BRV_UI_SOURCE=submodule after git submodule update --init --recursive). Coordinate with the shared-UI submodule owner
      before merging.
  • Risk: large surface area (253 files, around 26k LoC, 108 commits)
    raises review burden.

    • Mitigation: the work landed in 39 small per-milestone PRs into
      the project branch (each independently reviewed and merged).
      This PR is the integration-to-main step. Skim the commit list
      rather than the line-by-line diff.
  • Risk: analytics enabled in error could ship PII.

    • Mitigation: opt-in default-off, a runtime FORBIDDEN_FIELD_NAMES
      redactor with a fixture that fails CI if any event schema
      declares a name on the list, file paths are stripped to
      project-relative or sentinel'd, and the disclosure UX makes the
      opt-in explicit in both CLI and WebUI.
  • Risk: backend outage produces unbounded queue growth.

    • Mitigation: bounded queue with drop-oldest semantics, a JSONL
      cap (JsonlCapFullError), exponential backoff with a
      reachability counter, and brv analytics status surfaces the
      operational state for on-call.
  • Risk: agent processes acquire stale analytics state mid-task.

    • Mitigation: per-agent settings snapshot at fork time
      (agent-settings-snapshot.ts); per-task identity snapshot on
      TaskInfo (clientType + clientName) survives mid-task
      client disconnect.
  • Risk: M15.8 onboarding events stay empty until the agent team
    ships a skill:invoked / skill:completed signal.

    • Mitigation: gap is disclosed in this PR and in the agent-team
      spec at agent-tmp/m15-8-onboarding-signal-spec.md. Backend
      funnel queries should not yet reference these events.

Adds a typed analytics: boolean field to the GlobalConfig domain entity
with safe migration for legacy configs (defaults to false when absent
to preserve the opt-in promise across upgrades). Widens toJson() return
to a typed GlobalConfigJson shape. Type guard rejects non-boolean values
when the analytics key is structurally present.

Tests moved to test/unit/server/core/domain/entities/global-config.test.ts
and extended with the seven scenarios required by the ticket.
Adds the read-only brv analytics status oclif command, which prints
whether CLI analytics is enabled or disabled. Output is "Analytics:
enabled" or "Analytics: disabled" by default; --format json emits the
shape {"analytics": "enabled" | "disabled"} via the repo's standard
writeJsonResponse envelope.

Routing follows login / logout / locations: oclif uses withDaemonRetry
to emit a globalConfig:get transport event (auto-spawns the daemon).
The daemon-side GlobalConfigHandler re-reads the on-disk config every
call (no in-memory cache); on a fresh install it seeds a stable
deviceId via GlobalConfig.create(randomUUID()) so device identity is
available from the first read.

Help text describes what telemetry is collected and links to a
placeholder privacy policy URL pending M1.5.
Adds two oclif commands that toggle the analytics flag in GlobalConfig:
brv analytics enable and brv analytics disable. Output is "Analytics
enabled" / "Analytics disabled" on a state change, or "Analytics
already enabled" / "Analytics already disabled" when the flag was
already in the target state (idempotent, exits 0).

Routing matches M1.2: oclif uses withDaemonRetry to emit the new
globalConfig:setAnalytics transport event. The daemon-side
GlobalConfigHandler gains a SET_ANALYTICS listener that reads the
current config (or seeds a fresh one with a stable deviceId if absent),
applies the new value via GlobalConfig.fromJson round-trip (the entity
is immutable with a private constructor), and writes back. The
idempotent path skips the file write entirely when previous state
matches the requested value, so a disable on a fresh install does not
create a stub config file.

M1.3 ships without the disclosure prompt - the enable command flips
the bit immediately. M1.4 (ENG-2618) follows with the disclosure UX
wrapper. Reviewers should expect that follow-up.

Adds 13 new tests covering all 7 ticket Test plan scenarios:
4 deterministic command-level tests for enable/disable success and
idempotent paths, 1 transport-contract assertion per command, plus
3 handler-level integration tests for SET-then-GET state coherence
and concurrent-write last-writer-wins semantics.
Wraps M1.3's bit-flip path so brv analytics enable shows a disclosure
that the user must accept before the flag flips. CI / non-interactive
use is supported via --yes.

Flow: the command first reads current state via globalConfig:get. If
analytics is already enabled, it prints "Analytics already enabled"
and exits 0 with no prompt and no write. Otherwise it loads the
disclosure markdown, prints it, and either accepts implicitly (--yes),
prompts via @inquirer/prompts confirm (interactive TTY), or refuses
with a clear error (non-TTY without --yes). On accept, the existing
globalConfig:setAnalytics path flips the bit; on reject, "Analytics
not enabled" prints and the command exits 0 without writing.

Disclosure copy lives in src/server/templates/sections/analytics-disclosure.md
with lorem ipsum bodies per hoang's request in the Linear ticket.
Section headers (what / surfaces / where / cross-device alias / how to
disable / privacy policy) are load-bearing for tests and remain stable;
PM and legal will replace bodies before the M1 release. Privacy policy
URL lives in src/shared/constants/privacy.ts as a placeholder pending
the M1.5 docs page; reviewers should update it once the canonical URL
is finalized.

A TODO(M2) marker is preserved in enable.ts so that when IAnalyticsClient
lands the first event sent after enable will be analytics_enabled
itself (industry practice).

Adds 8 tests covering all 7 ticket scenarios: interactive accept,
interactive reject, --yes bypass, already-enabled short-circuit,
non-TTY refusal with non-zero exit, disclosure section coverage, and
privacy URL constant shape.

PM/legal sign-off and the milestone disclosure-file link are manual
steps to record at PR / milestone-update time.
Establishes the consumer-facing contract for the daemon-side analytics
machinery. Future M2 sub-tickets (queue, resolvers, real client,
transport handler, IPC bridge) build against this interface; the no-op
serves as the daemon's default before M2.5 wires the real one and as a
stub for unit tests.

Adds the domain types in src/server/core/domain/analytics/:
  - AnalyticsEvent (internal camelCase: name, properties, timestamp)
  - Identity (wire snake_case: user_id?, device_id, email?, name?)
  - AnalyticsEventWithIdentity (Identity-stamped event)
  - AnalyticsBatchJson (wire shape: schema_version: 1, events)
  - AnalyticsBatch class with private constructor + static create +
    static fromJson (graceful failure, returns undefined on malformed
    input) + toJson, mirroring the GlobalConfig pattern

Adds IAnalyticsClient { track; flush } in core/interfaces/analytics/
and NoOpAnalyticsClient in infra/analytics/. The no-op's track() is a
true no-op (no allocations beyond the call frame, no buffering); its
flush() always resolves to an empty batch.

Tests: 24 cases. AnalyticsBatch round-trips empty + populated batches,
and fromJson rejects 13 distinct malformations (missing schema_version,
schema_version != 1, events not array, missing/non-string event name,
missing identity, missing/empty device_id, non-number timestamp,
non-object properties, etc). NoOpAnalyticsClient verifies track does
not throw under varied input and flush stays empty after many tracks.
Adds the in-memory queue that M2.5's AnalyticsClient will buffer events
into. Configurable cap with default 1000, drop-oldest on overflow, and
a cumulative droppedCount that survives drain calls for later
observability (surfaced in M4 via brv analytics status).

The interface IAnalyticsQueue lives next to its sibling IAnalyticsClient
under core/interfaces/analytics/. The BoundedQueue implementation in
infra/analytics/ uses a plain Array with push + shift; at maxSize=1000
the O(n) shift on overflow is negligible. drain transfers ownership of
the events array to the caller and re-initializes the internal queue,
avoiding a defensive clone. dropped is a private counter that no
method resets.

The constructor validates maxSize is a non-negative integer; negative
NaN, Infinity, and fractional values throw fast at construction time.
Without this check, a negative maxSize would cause an infinite loop in
push(): the `while events.length > maxSize` condition stays true at
length === 0 when maxSize < 0, with shift() as a no-op.

Tests cover the six ticket scenarios plus constructor validation:
FIFO drain, empty-queue defaults, drop-oldest with droppedCount
tracking, multi-drop FIFO order, cumulative droppedCount across
drains, size() invariant, default-cap behavior, drain ownership
transfer, and 5 constructor cases (negative, NaN, Infinity, fractional
all throw; maxSize === 0 is accepted as a degenerate but valid cap).
17 tests total.

Test fixtures use a fresh identity object per makeEvent call rather
than a shared module-scope reference, so future tests cannot
accidentally mutate the shared identity and bleed across cases.

Note: had to introduce a small pushAll(queue, events[]) helper in the
test file because the unicorn/no-array-push-push lint rule fires on
consecutive .push() calls regardless of receiver type. The helper
wraps a for-of loop, which the rule accepts.
Adds the resolver that stamps the five super properties onto every
analytics event: device_id, cli_version, os, node_version, environment.
M2.5's AnalyticsClient will inject this resolver and merge its output
into every track() call.

Wire-format snake_case throughout (matches the rest of the analytics
domain). cli_version, os, node_version, and environment are static
across the daemon's lifetime and lazy-cached on first resolve(). The
package.json reader is invoked exactly once per resolver instance,
verified by a sinon-stub assertion.

device_id is re-read from IGlobalConfigStore on every resolve() call
so a swapped GlobalConfig is observable. The tradeoff: resolve()
returns Promise<SuperProperties> instead of sync SuperProperties,
which cascades async to M2.5's AnalyticsClient.track() (deviation
from M2.5's spec to be addressed when M2.5 lands).

The cli-version reader was duplicated inline in brv-server.ts; this
commit extracts it to src/server/utils/read-cli-version.ts and both
sites now import the single source. The walk-up depth changes from 4
to 3 levels (the new utility lives one level closer to the project
root) but resolves to the same package.json.

environment defaults to 'production' when BRV_ENV is unset or set to
any value other than 'development'. Fail-safe in unfamiliar territory.

10 tests cover the seven ticket scenarios plus three additional cases:
device_id re-read on every call, environment default when BRV_ENV is
unset, and BRV_ENV=production explicitly. Tests save & restore
process.env.BRV_ENV per scenario to avoid cross-test pollution.
Adds IdentityResolver — produces the per-event Identity that M2.5 will
stamp onto every analytics event. Anonymous: {device_id} only.
Registered: {user_id, email?, name?, device_id} where empty user fields
are OMITTED (not present as undefined) so the wire envelope stays
clean for downstream serializers.

Per ticket DoD, defines a consumer-side IAuthStateReader interface
(1 method: getToken()) co-located with IIdentityResolver in
core/interfaces/analytics/. The full IAuthStateStore is broader; the
resolver only needs sync access to the cached token. This keeps the
auth module unaware of the analytics consumer, matching CLAUDE.md
"interfaces at the consumer".

resolve() is async because IGlobalConfigStore.read() is async (same
precedent as M2.3 super-properties resolver). Each call re-reads both
sources so auth-state transitions mid-batch are observable to M2.5
(M2.5 Test #3 requires this).

Empty-field omission uses conditional spread:
  ...(token.userEmail ? {email: token.userEmail} : {})
which OMITS the key entirely. Tests assert via chai's
.to.not.have.property('email') to distinguish absent keys from
explicit-undefined values.

9 tests cover all 6 ticket scenarios plus 2 bonus cases (empty
userName, missing GlobalConfig). Auth-transition tests use a mutable
authReader wrapper to flip identity between resolve() calls,
verifying per-call freshness with no caching.
Composes M2.2 (queue), M2.3 (super-props), M2.4 (identity) into the
daemon-scoped AnalyticsClient. GlobalConfigHandler now caches the
analytics flag synchronously so AnalyticsClient.isEnabled stays sync;
refreshCache() is awaited in setupFeatureHandlers before construction
so the very first track() (daemon_start) sees the correct flag.

Adds an ESLint no-restricted-imports rule preventing IAnalyticsClient
from being imported outside src/server/infra/; non-daemon consumers
should reach analytics through the analytics:track transport event
(M2.6).
- refreshCache(): catch block now explicitly sets cachedAnalytics=false
  on read failure. Under the prior fix the field defaulted to undefined,
  so a silent catch left getCachedAnalytics() throwing — which would
  crash the daemon on bootstrap if any IGlobalConfigStore implementation
  ever rejected from read(). Production FileGlobalConfigStore catches
  its own errors, but defense-in-depth restores the comment's promise.

- GlobalConfigHandler class JSDoc: corrected stale "populated at
  construction" wording. The eager constructor-refresh was removed
  during the prior fix-pass; consumers now MUST await refreshCache().

- AnalyticsClient class JSDoc: tightened the no-crash invariant. The
  guarantee covers async resolver/queue errors only — a sync isEnabled()
  throw (e.g. getCachedAnalytics before refreshCache) propagates by
  design to surface bootstrap-misconfiguration loudly.

- New integration test locks the refreshCache fail-safe: a stub
  IGlobalConfigStore whose read() rejects must leave the cache at false
  (NOT undefined), and getCachedAnalytics() must NOT throw afterwards.
Implements M2.6 transport mechanism for analytics:

- analytics:track transport event with wire-level Zod validation
  (src/shared/transport/events/analytics-events.ts,
   AnalyticsTrackPayloadSchema in core/domain/transport/schemas.ts)
- Daemon-side AnalyticsHandler routes valid payloads to
  AnalyticsClient.track (src/server/infra/transport/handlers/
   analytics-handler.ts, wired in feature-handlers.ts)
- Shared emitAnalytics helper at src/shared/analytics/emit.ts for
  in-process consumers (TUI, oclif, MCP, webui)
- Round-trip integration test exercises stub-emit -> handler -> queue
  with full identity + super-properties stamping

Deviation from ticket spec (see ENG-2628 Linear comment):
- Oclif sample consumer (cli_invocation from lifecycle hook) NOT
  shipped -- team decided against oclif lifecycle hooks.
  Follow-up ticket needed.

Tests added: 21 (schema 9, handler 5, helper 4, integration 3).
Schema-only milestone. No emitter wiring, no daemon-side validation.
Each shipped analytics event now has a TypeScript type plus a Zod
.strict() runtime schema, and typed enums eliminate magic strings.

Events covered (7):
- daemon_start (M2.5): empty schema; super-properties cover cold-start
- cli_invocation (M2.6): command_id, flag_names, is_tty, is_ci,
  terminal_program?, runtime, package_manager
- mcp_session_start (M2.6): client_name
- mcp_tool_called (M2.6): tool_name, client_name, success, duration_ms
- task_created: task_type, task_id, has_files, has_folder
- task_completed: task_type, task_id, duration_ms
- task_failed: task_type, task_id, duration_ms

Typed enums:
- AnalyticsEventNames (7 entries): wire-format event names
- TaskTypes (5 entries): mirrors daemon TaskInfo.type union
- TASK_TYPE_VALUES tuple: single source of truth used by every task_*
  schema's z.enum(...) call

Privacy fixture walks ALL_EVENT_SCHEMAS and rejects any field name on
the forbidden PII list (cwd, argv, path, email, result, ...). Adding
a banned field name later fails the build.

Wire-shape source of truth: AnalyticsTrackPayloadSchema lives in
shared/transport/events/analytics-events.ts (alongside the
AnalyticsEvents constants). Both the daemon handler and the shared
emitAnalytics helper consume the same Zod-derived
AnalyticsTrackPayload type, so the wire shape can no longer drift
between two declarations. The previous server-side duplicate has
been removed.

Drift detection: a new compile-time + runtime test asserts the
shared TaskTypes enum stays in lock-step with the server-side
TaskTypeSchema. If a daemon contributor adds a new TaskInfo.type
value and forgets to mirror it, the build fails loudly instead of
emitting events that silently fail wire-side validation.

Magic-string cleanup: the only existing analytics emit site at
brv-server.ts now passes AnalyticsEventNames.DAEMON_START instead
of the raw string literal.

Deliberate deviations from ticket text:
1. command_id is z.string() not a typed enum: oclif manifest is the
   source of truth (~80 commands), a hardcoded mirror would rot.
2. Emitter wiring (oclif init, MCP oninitialized, MCP wrapper) and
   TaskAnalyticsHook are deferred to a follow-up ticket; this
   milestone ships schema declarations only.
3. task_failed has no error_class/error_code: that would require a
   breaking ITaskLifecycleHook.onTaskError signature change which
   belongs in its own ticket.

29 files changed, 0 files modified outside this milestone's scope.
Tests: 7500 passing (+74 new under shared/analytics/ + drift + payload
schema). Lint, typecheck, build all green.
… depth)

Four review-driven hardening fixes on top of the M2.8 catalog. Zero
behavior changes for existing wire/runtime callers; all changes are
type-tightening or test-quality improvements.

NF2 — Typed `emitAnalytics`. The shared emit boundary now takes a
generic `<E extends AnalyticsEventName>` and derives the properties
type from the M2.8 discriminated union. Magic-string typos
(`'daemon_starts'`) and wrong-shape payloads
(`tool_name` on `cli_invocation`) become compile errors instead of
runtime drops. The properties argument is optional only for events
whose schema has no required keys (e.g. `daemon_start`); other events
require a fully-shaped payload.

NF5 — Replace batch.ts hand-rolled type guards with Zod. The three
`as Record<string, unknown>` casts (which violated CLAUDE.md
"avoid `as Type` assertions") are gone, replaced by
IdentityWireSchema / AnalyticsEventWithIdentityWireSchema /
AnalyticsBatchJsonSchema. All 19 existing fromJson rejection tests
still pass — Zod preserves the same edge cases (empty/whitespace
device_id, missing schema_version, non-object properties, etc.) at
~half the line count.

I5 — Privacy fixture walker recurses into nested ZodObject,
ZodArray elements, and ZodOptional / ZodNullable wrappers. Three
new regression tests verify the walker catches forbidden names
(`email`, `password`, `token`, `api_key`) embedded inside nested,
arrayed, and optional/nullable shapes. Today's schemas are flat, but
the fixture now stays correct as future schemas grow.

I6 — Extend FORBIDDEN_FIELD_NAMES from 20 to 41 entries: adds
secrets (`access_token`, `auth_token`, `api_key`, `cookie`,
`credential`, `password`, `secret`, `session_id`, `session_token`,
`token`, `auth_header`), additional PII (`address`, `display_name`,
`first_name`, `last_name`, `phone`, `phone_number`, `ssn`,
`username`). The list now matches a realistic threat model for the
analytics pipeline.

NF7 — Already covered by existing test #6 in
test/commands/analytics/enable.test.ts (retracted; my round-4 review
hallucinated the gap because I read only the first 80 lines of the
file).

Tests: 7503 passing (+3 walker tests). Lint, typecheck, build all
green. No production wiring changed; the only existing emit site
(brv-server.ts) still calls `analyticsClient.track(...)` directly
and is unaffected.
- New stored-record.ts: StoredAnalyticsRecord extends AnalyticsEventWithIdentity
  with {id, status, attempts}; Readonly via z.infer wrapper for consistency
  with rest of analytics domain (Identity, AnalyticsEvent, batch types)
- Zod schema for read-from-disk validation; default strip mode (matches
  batch.ts precedent, forward-compat with future field additions)
- Export MAX_ATTEMPTS=3 constant for M9.2's updateStatus retry-cap policy
- toWireEvent helper strips local-only fields (id, status, attempts) for
  M4 backend ship path
- batch.ts and existing M2 wire types unchanged (boundary preserved)
- 23 unit tests: 1 const + 16 schema + 6 helper; full suite 7526 passing
- New IJsonlAnalyticsStore interface: append/updateStatus/list/loadPending
  plus droppedSentCount/droppedFullCount counters for M4.6 observability
- New JsonlAnalyticsStore class: file-backed JSONL at
  {baseDir}/analytics-queue.jsonl
- Retry-cap policy lives inside updateStatus(_, 'failed'): increment attempts;
  stay 'pending' while attempts < MAX_ATTEMPTS; flip terminal 'failed' at cap;
  no overshoot on terminal rows
- Write serialization via private writeChain Promise mutex; eliminates
  appendFile vs readFile/rename interleave race for concurrent track + flush
- Atomic rewrite: tmp file + rename (mirrors FileQueryLogStore.writeAtomic)
- Append uses FileHandle.sync() for fsync durability so row survives daemon kill
- File-size cap (default 5000 rows / 10MB): drop oldest sent first; pending and
  failed never dropped; if cap full of pending+failed, append silent no-op
  plus droppedFullCount++
- list sort: (timestamp DESC, id DESC) for stable same-timestamp ordering
- Read methods (list, loadPending) skip the write chain; consistent snapshot
  via atomic rename
- Corruption tolerance: skip JSON.parse-failures and Zod-schema-failures
- 30 unit tests using tmpdir(); full suite 7554 passing
…sync (M9.3)

- Widen IAnalyticsQueue + BoundedQueue from AnalyticsEventWithIdentity to
  StoredAnalyticsRecord; id propagates through the queue
- AnalyticsClientDeps adds jsonlStore: IJsonlAnalyticsStore
- trackAsync: resolve identity + super-props, generate randomUUID id,
  compose StoredAnalyticsRecord{status:'pending', attempts:0},
  await jsonlStore.append() FIRST, then queue.push() on success
- JSONL append failure: queue NOT pushed, silent drop preserves
  "JSONL is source of truth" invariant; "MUST NOT crash consumer" honored
- feature-handlers.ts constructs ONE JsonlAnalyticsStore({baseDir:
  getGlobalDataDir()}) and injects into AnalyticsClient. Same instance
  will be shared with M11.2's analytics-list-handler
- 6 new unit tests covering dual-write happy path, JSONL-fail fallback,
  uuid distinctness, queue.size() = JSONL row count, disabled no-op
- All existing M2 AnalyticsClient + BoundedQueue tests pass with widened
  type
- Integration tests (daemon-tracking, transport) updated with real
  JsonlAnalyticsStore({baseDir: testDir}) per tmpdir convention
- Full suite 7562 passing
Comment thread src/server/infra/analytics/analytics-client.ts
}

this.rateLimited = true
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: The Math.min(retryAfterMs, MAX_SERVER_HINT_MS) clamp + Number.isFinite guard is the right calibration. Two distinct correctness wins per the docblock comment:

  1. setTimeout overflow prevention — Node's setTimeout silently fires immediately for any delay above 2^31-1 ms (~24.8 days), which would ignore a hostile/buggy Retry-After. Cap at 1h prevents that class of bug architecturally.
  2. DoS-resilience — a server (or proxy) sending an Infinity-shaped Retry-After value cannot stall analytics shipping for days, only an hour.

The test at analytics-backoff-policy.test.ts:153-163 pins the four edge cases (0, -1, NaN, Infinity) — all flip isRateLimited to true without changing the schedule floor. That matches the applyServerHint(NaN) sentinel pattern in analytics-client.ts:310 used when a sender violates the contract by omitting retryAfterMs. Clean composition.

minor docblock nit (non-blocking): Line 14's "1 hour" cap reasoning is well-explained, but could also mention that this caps the worst-case JSONL retry-cap exhaustion window — MAX_ATTEMPTS = 3 × 1h = 3h max throttle window before records terminate. That's load-bearing for the data-fidelity argument under rate-limit (and ties into my separate finding about runFlush line 379 marking rate-limited records as 'failed' despite the policy treating the endpoint as healthy).

Comment thread src/server/infra/analytics/draining-analytics-sender.ts
// retry-cap still bumps attempts and eventually terminates the rows
// — same terminal classification any other failure reason gets.
return {failed: [...ids], reason: 'http_4xx', succeeded: []}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise — Finding C closed. First-pass review flagged this branch as returning {failed: [...ids], succeeded: []} with no reason, which made the M4.5 backoff policy's "skip" branch fire (since reason === undefined && succeeded.length === 0) while the M9.2 retry-cap kept ticking — silently exhausting attempts toward terminal 'failed' on a misconfigured daemon.

Now tagged as 'http_4xx', which routes through feedBackoffPolicy's payload-shape branch at analytics-client.ts:280-290: the backoff policy correctly suppresses advancement (no churn on a daemon-side misconfig), AND the retry-cap still ticks → records eventually surface in brv settings get analytics.status as failed, signaling the operator to investigate the missing device_id rather than letting it loop indefinitely. The comment at lines 64-70 spells out the reasoning precisely. Clean close.

Comment thread src/server/infra/process/analytics-hook.ts
…ct(), failure-kind, dedup, log safety

Branch: proj/analytics-system-tool-mode

- format-settings: map the 'analytics' settings category (was falling through to 'other')
- batch: .strict() on IdentityWireSchema — reject unexpected identity fields at the wire boundary
- context-tree-handler: classify path-traversal as failure_kind 'invalid_path' instead of 'conflict'
- context-tree-events: remove duplicate JSDoc on the paths field
- brv-server: consolidate readCliVersion onto the shared src/server/utils/read-cli-version util
- synthetic-tool-result-emit: wrap log calls so a throwing sink can't escape (sync + async paths)
- analytics-client: document why onAuthTransition intentionally does not await the in-flight flush
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 30, 2026

Claude finished @bao-byterover's task in 3m 57s —— View job


Claude PR Review — Analytics System (M1 → M17) — Fifteenth pass

Overall score: 3 / 5 — Same engineering substance carrying forward (durable JSONL queue, careful failure-mode taxonomy, exceptional test coverage). The new 5364e7d push is another productive review-response bundle: it extends .strict() defenses into the nested IdentityWireSchema, hardens the synthetic-tool-result log path against a throwing sink, reclassifies path-traversal as invalid_path rather than conflict, maps the analytics settings category onto the UI rows instead of falling through to other, deduplicates readCliVersion onto the shared util, and lands a 10-line decision-record docblock on why onAuthTransition deliberately does NOT await the in-flight flush. Score held at 3/5 because the single ship-blocker from the prior fourteen reviews — lorem-ipsum disclosure body — is still in the tree.

Todos

  • Identify new commits since the fourteenth review (5364e7d — review-response bundle; merge-commit 28672fb pulled in unrelated webui migrate UI from main)
  • Re-check analytics-disclosure.md (15× prior ship-blocker — still lorem ipsum in both src/shared/assets/analytics-disclosure.md and src/webui/features/analytics/constants.ts:13-39)
  • Spot-check 5364e7d changes (batch .strict(), synthetic-emit log safety, invalid_path taxonomy, format-settings category mapping, brv-server dedup, analytics-client docblock)
  • Re-verify other prior unresolved findings (Finding A escalated 15th flag; Finding from 14th-pass rate-limited retry-cap still unresolved)
  • Post inline issues + summary

What's new since the fourteenth review (1 substantive commit)

  • 5364e7d — Six-point review-response bundle:
    1. batch.ts:33-45.strict() extended into the nested IdentityWireSchema. Closes a defense-in-depth gap the prior reviews didn't catch explicitly: without nested strict, a fork shipping {identity: {device_id, secret_kid: '...'}} would slip past the envelope-level strict. New test at batch.test.ts:151-164 pins the rejection on an unexpected extra_field inside identity.
    2. synthetic-tool-result-emit.ts:52-73safeLog wrapper around log calls so a throwing logging sink can't escape (sync path) or become an unhandled rejection (async path). New test at synthetic-tool-result-emit.test.ts:128-141 pins the sync-throw side.
    3. context-tree-handler.ts:312-315 — Path-traversal classified as invalid_path (rejected-input/security signal) rather than conflict (write-collision). Existing test updated accordingly.
    4. format-settings.ts:163-168'analytics' added to the category whitelist so analytics settings render under their own category instead of falling through to other. New test pins it.
    5. brv-server.tsreadCliVersion() removed and replaced by import from the shared util at src/server/utils/read-cli-version.ts. Cleaner.
    6. analytics-client.ts:219-228 — 10-line decision-record docblock on onAuthTransition explaining why the in-flight pendingFlush is intentionally NOT awaited (safety invariant, M4.4 pre-transition flush hook ownership, no correctness gain from awaiting).
      Also: context-tree-events.ts removed a duplicate JSDoc comment.

(The merge-commit 28672fb pulled in webui migrate-prompt UI changes from main — unrelated to this PR's scope; flagged for completeness, not reviewed.)

New findings on this pass

  1. analytics-disclosure.md — ship-blocking, fifteenth flag. No change. (inline)
  2. batch.ts:45 — praise for the nested .strict(). (inline)
  3. analytics-client.ts:228 — praise for the auth-transition flush docblock. Names the alternative, the safety invariant, the policy reason, and what would be wrong if a future maintainer added the await. (inline)
  4. synthetic-tool-result-emit.ts:73 — praise for the safeLog wrapper. Closes a real footgun where a throwing log sink could either escape the sync path or become an unhandled rejection on the async catch. (inline)
  5. context-tree-handler.ts:315 — praise for the invalid_path taxonomy fix + pinned-value test discipline. (inline)
  6. format-settings.ts:169 — observation on the dual-declaration of categories (descriptor source-of-truth at settings.ts:197-205 + allow-list re-declaration here). Worth considering a single canonical list export. (inline)

Prior findings — status

# File Finding Status
A analytics-disclosure.md + webui/features/analytics/constants.ts:13-39 Lorem-ipsum body ship-blocker, 15th flag
B shared/constants/privacy.ts:6 PRIVACY_POLICY_URL (placeholder until M1.5) user-facing copy resolved (ENG-3007 removed surfacing commands); URL inconsistency vs. WebUI remains
C http-analytics-sender.ts:62-69 No-deviceId returns no reason, retry-cap toward terminal 'failed' resolved earlier by ENG-2658 (3263474)
D analytics-hook.ts:88 <outside-project>/${basename(filePath)} leaks filename leaf resolved earlier by ENG-2658
E analytics-handler.ts dispatch switch 48 cases scaling poorly vs. table-driven dispatch unresolved (non-blocking)
F event-names.ts:32 MCP_SESSION_START vs. MCP_SESSION_ENDED tense asymmetry unresolved (non-blocking)
G auth-state-store.ts:175 Sequential pre-change listener wait — comment justifying serial ordering would help unresolved (non-blocking)
H analytics-client.ts:248 Microscopic pendingTracks.add race in track() — undocumented unresolved (non-blocking)
I migrate-handler.ts:132-140 classifyMigrateFailure brittle string-prefix match resolved earlier by ENG-2658
J synthetic-tool-result-emit.ts:96-109 TOOL_CALL → TOOL_RESULT pairing relies on socket.io per-socket FIFO unresolved (non-blocking)
K format-analytics-status.ts:114 Import-time side-effect registration unresolved (non-blocking)
L webui/pages/configuration/general.tsx:3 Style nit resolved earlier (moved to Privacy tab)
M noop-analytics-sender.ts vs. no-op-analytics-sender.ts Twin classes named one hyphen apart resolved earlier by ENG-2658 (rename to DrainingAnalyticsSender)
N onboarding.md:290-292 Verbal-script scope gap vs. canonical disclosure unresolved (bundled into disclosure-body fix round)
O onboarding.md:299-303 "Two promises" framing vs. code reality mitigated by settings.ts:200 descriptor honesty
P onboarding.md:305-308 "Do not re-ask in future sessions" without persisted declined state unresolved (non-blocking)
Q batch.ts:24 schema_version: 1 despite breaking wire change resolved earlier (3568d4b)
R stored-record.ts:99-109 toWireEvent derivation caveats resolved earlier (3568d4b)
S auth-handler.ts:453 Rotate-after-emit TOCTOU on device_id unresolved (non-blocking)
T i-global-config-rotator.ts:19 Non-idempotent contract undocumented unresolved (non-blocking)
14-2 analytics-client.ts:388-390 Rate-limited records bump M9.2 retry-cap toward terminal 'failed' (symmetric to M4.4 abort gate) unresolved (non-blocking)
U webui/features/analytics/constants.ts:11 docs.byterover.dev vs. canonical byterover.dev/privacy URL drift unresolved (non-blocking)
V webui/features/analytics/constants.ts:13-39 Disclosure body duplicated across markdown + WebUI constants folded into Finding A

What's good in this push (5364e7d)

  • Layered defense via nested .strict(). The envelope-level strict and the identity-level strict together close the "anywhere in the wire payload" surface — a forbidden field inserted into either layer is rejected at the wire boundary, not silently stripped.
  • Log-sink hardening. The safeLog wrapper turns "logging is best-effort" from documentation into runtime guarantee. The kind of fix that prevents a single misbehaving custom sink from cascading into a process-level crash through the synthetic-emit path.
  • Decision-record docblock on onAuthTransition. Documents the intentional non-await and the four distinct truths that make it sound. Future refactor that adds the await without reading this will trip the docblock first.
  • invalid_path taxonomy is a precise win for the funnel: path-traversal attempts are operationally distinct from write-collisions; conflating them under conflict would hide a security-relevant signal in the dashboard's collision rate.
  • analytics settings category mapping ensures analytics.enabled renders under a dedicated category in the settings UI instead of other. Small UX cleanup with a pinned test.
  • readCliVersion consolidation. Removes a 20-line duplicate from brv-server.ts and points at the canonical util — that helper was already used elsewhere, so this is straight code-debt repayment.

Items outside the inline scope (carried forward)

  • WebUI typecheck still broken per PR description — confirm @campfirein/byterover-packages bump before merge or husky pre-push blocks.
  • build:ui not re-verified — gated on the same package bump.
  • x-byterover-session-id header in axios-analytics-http-client.ts:81 — if sessionKey ever upgrades to a bearer-equivalent secret, the analytics POST endpoint becomes an auth surface. Note in the backend contract.

Verdict

One ship-blocker, unchanged for the fifteenth review in a row. Replace lorem-ipsum disclosure copy in both src/shared/assets/analytics-disclosure.md and src/webui/features/analytics/constants.ts:13-39 with PM/legal-approved text. Resolve the byterover.dev/privacy vs. docs.byterover.dev/privacy URL drift (Finding U) in the same round. The 5364e7d review-response bundle itself is the second consecutive productive review-response commit on this branch — the only finding from the fourteenth pass that didn't get closed in this push is the rate-limited retry-cap symmetric-with-abort-gate observation, which remains recommended-but-not-blocking.
· branch: proj/analytics-system-tool-mode

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Event names and
super properties (`device_id`, `cli_version`, `os`, `node_version`,
`environment`) are recorded. No content of your queries, files, or
memory is collected.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (ship-blocking — fifteenth flag): Unchanged for the fifteenth review in a row. Every section body still opens with literal Lorem ipsum. The new push (5364e7d) closes a productive bundle of prior review items (.strict() on the nested IdentityWireSchema, log-sink safety on the synthetic-tool-result path, path-traversal classified as invalid_path rather than conflict, analytics settings category no longer falling through to other, deduplicated readCliVersion, and an excellent decision-record docblock on why onAuthTransition deliberately does NOT await the in-flight flush) — making the lorem-ipsum disclosure body the sole substantive item still standing between this branch and merge-eligibility.

Four formal consent surfaces continue to render this exact text (oclif text-mode brv settings set analytics.enabled true, oclif text+--yes, TUI confirm-disclosure overlay, agent-driven brv settings set analytics.enabled true --yes from the onboarding skill), plus the WebUI Privacy panel renders parallel lorem-ipsum from src/webui/features/analytics/constants.ts:13-39 default-open since 5d96619.

The fix is one file (plus the WebUI constants for parity) — no external dependency, no submodule bump, no agent-team handoff. With ENG-3019's device_id rotation and ENG-3020's space_id / team_id stamps on the wire, the eventual PM/legal round should also cross-check the "What is collected" enumeration against ALL_EVENT_SCHEMAS and surface the rotation behavior as a privacy-positive feature in "Cross-device alias".

Fix this →

Comment thread src/server/core/domain/analytics/batch.ts
Comment thread src/server/infra/analytics/analytics-client.ts
Comment thread src/server/infra/process/synthetic-tool-result-emit.ts
Comment thread src/server/infra/transport/handlers/context-tree-handler.ts
category === 'llm' ||
category === 'task-history' ||
category === 'updates'
) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

observation (non-blocking): Adding 'analytics' to the category whitelist is a clean fix — analytics.enabled was previously routing to the other category in buildSettingsRows, which meant the settings UI grouped it with miscellaneous keys rather than under a dedicated Analytics category. The new test at format-settings.test.ts:151-153 pins the mapping.

minor architectural nit (non-blocking): This is the second place in the codebase where the analytics category needs an explicit allow-list entry — the descriptor in src/server/core/domain/entities/settings.ts:197-205 already declares category: 'analytics' (the source of truth), and this allow-list re-declares the same fact in a parallel data structure. A future "context-tree" / "memory" / "swarm" category will face the same dual-declaration. Worth considering a single canonical list of category names exported from settings.ts and imported here, so adding a category is one edit rather than two. Not blocking on this PR.

…nabled to analytics.share (#746)

The key name now reflects what it controls: local analytics tracking is always
on (regardless of auth or this setting); the flag only gates whether collected
events are shared with the remote telemetry backend.

- Rename the user-facing key value analytics.enabled -> analytics.share across
  settings set/get/list, the onboarding opt-in script, and the disclosure
  command reference. Hard rename (no alias) — the feature is unreleased.
- Behavior unchanged: tracking stays always-on locally; the flag still gates
  only remote sharing.
- Internals kept (value-only rename): GlobalConfig.analytics field, the
  globalConfig:setAnalytics transport event, in-process method names, and the
  key-constant identifier names.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 30, 2026

Claude finished @bao-byterover's task in 3m 29s —— View job


Claude PR Review — Analytics System (M1 → M17) — Sixteenth pass

Overall score: 3 / 5 — Same engineering substance carrying forward (durable JSONL queue, careful failure-mode taxonomy, exceptional test coverage). The new ENG-3035 commit (0f44a29) is a precise, value-only rename of the user-facing key (analytics.enabledanalytics.share) that materially improves the trust-separation framing — the verb "share" makes it linguistically obvious that the toggle gates remote sharing, not local capture. That partially closes the eighth-pass finding O on the onboarding doc's "two promises" framing, without changing internals. Score held at 3/5 because the single ship-blocker from the prior fifteen reviews — lorem-ipsum disclosure body — is still in the tree.

Todos

  • Identify new commits since the fifteenth review (0f44a29 — ENG-3035 value-only rename)
  • Re-check analytics-disclosure.md (15× prior ship-blocker — still lorem ipsum)
  • Spot-check 0f44a29 changes + verify rename consistency across surfaces
  • Re-verify other prior unresolved findings
  • Post inline issues + summary

What's new since the fifteenth review (1 commit)

  • 0f44a29 (ENG-3035) — Hard rename of the user-facing key value analytics.enabledanalytics.share across 27 files. The rename is value-only by design (per the commit message): GlobalConfig.analytics field, the globalConfig:setAnalytics transport event, in-process method names (setAnalytics, analyticsClient), and the key-constant identifier names (ANALYTICS_ENABLED_KEY, SETTINGS_KEYS.ANALYTICS_ENABLED) all keep their pre-rename identifiers. The rename touches the wire key value, every place the user-facing command form appears in docs/comments/error messages, the e2e harness, the onboarding script, the disclosure markdown's "How to disable" command form (line 31), and the test cases asserting the wire key. No alias path — the feature is unreleased, so the rename is clean-break.

New findings on this pass

  1. analytics-disclosure.md — ship-blocking, sixteenth flag. Same body, key name on line 31 now reads brv settings set analytics.share false. Lorem-ipsum bodies otherwise unchanged. The rename means PM/legal can now lean on the verb in the key name itself to carry the trust-separation framing in the eventual real copy. Inline →

  2. settings-keys.ts:13 — code-quality drift from value-only rename, non-blocking. ANALYTICS_ENABLED_KEY = 'analytics.share' — identifier and value now disagree. Same on SETTINGS_KEYS.ANALYTICS_ENABLED, the setAnalytics method, globalConfig:setAnalytics transport event, and test/commands/settings/analytics-enabled.test.ts filename. Defensible scope (no internal churn), but future maintainers grep'ing in either direction take an extra hop. Two cheap follow-ups: identifier-rename pass OR a one-line redirect comment. Inline →

  3. settings.ts:200 — praise. The descriptor sentence ("Send anonymous telemetry to ByteRover. Local tracking is always on.") + the renamed key (analytics.share) + the (eventual) real disclosure body form a three-way agreement on the trust-separation framing. Recommend reusing this descriptor sentence verbatim as the lead paragraph of the disclosure-md "What is collected" section so the three surfaces stay locked in step. Inline →

  4. onboarding.md:290 — verbal-script gaps re-evaluated post-rename, non-blocking. The eighth-pass finding O ("two promises framing") is now further mitigated by the rename: a user typing brv settings set analytics.share false sees "share" not "enabled," reaching the right mental model linguistically. Still unresolved: super-properties undersold (no device_id / os / space_id / team_id), cross-device alias omitted (permanence + ENG-3019 rotation softening neither mentioned), no persisted declined state vs. "Do not re-ask in future sessions." Bundle into the PM/legal copy round. Inline →

Prior findings — status

# File Finding Status
A analytics-disclosure.md + webui/features/analytics/constants.ts:13-39 Lorem-ipsum body ship-blocker, 16th flag
B shared/constants/privacy.ts:6 PRIVACY_POLICY_URL (placeholder until M1.5) user-facing copy resolved; URL inconsistency vs. WebUI remains
C http-analytics-sender.ts:62-69 No-deviceId returns no reason resolved by ENG-2658
D analytics-hook.ts:88 <outside-project>/${basename} leaks filename leaf resolved by ENG-2658
E analytics-handler.ts dispatch switch 48 cases scaling poorly unresolved (non-blocking)
F event-names.ts:32 MCP_SESSION_START tense asymmetry unresolved (non-blocking)
G auth-state-store.ts:175 Sequential pre-change listener wait — undocumented unresolved (non-blocking)
H analytics-client.ts:248 Microscopic pendingTracks.add race unresolved (non-blocking)
I migrate-handler.ts:132-140 Brittle string-prefix match resolved by ENG-2658
J synthetic-tool-result-emit.ts:96-109 TOOL_CALL → TOOL_RESULT ordering unresolved (non-blocking)
K format-analytics-status.ts:114 Import-time side-effect registration unresolved (non-blocking)
L webui/pages/configuration/general.tsx:3 Style nit resolved earlier
M noop-analytics-sender.ts vs. no-op-analytics-sender.ts Twin classes one hyphen apart resolved by ENG-2658
N onboarding.md:290-292 Verbal-script scope gap partially mitigated by ENG-3035 rename; "two promises" framing component closed
O onboarding.md:299-303 "Two promises" framing vs. code reality further mitigated by ENG-3035 — the key value itself now reads share
P onboarding.md:305-308 No persisted declined state unresolved (non-blocking)
Q batch.ts:24 schema_version: 1 despite breaking wire change resolved earlier
R stored-record.ts:99-109 toWireEvent derivation caveats resolved earlier
S auth-handler.ts:453 Rotate-after-emit TOCTOU unresolved (non-blocking)
T i-global-config-rotator.ts:19 Non-idempotent contract undocumented unresolved (non-blocking)
14-2 analytics-client.ts:388-390 Rate-limited records bump retry-cap unresolved (non-blocking)
U webui/features/analytics/constants.ts:11 docs.byterover.dev URL drift unresolved (non-blocking)
V webui/features/analytics/constants.ts:13-39 Disclosure body duplicated across markdown + WebUI folded into Finding A
16-NEW settings-keys.ts:13 Value-only rename leaves identifier/value name drift unresolved (non-blocking)

What's good in this push (ENG-3035)

  • Linguistic improvement. analytics.share carries the trust-separation framing in the key name itself. A user reading brv settings list sees a verb that means what it does. That's a real product-language win for an opt-in surface.
  • Hard rename discipline. No alias path, no backwards-compat shim, no deprecation warning. Because the feature is unreleased, the rename is clean-break — exactly the right call for pre-release maintenance.
  • Internal stability preserved. The deliberate value-only scope keeps the transport-event names, the in-process method names, and the constant identifiers fixed — no churn in GlobalConfigHandler, the wireAnalyticsHttpSender shape, or any test that doesn't directly assert the wire key. Minimizes regression risk on a long-running branch.
  • Test coverage stays pinned. 17 tests updated to assert 'analytics.share' rather than 'analytics.enabled'; the disclosure-content test at analytics-disclosure-content.test.ts:32 now pins the post-M16.4 disable command form so a future regression that re-introduces brv analytics disable fails loudly.
  • Comment + docblock sweep is thorough. grep confirms zero stale analytics.enabled user-facing references in source; the only analytics_enabled mentions are intentional ones in global-config-handler.ts:212 (referring to the event analytics_enabled which the code intentionally does NOT emit) and the test at global-config-handler.test.ts:532 (asserting that non-emission).

Items outside the inline scope (carried forward)

  • WebUI typecheck still broken per PR description — confirm @campfirein/byterover-packages bump before merge or husky pre-push blocks.
  • build:ui not re-verified — gated on the same package bump.
  • x-byterover-session-id header in axios-analytics-http-client.ts:81 — backend contract note still pending.

Verdict

One ship-blocker, unchanged for the sixteenth review in a row. Replace lorem-ipsum disclosure copy in both src/shared/assets/analytics-disclosure.md and src/webui/features/analytics/constants.ts:13-39 with PM/legal-approved text. The ENG-3035 rename makes the PM/legal round materially easier — the key name now does linguistic work the disclosure body used to have to do — so the rewrite can be tighter than it would have been before. Resolve the byterover.dev/privacy vs. docs.byterover.dev/privacy URL drift (Finding U) and the verbal-script gaps (Finding N) in the same round. Every other finding is recommended-but-not-blocking.
· branch: proj/analytics-system-tool-mode

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Event names and
super properties (`device_id`, `cli_version`, `os`, `node_version`,
`environment`) are recorded. No content of your queries, files, or
memory is collected.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (ship-blocking — sixteenth flag): Unchanged for the sixteenth review in a row. Every section body still opens with literal Lorem ipsum. The new push (0f44a29) is a purely-mechanical rename of the user-facing key value from analytics.enabledanalytics.share — touches the disclosure markdown at one line (line 31, the "How to disable" command form) but leaves every body opening paragraph as placeholder copy.

Four formal consent surfaces continue to render this exact text (oclif text-mode brv settings set analytics.share true, oclif text+--yes, TUI confirm-disclosure overlay, agent-driven --yes from the onboarding skill at onboarding.md:296), plus the WebUI Privacy panel default-renders parallel lorem-ipsum from src/webui/features/analytics/constants.ts:13-39.

The fix is two files (this markdown + the WebUI constants) — no external dependency, no submodule bump, no agent-team handoff. With the rename to analytics.share, the eventual PM/legal round can now lean on the key name itself to carry the trust-separation framing ("share" makes it linguistically obvious that local capture and remote sharing are two different things). Recommend the "What is collected" body explicitly enumerate device_id, cli_version, os, node_version, environment, space_id, team_id, and the "Cross-device alias" body surface the device_id-rotation-on-logout/account-switch/refresh-failure behavior as a privacy-positive feature.

Fix this →

* is a typecheck error at every consuming site.
*/

export const ANALYTICS_ENABLED_KEY = 'analytics.share' as const
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

observation (code-quality drift, non-blocking): The ENG-3035 commit message is explicit that this rename is value-only: "Internals kept (value-only rename): GlobalConfig.analytics field, the globalConfig:setAnalytics transport event, in-process method names, and the key-constant identifier names." That's a defensible scope (no internal churn, no API-shape change on the transport layer), but it leaves a name/value mismatch at the seams:

  • This constant is named ANALYTICS_ENABLED_KEY but evaluates to 'analytics.share'.
  • SETTINGS_KEYS.ANALYTICS_ENABLED at src/server/core/domain/entities/settings.ts:121 is {ANALYTICS_ENABLED: ANALYTICS_ENABLED_KEY} — reading the symbol on the LHS, anyone unfamiliar with the rename history would expect the wire name to be 'analytics.enabled'.
  • The test file at test/commands/settings/analytics-enabled.test.ts keeps the pre-rename filename.
  • The internal setAnalytics method + globalConfig:setAnalytics transport event also keep their pre-rename names.

Each one is fine individually — the wire surface is what the user types and what the schema validates, and the test verifies the post-rename behavior. But a future maintainer grepping for analytics.share (e.g. tracking down a regression report from a user) won't find the constant identifier or the transport event, and a future maintainer grepping for analytics_enabled won't find the user-facing key. Both grep directions take an extra hop.

Two cheap follow-ups (neither blocking now):

  1. Rename the symbol to ANALYTICS_SHARE_KEY (and SETTINGS_KEYS.ANALYTICS_SHARE) — one identifier-rename pass, no behavior change, brings the symbol in step with the value.
  2. Leave a one-line redirect comment here noting the historical rename so the next person sees "this is the wire key for analytics sharing; previously analytics.enabled" without git-blaming the line.

Same observation applies to test/commands/settings/analytics-enabled.test.ts (worth renaming to analytics-share.test.ts when next touched) and to the comment at global-config-handler.ts:212 which mentions analytics_enabled is intentionally NOT tracked — that comment's referent is now the event name (not the setting key), so it's still accurate, but the proximity to the renamed key value invites future confusion.

{
category: 'analytics',
default: false,
description: 'Send anonymous telemetry to ByteRover. Local tracking is always on.',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: The descriptor sentence here ("Send anonymous telemetry to ByteRover. Local tracking is always on.") was already the right framing on the eleventh-review pass — and now compounds with the ENG-3035 rename. After the rename, the key name itself (analytics.share) carries the trust-separation framing linguistically: "share" makes it obvious to a user inspecting brv settings list that the toggle gates sharing, not local capture. The descriptor sentence reinforces this in plain language.

When PM/legal land the disclosure-md copy, recommend the "What is collected" section open with this descriptor sentence verbatim as its lead paragraph — it's the cleanest way to anchor the local-vs-share distinction the rename was meant to surface, and re-using identical wording across the descriptor + the disclosure body means a user who reads either one first encounters the same mental model. Today's descriptor + tomorrow's disclosure-md staying in lockstep is the operational form of "the toggle name means what it says."

Net effect of the rename + this descriptor + the eventual real disclosure copy: the three surfaces (SETTINGS_REGISTRY description, the key name visible on every CLI surface, the markdown rendered before consent) all carry the same one-sentence claim about what the toggle does. That's the kind of three-way agreement that doesn't accidentally drift even if PM/legal touch one copy file later.


Place the ask _after_ the "Either way, you're set" close, as a single follow-up message — not bundled into Message 3:

> "One optional ask before you go: if you'd like to help us improve ByteRover, you can opt in to share anonymous usage telemetry — things like which commands ran and how long they took. No query content, file contents, or memory is ever sent. You can change your mind anytime with `brv settings set analytics.share false`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

observation (verbal-script gap re-evaluated after rename, non-blocking): The eighth-pass review flagged three gaps between this verbal script and the canonical analytics-disclosure.md (super-properties undersold, cross-device alias omitted, "anytime" overpromises). The ENG-3035 rename addresses one of the three:

  1. "Two promises" framing. The post-rename verbal command form brv settings set analytics.share false makes the toggle's scope (sharing vs. local capture) linguistically obvious — the user no longer has to infer that disabling only stops sharing. The eighth-pass finding O ("two promises framing vs. code reality") is now further mitigated: a user typing the disable command sees the verb "share" rather than the verb "enabled," so even without reading the descriptor description they reach the right mental model.

Still unresolved:

  1. Super-properties undersold. "Things like which commands ran and how long they took" omits device_id (a persistent device-level identifier), os, node_version, environment. ENG-3019 + ENG-3020 also added space_id / team_id, which are backend-resolvable identifiers — the script doesn't mention them either.

  2. Cross-device alias omitted. The canonical disclosure-md's "Cross-device alias" section says "if you log in on this device, prior anonymous activity here is permanently linked to your account." The verbal script still doesn't mention the permanence + linking commitment. ENG-3019's device_id rotation softens the impact (next anonymous session does not inherit the prior device's history), but the script doesn't surface either fact.

  3. Skip-conditions don't include declined state. Line 297 says "Do not re-ask in future sessions," but line 307's skip-conditions cover only "already enabled" — no persisted declined flag, so a user who declines on tour Feat/init #1 will be re-asked on tour Feat/refactor structure #2 (matching the tour's intentionally-stateless re-invocation contract at line 282).

When PM/legal replace the lorem-ipsum disclosure, recommend bundling a verbal-script copy refresh into the same round so both surfaces stay in step. A user who runs --yes from the agent scrolls past the markdown; a user who declines reads only the verbal script. Both need the same coverage.

device_id is now seeded at daemon bootstrap and before each remote flush, independent of the analytics flag. Previously it was created only when analytics was enabled, so fresh or never-authed users recorded events with an empty device_id that the wire schema and sender dropped, and telemetry never shipped anonymously.

analytics:track dispatch now derives from a single source of truth (ALL_EVENT_SCHEMAS): AnyAnalyticsEvent is derived from it, the handler validates against it and forwards through a segregated IWireEventTracker, replacing the per-event switch. Also fixes silently dropped swarm_* events. No as-casts.
The status text read 'Analytics: enabled/disabled', but local analytics tracking is unconditional (AnalyticsClient.track always records to JSONL); only remote sharing (analytics.share) toggles. Relabelled to 'Analytics sharing: enabled/disabled' so it no longer implies tracking stopped.

Read-only settings rows (e.g. analytics.status) are now expandable in the TUI: Enter opens a detail panel showing the full multi-line value (list rows only render the headline), Esc returns to the list. Adds displayDetail to the SettingsRow view-model, carrying the full block.
isAnalyticsEventName used `value in ALL_EVENT_SCHEMAS`, which matches inherited Object.prototype keys ('toString', 'constructor', '__proto__', ...). Such an event passed the guard, then ALL_EVENT_SCHEMAS[event].safeParse resolved to an inherited member (not a Zod schema) and threw at the wire-side dispatch, escaping the handler. Switch to Object.hasOwn (own-property only) so prototype-key events drop like any other unknown event. Add a regression test that asserts they drop without throwing.

Also refresh the AnalyticsClient class docblock to match the unconditional track() contract: local recording is unconditional, the isEnabled gate lives in flush() (remote send), not track().

Add test/e2e/analytics/mcp-tool-db.e2e.ts (npm run test:e2e:mcp): drives the real 'brv mcp' over JSON-RPC and verifies mcp_session_start / mcp_tool_called / mcp_session_ended land in postgres raw_events across brv-query and brv-curate roundtrips (anonymous), plus device_id rotation on auth:logout.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 30, 2026

Claude finished @bao-byterover's task in 5m 43s —— View job


Claude PR Review — Analytics System (M1 → M17) — Seventeenth pass

Overall score: 3 / 5 — Same engineering substance carrying forward (durable JSONL queue, careful failure-mode taxonomy, exceptional test coverage). The three new commits in this push are the most productive bundle since ENG-2658: one fixes a real anonymous-shipping bug (fresh users silently shipped nothing), one closes the long-standing 48-case dispatch-switch finding from the very first review (E), one closes a security-relevant prototype-pollution guard bypass, and a UX-quality push furthers the trust-separation framing via the "Analytics sharing" label change. Score held at 3/5 because the single ship-blocker from the prior sixteen reviews — lorem-ipsum disclosure body — is still in the tree.

Todos

  • Identify new commits since the sixteenth review (219da05, 1014a9f, b2274ea)
  • Re-check analytics-disclosure.md (16× prior ship-blocker — still lorem ipsum)
  • Spot-check three substantive new commits
  • Re-verify prior unresolved findings (Finding E closed)
  • Post inline issues + summary

What's new since the sixteenth review (3 substantive commits)

  • 219da05 — always-present device_id + single-source analytics:track dispatch. Three layered fixes:

    1. ensureDeviceId() on GlobalConfigHandler seeds the install-level device_id at daemon bootstrap (feature-handlers.ts:204-210) and again before each remote flush (wired through analytics-client.ts:404's new ensureDeviceId?: () => Promise<void> dep), independent of the analytics.share flag. Canonical seeder (ensureDeviceIdOn) is the single place a UUID is generated; doSetAnalytics now reuses it. Closes a real anonymous-shipping bug: previously device_id was only created when analytics.share flipped true, so a fresh / never-authed user recorded events with an empty device_id that the wire schema + sender dropped → silent zero-telemetry for new users.
    2. AnalyticsHandler dispatch consolidation — the 330-line, 48-case switch collapses to a 6-line table-driven dispatch via ALL_EVENT_SCHEMAS. New segregated IWireEventTracker interface (ISP-correct: handler depends on the one method it uses; track<E> test doubles unaffected). Closes Finding E (standing since pass 1) — also incidentally fixes silently dropped swarm_* events (pinned by a new test).
    3. CliMetadata type cleanup — removes the Record<string, unknown> & intersection that was a cast workaround the prior dispatch shape required.
  • 1014a9f — "Analytics: enabled/disabled" → "Analytics sharing: enabled/disabled" + TUI read-only detail panel.

    1. The status snapshot label correctly surfaces the trust-separation semantics (local tracking is unconditional; only sharing toggles). Third compounding piece alongside the analytics.share rename (ENG-3035) and the descriptor honesty at settings.ts:200. Further closes Finding O.
    2. New displayDetail field on SettingsRow carries the full multi-line value (analytics.status is multi-line by nature); list rows still render the headline-only displayCurrent. Enter on a read-only row opens an overlay; Esc returns.
  • b2274ea — prototype-pollution guard + MCP-tool DB e2e + docblock refresh.

    1. Security-relevant fix. isAnalyticsEventName switched from value in ALL_EVENT_SCHEMAS to Object.hasOwn(...). A client sending {event: 'toString'} previously passed the guard (in walks prototype chain) and crashed the handler at safeParse (Function.prototype.toString has no .safeParse). Excellent regression test walks five prototype keys and asserts both no-throw AND no-track-reached.
    2. AnalyticsClient class docblock refresh — explicitly documents the unconditional track() contract and includes "Do NOT reintroduce an isEnabled() gate here: that was the bug where fresh/never-authed users recorded nothing" — the kind of historical-failure-mode comment that forward-defends against drift.
    3. MCP-tool DB e2e — 690 lines driving real brv mcp over JSON-RPC, verifying the mcp_* trio + device_id rotation on logout. npm run test:e2e:mcp.

New findings on this pass

  1. analytics-disclosure.md — ship-blocking, seventeenth flag. No change. Bundle the lorem-ipsum body fix in both src/shared/assets/analytics-disclosure.md and src/webui/features/analytics/constants.ts:13-39 in the same PM/legal copy round.
  2. analytics-handler.ts:62 — praise. Finding E closed (standing since 1st review). 330 lines → 6 lines.
  3. events/index.ts:174 — praise (security). Prototype-pollution guard bypass closed.
  4. global-config-handler.ts:100 — praise + question. ensureDeviceId bootstrap+per-flush+canonical-seeder is the right shape; open question on whether a bootstrap+first-flush race is worth pinning with a test.
  5. format-analytics-status.ts:63 — praise + observation. "Analytics sharing" label closes Finding O further; JSON envelope's enabled key is unchanged — defensible (script callers) but worth a one-line decision record on text-vs-JSON asymmetry.
  6. settings-page.tsx:476 — praise + small observation. Detail overlay is the right UX shape; the displayCurrent fallback in renderDetailOverlay is structurally unreachable from the current Enter binding (Enter only routes readonly-info rows there).
  7. analytics-client.ts:94 — praise. Class-docblock refresh documents the unconditional track() contract and names the historical bug ("Do NOT reintroduce…") that the prior isEnabled() gate caused.

Prior findings — status

# File Finding Status
A analytics-disclosure.md + webui/features/analytics/constants.ts:13-39 Lorem-ipsum body ship-blocker, 17th flag
B shared/constants/privacy.ts:6 PRIVACY_POLICY_URL annotated (placeholder until M1.5) user-facing copy resolved; URL inconsistency vs. WebUI remains
C http-analytics-sender.ts:62-69 No-deviceId returns no reason resolved by ENG-2658
D analytics-hook.ts:88 <outside-project>/${basename} leaks filename leaf resolved by ENG-2658
E analytics-handler.ts dispatch switch 48 cases scaling poorly vs. table-driven dispatch RESOLVED by 219da05 — table-driven via ALL_EVENT_SCHEMAS
F event-names.ts:32 MCP_SESSION_START tense asymmetry unresolved (non-blocking)
G auth-state-store.ts:175 Sequential pre-change listener wait — undocumented unresolved (non-blocking)
H analytics-client.ts:248 Microscopic pendingTracks.add race unresolved (non-blocking)
I migrate-handler.ts:132-140 Brittle string-prefix match resolved by ENG-2658
J synthetic-tool-result-emit.ts:96-109 TOOL_CALL → TOOL_RESULT ordering unresolved (non-blocking)
K format-analytics-status.ts:114 Import-time side-effect registration unresolved (non-blocking)
L webui/pages/configuration/general.tsx:3 Style nit resolved earlier
M noop-analytics-sender.ts vs. no-op-analytics-sender.ts Twin classes one hyphen apart resolved by ENG-2658
N onboarding.md:290-292 Verbal-script scope gap unresolved (bundled into disclosure-body fix round)
O onboarding.md:299-303 "Two promises" framing vs. code reality further mitigated by 1014a9f "Analytics sharing" label
P onboarding.md:305-308 No persisted declined state unresolved (non-blocking)
Q batch.ts:24 schema_version: 1 despite breaking wire change resolved earlier
R stored-record.ts:99-109 toWireEvent derivation caveats resolved earlier
S auth-handler.ts:453 Rotate-after-emit TOCTOU unresolved (non-blocking)
T i-global-config-rotator.ts:19 Non-idempotent contract undocumented unresolved (non-blocking)
14-2 analytics-client.ts:388-390 Rate-limited records bump retry-cap unresolved (non-blocking)
U webui/features/analytics/constants.ts:11 docs.byterover.dev URL drift unresolved (non-blocking)
V webui/features/analytics/constants.ts:13-39 Disclosure body duplicated folded into Finding A
16-NEW settings-keys.ts:13 Value-only rename leaves identifier/value drift unresolved (non-blocking)

What's good in this push

  • Finding E is closed. The 48-case switch was first flagged on the 1st review pass; it has now collapsed into a 6-line table-driven dispatch backed by ALL_EVENT_SCHEMAS as the single source of truth. AnyAnalyticsEvent, PropsForEvent, PropsArg, the runtime guard, and the handler's per-event validation all derive from the same map. No drift surface.
  • Anonymous-shipping fixed. ensureDeviceId at bootstrap means a fresh / never-authed user's events now carry a real device_id from the very first track, and the per-flush belt-and-suspenders guards against runtime config wipes — both via the same canonical seeder, idempotent and serialized through writeChain.
  • Prototype-pollution guard. Object.hasOwn instead of in is a 7-character fix that closes a class of TypeError-throwing crashes. The regression test discipline (walks five prototype keys, asserts no-throw + no-track-reached) is exactly the shape that prevents the same bug from creeping back in.
  • Trust-separation framing now triple-anchored. Key name (analytics.share) + descriptor description (settings.ts:200 "Local tracking is always on") + status snapshot label ("Analytics sharing: enabled/disabled") all linguistically consistent. The eighth-review finding O is meaningfully closer to closed.
  • AnalyticsClient docblock refresh. "Do NOT reintroduce an isEnabled() gate here: that was the bug where fresh/never-authed users recorded nothing and shipped no telemetry" — names the historical failure mode in the place where a future maintainer would reach for the optimization. Forward-defended.
  • MCP-tool DB e2e rounds out the lifecycle-db.e2e coverage that previously couldn't exercise clientType === 'mcp' paths. The brv-query / brv-curate roundtrips run LLM-free (deterministic) so the test is reproducible without provider credentials.

Items outside the inline scope (carried forward)

  • WebUI typecheck still broken per PR description — confirm @campfirein/byterover-packages bump before merge or husky pre-push blocks.
  • build:ui not re-verified — gated on the same package bump.
  • x-byterover-session-id header in axios-analytics-http-client.ts:81 — backend contract note still pending.

Verdict

One ship-blocker, unchanged for the seventeenth review in a row. Replace lorem-ipsum disclosure copy in both src/shared/assets/analytics-disclosure.md and src/webui/features/analytics/constants.ts:13-39 with PM/legal-approved text. Resolve the byterover.dev/privacy vs. docs.byterover.dev/privacy URL drift (Finding U) in the same round. Every other finding is recommended-but-not-blocking.

The three new commits represent the most substantive review-response bundle since ENG-2658: Finding E (standing since the 1st review) is fully closed, the anonymous-shipping bug for fresh users is fixed at root, and a security-relevant prototype-pollution guard bypass is closed with a regression test. The disclosure-body lorem ipsum is now the only substantive item between this branch and merge.
· branch: proj/analytics-system-tool-mode

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Event names and
super properties (`device_id`, `cli_version`, `os`, `node_version`,
`environment`) are recorded. No content of your queries, files, or
memory is collected.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (ship-blocking — seventeenth flag): Unchanged for the seventeenth review in a row. Every section body still opens with literal Lorem ipsum. The three new commits in this push (219da05, 1014a9f, b2274ea) close another long-standing prior-review item (the 48-case dispatch switch consolidates to a single table-driven dispatch via ALL_EVENT_SCHEMAS), fix a real anonymous-shipping bug (device_id was previously created only when analytics was enabled, so fresh users silently shipped nothing), and add a security-relevant prototype-pollution guard — but the disclosure body itself remains the sole substantive item standing between this branch and merge-eligibility.

Four formal consent surfaces (oclif text-mode brv settings set analytics.share true, oclif text+--yes, TUI confirm-disclosure overlay, agent-driven --yes from the onboarding skill) plus the WebUI Privacy panel default-render this exact lorem ipsum text.

The fix is two files (this markdown + src/webui/features/analytics/constants.ts:13-39 for parity) — no external dependency, no submodule bump, no agent-team handoff. With the ENG-3035 rename to analytics.share, the eventual PM/legal round can lean on the key name itself to carry the trust-separation framing — making the rewrite tighter than it would have been before.

Fix this →

} catch {
// Defensive: never crash the emitter.
}
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise — Finding E closed (was standing since the 1st review). The 330-line, 48-case dispatch switch has collapsed to a clean 6-line two-layer validation. The first-review suggestion was exactly the shape that landed: ALL_EVENT_SCHEMAS[event].safeParse(properties ?? {}) + table-driven forward. The introduction of IWireEventTracker as a segregated interface (ISP) is the right composition move — the transport handler depends only on the single method it uses (trackEvent), so the many IAnalyticsClient test doubles that emit via the typed track<E> path are unaffected.

The properties ?? {} injection at line 54 is a subtle correctness detail worth noting: events with no required properties (e.g. daemon_start) had to be special-cased in the old switch (track(name) with no args); now safeParse({}) validates them uniformly, which is exactly the simplification the first-review comment suggested. The pinned test update at analytics-handler.test.ts:107-112 ("recorded event-level properties are an empty object" instead of "undefined") makes the new contract loadbearing.

The handler's docblock at lines 16-35 is exemplary — names the two validation layers, names the single source of truth, names the silent-drop contract. Future maintainers won't need to spelunk for "why doesn't this throw on malformed input?".

// guard, and `ALL_EVENT_SCHEMAS[event]` would then resolve to an inherited
// Object.prototype member (a Function, not a Zod schema) whose `.safeParse`
// is undefined — throwing at the wire-side dispatch. Own-property only.
return typeof value === 'string' && Object.hasOwn(ALL_EVENT_SCHEMAS, value)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise (security-relevant fix). Switching from value in ALL_EVENT_SCHEMAS to Object.hasOwn(...) closes a prototype-pollution-adjacent guard bypass that was genuinely exploitable: a client sending {event: 'toString', properties: {x: 1}} would have passed the guard ('toString' in obj is truthy because toString is inherited from Object.prototype), and then ALL_EVENT_SCHEMAS['toString'] would resolve to Function.prototype.toString — a Function whose .safeParse is undefined — throwing a TypeError that escaped the handler. Same for constructor, valueOf, hasOwnProperty, __proto__, etc.

The docblock comment at lines 169-173 names the failure mode precisely; the regression test at analytics-handler.test.ts:127-152 walks five of the prototype keys and asserts both "must not throw out of the handler" AND "must not reach track" — pins both the safety and the security properties simultaneously. The kind of defense-in-depth fix where a single 7-character change (inObject.hasOwn) closes a class of bugs that no integration test would have caught organically.

non-blocking observation: Worth noting in the docblock that this is also the only place in the handler chain where untrusted client input names a JS property — every other field in the wire payload (properties: Record<string, unknown>) goes through Zod parse, which already rejects prototype-key collisions. So the guard here is the single load-bearing line for prototype-key safety on the entire analytics:track surface.

() => {},
() => {},
)
return next
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise (anonymous-shipping correctness fix). The ensureDeviceId + bootstrap seeding + per-flush belt-and-suspenders triple is the right shape for the bug the commit message describes: previously device_id was only created on the path that flipped analytics.share true, so a fresh / never-authed user recorded events with an empty device_id, the wire schema and sender then dropped them on shipping, and anonymous telemetry was silently broken. With this push:

  1. Bootstrap path (feature-handlers.ts:204-210) seeds the id BEFORE AnalyticsClient is constructed and BEFORE any track() can fire, independent of analytics.share and of auth state.
  2. Per-flush defense (wired through analytics-client.ts:404 via ensureDeviceId?: () => Promise<void> dep) guards against a config wiped at runtime — idempotent + serialized via writeChain so concurrent calls converge on one id.
  3. Single seeder (ensureDeviceIdOn at line 287) is the canonical "seed if missing" — no divergent randomUUID() paths. doSetAnalytics now reuses it (line 238), closing the prior duplicate generation surface.

The "Seeding the id is NOT consent to ship" framing in both docblocks (lines 87 and 209) is exactly right — pre-emptively answers the privacy reviewer who'll wonder whether seeding an identifier on a default-off install constitutes telemetry. It does not; the id never leaves the device until analytics.share is true.

non-blocking question: Is there a race between the bootstrap-time seed and the first flush() if the daemon is started with analytics.share already true? setupFeatureHandlers awaits ensureDeviceId() synchronously before constructing AnalyticsClient, so the bootstrap path is sequential — but the per-flush ensureDeviceId path is on the WriteChain, which means a bootstrap seed + a flush firing within the same tick will queue the flush's redundant ensureDeviceId after the bootstrap's, paying an extra globalConfigStore.read(). Cheap to verify it's a no-op when the bootstrap already won, but worth a one-line test pinning the invariant.

if (!response.enabled) return 'Analytics sharing: disabled'

return [
'Analytics sharing: enabled',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise (closes Finding O further). Relabeling Analytics: enabled/disabledAnalytics sharing: enabled/disabled is the third compounding piece of the trust-separation framing fix (alongside the analytics.share key rename in ENG-3035 and the descriptor honesty at settings.ts:200). With this push, all three CLI-facing surfaces — the key name, the descriptor description, AND the status snapshot label — now consistently surface that the toggle gates SHARING, not local capture. A user who reads any one of them encounters the same mental model.

The inline comment at lines 56-59 makes the reasoning explicit ("local analytics tracking is unconditional (AnalyticsClient.track always records to JSONL), so the toggle this reflects is SHARING") — the kind of code-comment that compounds with the AnalyticsClient docblock refresh in b2274ea ("Do NOT reintroduce an isEnabled() gate here") to forward-defend the invariant against drift.

observation (non-blocking): The JSON envelope at formatAnalyticsStatusJson (lines 80-97) still uses enabled as the JSON wire key for the same field. The text label changed; the JSON key didn't. Defensible (avoids breaking --format json scripted callers), but a future PM/legal review might want the JSON envelope to also linguistically separate "tracking enabled" from "sharing enabled" once the disclosure-md round lands. Worth a one-line decision record on intentional asymmetry between text vs. JSON surfaces.

</Text>
</Box>
</Box>
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise + small observation: The renderDetailOverlay shape closes a real UX gap — analytics.status is multi-line by nature (enabled flag + last-flush timestamp + queue depth + dropped count + backoff state + endpoint), and the list row could only show the headline ('Analytics sharing: disabled' or the first line). Enter now opens the full block; Esc returns to the list (the comment at lines 460-462 makes the back-stack scope explicit: "does NOT exit the page").

The split between displayCurrent (single-line headline for the table) and displayDetail (full block for the overlay) at format-settings.ts:144-152 keeps the table layout aligned without losing information — a future read-only key with a 3-paragraph snapshot (think connectors.health) lands mechanically without re-architecting.

non-blocking observation: displayDetail is undefined on editable rows, and renderDetailOverlay correctly falls back to displayCurrent for that case (line 469). But the Enter binding at lines 199-206 routes ONLY type === 'readonly-info' rows into detail mode — editable rows still go through the edit path. So the fallback in renderDetailOverlay is structurally unreachable from the current binding. Defensive code that future maintainers won't need to debug is fine, but a one-line "fallback covers a future caller that routes editable rows into detail mode" comment would prevent the question.

* track; re-enabling ships the backlog on the next flush. So `track()`
* never calls `isEnabled()` and is never a no-op. (Do NOT reintroduce an
* `isEnabled()` gate here: that was the bug where fresh/never-authed users
* recorded nothing and shipped no telemetry.)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise (regression-prevention docblock). Lines 88-95 spell out the unconditional track() contract three ways: (1) what's unconditional (local recording), (2) what's gated (remote send via flush()), and (3) what the disabled-then-enabled boundary semantics are (backlog ships on next flush). The "Do NOT reintroduce an isEnabled() gate here: that was the bug where fresh/never-authed users recorded nothing and shipped no telemetry" sentence is the rare kind of code comment that names the historical failure mode — exactly what a future maintainer needs to resist the temptation to "optimize" a disabled session by no-op'ing track().

Combined with the per-flush docblock at lines 96-101 ("flush() calls isEnabled() synchronously"), the symmetry is now load-bearing: track() never calls isEnabled(), flush() always does. The cache-must-be-populated invariant on flush() also fails loudly rather than miscounting silently — exactly the right calibration for a bootstrap-misconfig surface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants