Optimize DataStore versioning and reduce allocations#68
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 07bd4107cc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| this.notifyAll(msgID); | ||
| this._version++; |
There was a problem hiding this comment.
Move the version bump before notifying subscribers
When a live frame is ingested after useAllLatestMessages has subscribed, the subscriber runs during notifyAll while getVersion() still returns the old value, so the hook's currentVersion !== lastVersion guard skips rebuilding the latest-message Map. A dashboard that is mounted before the first/only CAN frame will not show it until a later write, and live data is always one notification behind; the same ordering appears in batch/clear paths, so the version needs to be incremented before notifying.
Useful? React with 👍 / 👎.
| roundedData[key] = rounded === signal.sensorReading | ||
| ? signal | ||
| : { ...signal, sensorReading: rounded }; |
There was a problem hiding this comment.
Copy signals even when rounding is a no-op
For any signal whose reading is already rounded, the stored sample now keeps the caller's signal object by reference. If the decoded message object is reused or mutated after ingestMessage, historical data in DataStore changes as well; the previous implementation always spread each signal and the new immutability test only covers the changed-rounding case, so no-op values still alias the caller.
Useful? React with 👍 / 👎.
|
@Ellaharding1 can you review these optimizations please, thank you |
|
@haoruizhou just a thought on this since |
…version before notifyAll so subscribers see correct version.
…version before notifyAll so subscribers see correct version. (#68)
…tion (#71) * Pulse health mosaic on advance Update status page to visually pulse the newest column when the health mosaic advances and add a unit test for the behavior. * Add setting to toggle plot axis range source DBC and Dynamic y axis selection * Attempt to fix packet indicator; stream overlay update * Accurate speed calc, rpm alias, and loss recovery Overlay speed calculation updated to derive vehicle speed from wheel RPM using gear ratio and wheel radius (replaces naive rpm/500). Telemetry-store rpm/motor_speed signal aliases switched to INV_Fast_Motor_Speed. universal-telemetry-software: add missing_seq_seconds to track when gaps occurred, clear on sequence reset, cap removal of oldest missing entries, record recovery timestamps and use them to mark per-second status correctly, and use a robust TCP read loop that accumulates all chunks. * Add healthy field to /api/health response for Upptime compatibility * Add /api/timescale-health endpoint exposing TimescaleDB data lag for newest season table * Updated docs to reflect the car-side move to systemd * Consolidate WebSocket protocol and add runtime notes * Add agent orientation Update docs * Update docs * Add relay worker and update flight-recorder docs Add a Cloudflare Wrangler/Durable-Object relay backend under flight-recorder/relay-worker (package.json, src/index.ts, tsconfig.json, wrangler.toml) and a README describing deployment and usage as an optional public live-relay for remote viewers. Update flight-recorder documentation and top-level README to mark the app as internal (protected by Cloudflare Zero Trust), clarify store-and-forward behavior, relationship to the old lte-relay approach, optional live-relay semantics (public/tokenless worker, 1 Hz synthetic heartbeat, demo-source behavior), and DBC access/build notes. Also add a LiveRelayService and update WebSocketService and App.tsx to expose guarded controls for WS Relay vs DB forward so live relay can be used independently of database sync. * Add vitest script and LiveRelayService tests * Mark lap-detector as tabled until GPS hardware * Route heartbeat frames and clean session URL TelemetryHandler: treat both UTS (1999) and flight-relay (0x7FD) CAN IDs as diagnostic heartbeats and ingest them under DIAG_MSG_IDS.HEARTBEAT. RadioStatChips now uses the DIAG_MSG_IDS constant instead of a hardcoded '1999' string. Added TelemetryHandler tests to assert UTS and relay heartbeat frames are routed to the heartbeat diagnostic message. LiveRelayService: strip search and hash when building the /session URL to avoid stale query/hash data; updated its test to stub global fetch and ensure sessions can be created from older ingest URLs. Misc: test cleanup unstubs globals after each run. * Move rag_viz.py and rag_vectors.png to server/installer/sandbox * add file-uploader, lap-detector, slackbot to package cleanup matrix * offline bundle: fix image refs from daq-radio to data-acquisition namespace * Add /health endpoint and system health UI Introduce a unified system health endpoint and UI backed by an atomic health snapshot. - README and MACBOOK_DEPLOY: document the new /health endpoint and troubleshooting steps. - src/data.py: add HEALTH_FILE env var; compute detailed health snapshot (components, timescale, redis, UDP listener, car state, version, clock, stats); perform Redis ping and timescale parsing; record uptime and UDP listener bound state; write snapshot atomically to HEALTH_FILE. - src/status_server.py: add /health handler, helpers to load, finalize and synthesize unknown/stale health payloads, and mark stale snapshots when producer_ts is old; introduce HEALTH_STALE_SECONDS config. - status/index.html: replace pipeline WebSocket-driven status with polling /health, add system health UI (CSS/JS) and rendering logic, update metrics/status text and polling cadence. - tests/test_status_server.py: add tests for loading valid, missing, malformed and stale health snapshots and retain 404 test for unknown endpoints. This change enables a simple JSON health probe for monitoring and makes the status page reflect aggregated infrastructure state even when the car is powered off. * Update package references for data-acquisition repo * Use .env.macbook and opt-in compose profiles Switch MacBook deployment to use the committed deploy/.env.macbook via --env-file and make TimescaleDB, media, and cloudflared optional Docker Compose profiles. Add ENABLE_TIMESCALE_LOGGING (default false) to the env file and update docker-compose.macbook-base.yml to honor the env file and include profiles for timescale/media/tunnel. Update docs (top-level README, universal-telemetry-software/README.md, MACBOOK_DEPLOY.md, WHICH_ONE.md, and offline README) to describe the new minimal default stack, show example profile commands, and note which images are optional for offline saves. * Add auto TimescaleDB logging detection Introduce an "auto" mode for ENABLE_TIMESCALE_LOGGING so the service will probe the configured POSTGRES_DSN and enable the Timescale writer only when the DB is reachable. * Add lan_sender.py: simulated LAN-based car for stack testing without hardware (#70) * Optimize DataStore versioning and reduce allocations. Fix increment _version before notifyAll so subscribers see correct version. (#68) * Add one-command macOS installer for base station * Show internal vs external WebSocket client counts Track WebSocket bridge client counts through Redis as total, internal, and external clients. Treat loopback connections such as the local relay upstream as internal so the status page does not count them as dashboard viewers. Refresh the count periodically while the bridge is running, preventing the health snapshot from falling back to "Client count unavailable" after the Redis key expires. Update the status UI to show external/internal counts and add focused regression coverage. * Add MacBook LAN-sender CI job and tests Add a new GitHub Actions job (macbook-lan-sender) to smoke-test the MacBook one-click stack using deploy/docker-compose.macbook-base.yml. The job loads built Docker images, pulls base images, sets up Python, brings up the MacBook stack, starts lan_sender inside the telemetry container, runs pytest against a new test file, and collects/logs artifacts on failure. Also add tests/universal-telemetry-software/tests/test_macbook_base_lan_sender.py which exercise container health, status page, PECAN, telemetry logs, WebSocket CAN messages, and health metrics. Update CI compose validation to include the macbook compose file and add the new job as a dependency for publish-telemetry. * Update telemetry-ci.yml * Make installer cross-platform (Linux/macOS) Add full Linux support and an interactive --hotspot mode to the one-line base-station installer; keep macOS support. The installer now bootstraps git and Docker on Debian/Ubuntu/Raspberry Pi OS, can enable a NetworkManager-based Wi‑Fi AP (wfr-hotspot) and installs a systemd unit for it. CI workflow updated to validate compose from the repo root and run a linux e2e job that syntax-checks the script, validates compose, runs the installer, asserts core containers, checks service ports, verifies idempotency, and ensures non-interactive hotspot behaviour. Documentation (README and MACBOOK_DEPLOY.md) and WHICH_ONE.md updated with Linux/RPi instructions and hotspot notes. Fix .env macbook DBC_HOST_PATH to be repo-root relative and add wfr-hotspot.service file. * Slackbot missing def, ci * Increase batch size and speed up row counting * Add daily DAQ stats report to Lappy Adds !stats command and a 9 AM ET daily auto-post showing rows logged, CAN message count, and testing duration for yesterday and the past 7 days. Pulls from TimescaleDB wfr26 and monitoring tables directly. Also adds psycopg2-binary to slackbot requirements. * Add light/dark theme and toggle to data-downloadaer * Revert "Increase batch size and speed up row counting" This reverts commit 4621ab3. * Align sandbox prompt-guide.txt.example with the active guide and main-branch slicks API - Use generic placeholders (<season_table>, <signal_1>, etc.) instead of team-specific table and signal names so the example is shareable. - Drop the 'monitoring table holds infra metrics' note (team-internal). - Drop the verified Grafana query patterns section (team-internal sensor lists and table names). - Update env-var references: TIMESCALE_TABLE (with TIMESCALE_SEASON fallback) replaces POSTGRES_TABLE; remove INFLUX_* references. - Update slicks examples to match the merged main API: * discover_sensors() now requires start_time and end_time * connect_timescaledb() now has (dsn, schema, table) signature * env-var auto-connect note uses POSTGRES_DSN + TIMESCALE_TABLE - Same overall structure as the active (gitignored) prompt-guide.txt, but generic. * Wire slicks into sandbox via uv and bump default model to MiniMax-M3 Sandbox slicks wiring (matches the slicks v0.3.0 TimescaleDB backend on the timescaledb-migration branch that was merged to main): - server/installer/sandbox/Dockerfile.sandbox: install uv, then use it for both requirements-docker.txt and the editable slicks install from /slicks_src (a docker compose additional_context pointing at the sibling slicks checkout). uv keeps the image layer smaller than pip and the install deterministic. Added a build-time smoke test that imports slicks to catch future regressions early. - server/installer/docker-compose.yml: * sandbox service: pass slicks source as a build-time additional_context and bind-mount the same path at runtime so live source edits are picked up on the next container recreate without an image rebuild (controlled by SLICKS_HOST_PATH env var, default points at /home/ubuntu/projects/slicks). * sandbox service: export POSTGRES_DSN, TIMESCALE_TABLE, TIMESCALE_SEASON, and POSTGRES_TABLE so slicks auto-connects on import. The published PyPI slicks (0.2.3) is the InfluxDB backend and would not be importable as TimescaleDB-aware code; install from local source instead. - server/installer/sandbox/requirements-docker.txt: drop the 'slicks>=0.2.0' PyPI pin. Slicks is now installed editable from the local source above; the pin was pulling the InfluxDB-only release. - server/installer/sandbox/requirements.txt: add psycopg2-binary (slicks depends on it; listing explicitly so the layer order is stable across slicks extras changes). Code-generator Dockerfile fix: - server/installer/sandbox/Dockerfile: the previous version only COPY'd prompt-guide.txt.example, so the active (gitignored) prompt-guide.txt was silently being ignored at every build. Switched to COPY prompt-guide.txt* ./ (with the existing fallback to copy the example when the active file is missing). This restores the team's customization workflow where the active prompt-guide lives in the build context and gets shipped to the container. Model bump: - server/installer/.env.example: ANTHROPIC_MODEL default MiniMax-M2.7 -> MiniMax-M3. - server/installer/sandbox/code_generator.py: same bump in the os.getenv default so a missing env var falls back to M3. Tested end-to-end: code-generator /api/generate-code returns result.status=success with a PNG, slicks.fetch_telemetry() returns 407k rows from a real TimescaleDB window, and the slackbot -> code-gen -> sandbox chain still works. * root: pin npm overrides to address quick-win Dependabot alerts Adds an npm 'overrides' block to the root package.json to force patched versions of five transitive dependencies that had Dependabot alerts: lodash -> ^4.18.0 (CVE-2026-4800, CVE-2026-2950) qs -> ^6.15.2 (CVE-2026-8723) picomatch -> ^2.3.2 (CVE-2026-32233) fast-uri -> ^3.1.2 (CVE-2026-6321, CVE-2026-6322) serialize-javascript -> ^7.0.5 (CVE-2026-34043) Regenerated package-lock.json with 'npm install --package-lock-only' (no node_modules write, no network risk beyond what 'install' needs to resolve). Verified the lock now pins: picomatch 2.3.2 (remaining 4 packages aren't pulled into this lock). * pecan: bump react-router to ^7.15.0 and add npm overrides for quick-win CVEs react-router was a direct dep pinned at ^7.9.3; multiple CVEs (DoS via path expansion, RCE via turbo-stream, XSS via redirects) require >= 7.15.0. Bumping the direct dep range is cleaner than an override (npm errors out on an override that conflicts with a direct dep). Also adds an npm 'overrides' block to force patched transitive deps: lodash -> ^4.18.0 qs -> ^6.15.2 picomatch -> ^2.3.2 fast-uri -> ^3.1.2 serialize-javascript -> ^7.0.5 Regenerated package-lock.json. Verified the lock now pins: lodash 4.18.1 qs 6.15.2 picomatch 2.3.2 fast-uri 3.1.2 serialize-javascript 7.0.5 react-router 7.16.0 * flight-recorder: bump react-router-dom to ^7.15.0 and add npm overrides react-router-dom was a direct dep pinned at ^7.9.3; same CVE class as pecan's react-router (DoS, RCE, XSS via redirects) requires >= 7.15.0. Bumped the direct dep range. Also adds an npm 'overrides' block to force patched transitive deps: lodash -> ^4.18.0 qs -> ^6.15.2 picomatch -> ^2.3.2 fast-uri -> ^3.1.2 serialize-javascript -> ^7.0.5 Regenerated package-lock.json. Verified the lock now pins: lodash 4.18.1 picomatch 2.3.2 fast-uri 3.1.2 serialize-javascript 7.0.5 react-router-dom 7.16.0 * grafana-bridge: add npm overrides for quick-win Dependabot alerts grafana-bridge is a minimal Express app (single dep) but still pulls in the same vulnerable transitive packages. Adds an 'overrides' block to force patched versions: lodash -> ^4.18.0 qs -> ^6.15.2 picomatch -> ^2.3.2 fast-uri -> ^3.1.2 serialize-javascript -> ^7.0.5 Regenerated package-lock.json. Verified qs pinned at 6.15.2. * charging-dashboard: self-updating Slack TUI from Pecan accumulator Adds a charging dashboard that posts one self-updating Slack message (post once, chat.update every 5s, like the file-uploader) while charging the accumulator via the Kvaser bridge on the internal (pecan-dev) build. Bot (server/installer/slackbot): - charge_dashboard.py: TUI renderer (SoC bar, pack/cell/temp/alert lines, per-module sparkline + stats), per-session manager, stdlib HTTP receiver (POST /charging/state behind CF Zero Trust + shared-token auth), heartbeat staleness. - soc_model.py: data-derived SOC/ETA from the wfr26 audit. PackCurrent/BMS SOC are dead sentinels and PackVoltage is absent, so SOC is an OCV->SOC lookup on the limiting cell and ETA integrates a CV taper; phase is from the voltage trend. - Wired into slack_bot.py __main__; compose port/env; Dockerfile. - unittests: 17 (renderer, formatters, session updates, HTTP auth, model). Pecan: - chargingSnapshot.ts: pure, mockable snapshot builder mirroring the accumulator page; sentinel-guards the dead signals; derives pack voltage from the series-cell sum. - useChargingDashboard.ts: 5s poster, gated on VITE_INTERNAL + live Kvaser source (:9081). Never broadcasts the demo relay or replay data. - vitest: 15 (aggregation, alert chips, dead-signal guards, broadcast gate). Co-authored-by: Haorui Zhou <haorui2002@gmail.com> * Address PR review: fix slackbot CI smoke-test + harden charge receiver - slackbot-ci: stub SocketModeRequest/Response on the slack_sdk stub modules (import was failing with 'cannot import name SocketModeRequest') and provision /app/logs so the module-level mkdir doesn't fail on the runner. - slack_bot: require CHARGE_RELAY_TOKEN before starting the charge-dashboard receiver; the port is host-published and must not accept unauthenticated POSTs. - docker-compose.lan-sender-test: set POSTGRES_DSN to the timescaledb service (default DSN points at localhost, which is the base container here). - websocket_bridge: aclose the Redis client via try/finally so a failing set doesn't leak the connection. - redis_utils: correct get_sync_client docstring (default retries=5, ~8s). --------- Co-authored-by: Haorui Zhou <haorui2002@gmail.com>
Add a monotonic _version counter to DataStore (getVersion) and increment it on all mutating operations so subscribers can skip redundant work. Reduce allocations when rounding sensor readings by only cloning signal objects if rounding changes the value (preserving identity for common cases). Update useAllLatestMessages to snapshot the version and avoid unnecessary Map rebuilds, and batch cold-store state into a single object to cut re-renders. Simplify Trace memo usage that was a no-op. Add tests for version tick and ingestMessage data handling (rounding, immutability, defaults). Files changed: DataStore.ts, useDataStore.ts, Trace.tsx, and DataStore.test.ts.