feat(stats): add time_field toggle to memories-timeseries chart #1246

Open
aliu-ronin wants to merge 2 commits into vectorize-io:main from aliu-ronin:feat/timeseries-time-field

@aliu-ronin
Contributor

Summary

The Memories ingested chart on the bank dashboard always buckets by memory_units.created_at (ingest time). For a bank built up in real time, ingest time ≈ event time and that's the right default. But when a corpus is backfilled in a single session — for example migrating from another memory system — every record's created_at collapses to the import moment, so the chart shows "all knowledge is new" and the underlying timeline disappears.

Adds a time_field query parameter (and a three-way UI toggle) that lets the viewer pick which timestamp column drives the bucket assignment:

| Value | Meaning |
| --- | --- |
| created_at (default) | Ingest time; matches old behavior exactly |
| mentioned_at | Event time: when the fact was mentioned |
| occurred_start | Event time: when the underlying event started |

For the event-time columns we apply COALESCE(<col>, created_at) per row, so records lacking an event timestamp still show up in the chart instead of silently disappearing. The field is validated against an allowlist (never interpolated from untrusted input), unknown values fall back to created_at, and the chosen column is echoed back in the response so the UI can show which dimension is active.
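As a rough illustration of the allowlist-plus-COALESCE idea described above, the following sketch resolves the requested field and composes a bucket expression. The function names, the constants, and the use of PostgreSQL's `date_trunc` at day granularity are assumptions made for this example; the actual code in `memory_engine.get_memories_timeseries` may be structured differently.

```python
# Illustrative sketch only: names and the day-granularity date_trunc call
# are assumptions, not the PR's actual implementation.
ALLOWED_TIME_FIELDS = ("created_at", "mentioned_at", "occurred_start")
DEFAULT_TIME_FIELD = "created_at"


def resolve_time_field(requested):
    """Return an allowlisted column name; anything else falls back to the default."""
    if requested in ALLOWED_TIME_FIELDS:
        return requested
    return DEFAULT_TIME_FIELD


def build_bucket_expr(requested):
    """Return (resolved_column, sql_expression) for the bucket assignment.

    Only names taken from ALLOWED_TIME_FIELDS ever reach the SQL string,
    so the caller-supplied value itself is never interpolated.
    """
    column = resolve_time_field(requested)
    if column == DEFAULT_TIME_FIELD:
        source = "created_at"
    else:
        # Rows without an event timestamp fall back to ingest time.
        source = f"COALESCE({column}, created_at)"
    return column, f"date_trunc('day', {source})"
```

Returning the resolved column alongside the expression makes it straightforward to echo the effective field back to the caller, including when an unknown value was silently replaced by the default.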

Changes

Backend (hindsight-api-slim)

  • memory_engine.get_memories_timeseries gains a time_field keyword arg, validates it against an allowlist, and composes bucket_expr with COALESCE for the event-time columns.
  • MemoriesTimeseriesResponse adds a time_field field.
  • API endpoint exposes time_field as a query param with a clear description.
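To make the per-row fallback and the echoed `time_field` concrete, here is a small in-memory emulation of the bucketing. It does not touch a database; `MemoryUnit`, `TimeseriesResult`, and `bucket_by_day` are hypothetical stand-ins for the real table and for `MemoriesTimeseriesResponse`, and the sample rows are synthetic, not data from this PR.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional

ALLOWED_TIME_FIELDS = ("created_at", "mentioned_at", "occurred_start")


@dataclass
class MemoryUnit:
    """Hypothetical stand-in for a memory_units row."""
    created_at: datetime
    mentioned_at: Optional[datetime] = None
    occurred_start: Optional[datetime] = None


@dataclass
class TimeseriesResult:
    """Hypothetical stand-in for the response shape."""
    time_field: str          # echoed back so a UI can label the chart
    buckets: Dict[str, int]  # ISO date -> number of memories


def bucket_by_day(units, time_field="created_at"):
    # Unknown values fall back to the default column.
    if time_field not in ALLOWED_TIME_FIELDS:
        time_field = "created_at"
    counts = Counter()
    for unit in units:
        stamp = getattr(unit, time_field)
        if stamp is None:
            # Per-row equivalent of COALESCE(<col>, created_at).
            stamp = unit.created_at
        counts[stamp.date().isoformat()] += 1
    return TimeseriesResult(time_field=time_field, buckets=dict(sorted(counts.items())))


# Synthetic sample: three rows imported on the same day, one without event time.
sample = [
    MemoryUnit(datetime(2024, 4, 23, 10), mentioned_at=datetime(2024, 3, 26, 9)),
    MemoryUnit(datetime(2024, 4, 23, 10), mentioned_at=datetime(2024, 3, 28, 9)),
    MemoryUnit(datetime(2024, 4, 23, 11)),
]
by_ingest = bucket_by_day(sample)
by_mention = bucket_by_day(sample, "mentioned_at")
```

With this sample, `by_ingest` puts all three rows in a single ingest-day bucket, while `by_mention` spreads two of them over their event dates and leaves the row without `mentioned_at` on its ingest date rather than dropping it.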

Control Plane (hindsight-control-plane)

  • getMemoriesTimeseries SDK helper accepts a timeField argument, defaulting to created_at.
  • /api/stats/[agentId]/memories-timeseries proxy forwards the param.
  • bank-stats-view.tsx renders an Ingested / Mentioned / Occurred segmented control below the period selector; the card title updates so the chart reads unambiguously (e.g. "Memories by mentioned time").

Screenshots

On a bank migrated from another memory system, toggling to Mentioned spreads the chart across the actual event timeline instead of a single ingest-day spike:

  • created_at → two-day spike (04-23, 04-24)
  • mentioned_at → spread across 03-26 through 03-29 (real history)

Test plan

  • curl .../memories-timeseries?period=30d&time_field=mentioned_at returns non-zero buckets spanning the corpus's actual timeline
  • time_field=nonsense falls back to created_at (parity with the invalid-period behavior)
  • Response includes time_field so the UI can display the active dimension
  • Control plane chart updates immediately when the toggle changes, with no stale frames

Notes

`/stats/memories-timeseries` always bucketed by `created_at` (ingest
time). For a bank built up in real time, ingest time ≈ event time and
that's the right default. But when a corpus is backfilled in a single
session — for example migrating from another memory system — every
record's `created_at` collapses to the import moment, so the chart
shows "all knowledge is new" and hides the underlying timeline.

Adds a `time_field` query parameter that lets the caller choose which
timestamp column drives the bucket assignment:

- `created_at` (default, unchanged) — ingest time
- `mentioned_at` — event time (when the fact was mentioned)
- `occurred_start` — event time (when the underlying event started)

For the event-time columns we `COALESCE(<col>, created_at)` per row so
records lacking an event timestamp still show up somewhere instead of
silently disappearing. The field is whitelisted (never interpolated
from untrusted input), unknown values fall back to `created_at`, and
the chosen column is echoed in the response for UI affordance.

Depends on the tz-aware bucket fix in vectorize-io#1245 (kept as a separate commit).

Surfaces the new `time_field` backend option as a three-way toggle next
to the period selector on the "Memories ingested" card:

- **Ingested** — bucketed by `created_at` (default, matches old behavior)
- **Mentioned** — bucketed by `mentioned_at` (event time)
- **Occurred** — bucketed by `occurred_start` (event time)

The card title also updates to reflect which dimension is in view so
the chart reads unambiguously.

Propagates `time_field` through the control-plane proxy
(`/api/stats/[agentId]/memories-timeseries`) and the typed SDK
(`client.getMemoriesTimeseries`). Defaults stay `created_at` everywhere
so behavior is backward-compatible.
Collaborator

@nicoloboschi left a comment

LGTM

@nicoloboschi
Collaborator

@aliu-ronin can you resolve conflicts?


2 participants