Skip to content

feat(dev-tools): rotate sessions tool and experiment feature flags#700

Draft
Just-Insane wants to merge 185 commits intoSableClient:devfrom
Just-Insane:feat/developer-tools
Draft

feat(dev-tools): rotate sessions tool and experiment feature flags#700
Just-Insane wants to merge 185 commits intoSableClient:devfrom
Just-Insane:feat/developer-tools

Conversation

@Just-Insane
Copy link
Copy Markdown
Contributor

Summary

Consolidates two developer-tooling branches. Adds a "Rotate Sessions" tool to Developer Tools settings and an Experiments panel for percentage-rollout feature flags, with build-time injection from environment variables.


Changes

feat/devtool-rotate-sessions

  • Add a Rotate Sessions button to the Developer Tools settings panel.
  • Calls prepareToEncrypt() on all joined rooms to force Megolm session rotation, useful for testing encryption key rotation flows.
  • Includes error handling and user feedback via toast/alert.

feat/feature-flag-env-vars

  • deepMerge prototype-pollution fix: block keys like __proto__, constructor, and prototype in the deep-merge utility used to inject runtime config.
  • Build-time config injection: scripts/inject-client-config.js reads CLIENT_CONFIG_* environment variables at build time and merges them into config.json, enabling CI/CD pipelines and GitHub Actions to configure feature flags without modifying the source file.
  • Experiment bucketing (useExperimentVariant): typed helper that deterministically buckets users into experiment variants by percentage rollout, based on user ID hash.
  • ExperimentsPanel: new settings panel under Developer Tools listing active experiments and their variant assignments.
  • useClientConfig test: unit tests covering experiment bucketing logic and the useExperimentVariant hook.

Testing Checklist

  • Developer Tools → "Rotate Sessions": button present, triggers Megolm rotation, error handled gracefully
  • Developer Tools → Experiments panel: visible, lists configured experiments
  • CLIENT_CONFIG_* env vars at build time: merged into runtime config correctly
  • deepMerge with __proto__ key: rejected, no prototype pollution

Changeset

  • feat: Rotate Sessions developer tool
  • feat: Experiment bucketing with env-var-injectable config
  • fix: Prototype-pollution guard in deepMerge

Just-Insane and others added 30 commits May 10, 2026 21:09
- Import KnownMembership from $types/matrix-sdk instead of matrix-js-sdk/lib/types
- Use Promise.allSettled to handle partial failures and report accurate count
- Add window.confirm confirmation before discarding sessions
- Clarify changeset: 'encryption sessions' not 'sessions'
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add scripts/inject-client-config.js which reads HOMESERVER_LIST,
ELEMENT_CALL_URL, EXPERIMENTS and other config keys from the GH Actions
environment and merges them into config.json at build time.
CI workflows pass these through via env; the setup action prints an
injected-config summary in the job summary.
knip.json updated to include the new script as an entry point.
…ntages

useClientConfig.ts gains getExperimentVariant() which deterministically
buckets a userId into a variant using a hash of userId+experimentName, then
checks it against rolloutPercentage. Experiment defaults shape is typed so
all callers get compile-time checking of known experiment names.
ExperimentsPanel shows every experiment name, current variant, rollout
percentage, and whether the user is enrolled — readable without opening
the console. DevelopTools.tsx wires it into the developer settings tab.
- Mount BookmarksFeature in ClientNonUIFeatures so useInitBookmarks()
  is called; bookmarkListAtom was never populated, causing the viewer
  to always show "No Bookmarks Yet" even with data on the server.
- Fix removeBookmark to tombstone any existing item event regardless
  of whether it passes validation, so malformed/orphan bmk_ items
  cannot be resurrected by orphan recovery in listBookmarks.
- Add regression tests: tombstoning malformed item, idempotent tombstoning.
…e() optimistic update

- useInitBookmarks: also trigger loadBookmarks on bookmark item events
  (prefix org.matrix.msc4438.bookmark.*), not just the index event.
  Fixes cross-device sync not updating the UI when sliding sync
  delivers individual item events.
- useBookmarks: remove() now moves the item to bookmarkDeletedListAtom
  optimistically so the Recently Removed section updates immediately.
- Added regression tests for both fixes.
…okmark doc comment

- Wrap grouped bookmark list items in <Fragment key={}> instead of un-keyed <>
- Return null from MessageBookmarkItem when mEvent.getId() is undefined
- Update removeBookmark doc comment to match item-first deletion ordering
Mirrors the fix already applied in useNotificationJumper: when a room
belongs to both the direct-message list and a space, prefer the /direct
route over the space route. Previously useRoomNavigate checked orphan
space parents first, which caused bookmark jumps and room-nav clicks on
DMs-in-spaces to open the room via the space path instead of the direct
path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…croll, fix eventId drag-to-bottom, increase list timeline limit

- useTimelineSync: change auto-scroll recovery useEffect → useLayoutEffect to
  prevent one-frame flash after timeline reset
- useTimelineSync: remove premature scrollToBottom from useLiveTimelineRefresh
  (operated on pre-commit DOM with stale scrollSize)
- useTimelineSync: remove scrollToBottom + eventsLengthRef suppression from
  useLiveEventArrive; let useLayoutEffect handle scroll after React commits
- RoomTimeline: init atBottomState to false when eventId is set, and reset it
  in the eventId useEffect, so auto-scroll doesn't drag to bottom on bookmark nav
- RoomTimeline: change instant scrollToBottom to use scrollToIndex instead of
  scrollTo(scrollSize) — works correctly regardless of VList measurement state
- slidingSync: increase DEFAULT_LIST_TIMELINE_LIMIT 1→3 to reduce empty previews
  when recent events are reactions/edits/state

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restore scrollToBottom call in useLiveEventArrive with instant/smooth
based on sender, add back eventsLengthRef and lastScrolledAt suppression,
restore scrollToBottom in useLiveTimelineRefresh when wasAtBottom, and
revert instant scrollToBottom to scrollTo(scrollSize) matching upstream.

The previous changes removed all scroll calls from event arrival handlers
and relied solely on the useLayoutEffect auto-scroll recovery, which has
timing issues with VList measurement. Upstream's pattern of scrolling in
the event handler and suppressing the effect works reliably.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove behavior parameter from scrollToBottom — always use
  scrollTo(scrollSize) matching upstream. The smooth scrollToIndex
  was scrolling to stale lastIndex (before new item measured),
  leaving new messages below the fold.
- Revert auto-scroll recovery from useLayoutEffect back to useEffect
  (matches upstream). useLayoutEffect fires before VList measures
  new items and before setAtBottom(false) in eventId effect.
- Remove stale scrollCacheForRoomRef that referenced missing imports.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Just-Insane and others added 28 commits May 10, 2026 21:40
Linux/Codespaces-clean branch — removes macOS NVM lazy-loader, Homebrew
paths, macOS-only OMZ plugins, and hardcoded macOS gitconfig paths.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…elay

When navigator.serviceWorker.controller is null — which happens when a
window is opened by a notification tap before the SW has claimed it, or
transiently during a SW update cycle — postMessage calls were silently
dropped.  This caused the decryption-result reply to never reach the SW,
so the 5-second timeout fired and the notification either showed
"Encrypted message" or was skipped entirely.  Subsequent notifications
after the first tended to fail because the tab that received the push
was often the freshly-opened uncontrolled window.

The same null-controller bug affected enablePushNotifications /
disablePushNotifications: if the togglePush message was dropped, the
homeserver pusher state was never updated, so no further pushes arrived.

Fix: mirror the pattern already used by SyncNotificationSettingsWithSW
and setNotificationSettings in the same file — send to both controller
(fast path, usually correct) and registration.active (belt-and-suspenders
via ready.then).  Double delivery is safe: the SW's decryptionPendingMap
deduplicates pushDecryptResult, and duplicate pusher SET/DELETE requests
to the homeserver are idempotent.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Duplicate const introduced during integration merge causing build failure.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The integration rebuild partially included perf/timeline-rendering — it
got the 'align initial room-fill thresholds' and 'stabilize bottom pin'
commits but missed the 'feat(timeline): restore room cache by event anchor'
commit chain. That commit introduced several refs and a utility module that
the later commits depend on, causing a ReferenceError crash on room open:

  readyBlockedByPaginationRef is not defined (already restored earlier)
  saveRoomScrollStateRef / currentScrollFingerprintRef (type-only — no-op without wiring)
  prevScrollSizeRef (content-grow detection in handleVListScroll)
  isReadyRef (scroll-event guard during init)
  RoomScrollFingerprint type (return type of buildRoomScrollFingerprint)

Restore:
- src/app/utils/roomScrollCache.ts (deleted during merge)
- import { roomScrollCache, RoomScrollCache, RoomScrollFingerprint,
  RoomScrollPosition } from roomScrollCache
- const prevScrollSizeRef = useRef(0)
- const isReadyRef = useRef(false); isReadyRef.current = isReady
- const saveRoomScrollStateRef / currentScrollFingerprintRef

The full scroll-position restore wiring (roomScrollCache.save/load,
saveRoomScrollState callback, restoreRoomScrollPosition) is not yet wired
up — the saveRoomScrollStateRef.current stays undefined so those call sites
are no-ops — but all runtime references are now defined and the app no
longer crashes on room open.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- RoomTimeline: remove unused TIMELINE_ANCHOR_SELECTOR, buildRoomScrollFingerprint,
  currentScrollFingerprintRef (incomplete perf/timeline wiring stubs)
- RoomTimeline: narrow import to RoomScrollCache only (RoomScrollFingerprint unused)
- sw-session: remove void operator to satisfy no-void rule
- Auto-fixed prettier formatting in PushNotifications, ClientNonUIFeatures,
  useProcessedTimeline, BackgroundNotifications

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add Discord-style message grouping where consecutive messages from the
same sender within a configurable time window collapse the sender header.

- Add messageGroupingThreshold to Settings type (default: 2 min)
- Replace hardcoded 2-min threshold in useProcessedTimeline with the setting
- Wire useSetting in RoomTimeline and pass to useProcessedTimeline
- Add MessageGrouping component in Experimental settings with 5 options:
  2 min (default), 5 min, 15 min (Discord-style), 30 min, 60 min

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previously skipWaiting() was called automatically in the install event,
causing every new deployment to immediately activate the new SW mid-session.
When users then navigated to a lazy-loaded route, the old chunk hash no
longer existed on the server → 404 → chunk-error handler → silent reload.
This presented as the app 'randomly restarting' after deployments.

Changes:
- Remove auto-skipWaiting() from SW install handler
- Add SKIP_WAITING_AND_CLAIM message handler — the update prompt already
  sends this message; now the SW acts on it with self.skipWaiting()
- Deduplicate the double navigator.serviceWorker.register() call in
  index.tsx (was registering twice with different options in dev mode)
- Move sendSessionToSW into the single registration .then() handler so
  the session is sent as soon as the registration resolves, and keep the
  .ready fallback for the case where registration already settled

The new SW will now sit in 'waiting' state until the user confirms the
update prompt ('A new version of the app is available. Refresh?').
Existing sessions are uninterrupted until then.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previously all three room list virtualizers used `key={vItem.index}`
(position-based). When a room moves position (e.g. a DM jumps to the
top after a new message), React reuses the component at that index
rather than remounting it. This caused `useRoomHasUnread`'s lazy
`useState` initializer to retain stale state from the previous room at
that index — resulting in the unread dot appearing on the wrong room
until the next Timeline/Receipt event fired.

Fix: add `getItemKey` to each `useVirtualizer` config so TanStack
Virtual assigns stable room-ID-based keys, then use `vItem.key` on
`<VirtualTile>`. React now correctly unmounts/remounts when a different
room occupies a position, so all per-room state is fresh.

- Direct.tsx: getItemKey = sortedDirects[index]
- Home.tsx:   getItemKey = sortedRooms[index]
- Space.tsx:  getItemKey = hierarchy[index]?.roomId ?? index

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Matrix presence is per-user on the server, not per-device. When an idle
device sends `setPresence({ presence: 'unavailable' })`, the shared
server state changes for all clients. The active device's PresenceFeature
only re-sends its state when `autoIdled`, `presenceMode`, or
`sendPresence` changes — none of which fire on the active device, so the
idle device permanently 'wins' until the user switches tabs or interacts.

Fix: add a 2-minute heartbeat that re-asserts `{ presence: 'online' }`
Within one heartbeat cycle the active device wins back the server state.
The heartbeat is idle-free (stops when autoIdled or mode changes), so it
doesn't fight against intentional DND/offline/idle changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…r errors

Previously, the pusher was only re-registered with Sygnal on visibility
changes (backgrounding/foregrounding). If the browser rotated the push
subscription endpoint while the app was closed, there was a window from
app open until the first background event where pushes would silently
fail (Sygnal 410s on the stale endpoint).

Also, the togglePusher Promise was fire-and-forget — unhandled rejections
from enablePushNotifications (e.g. pushManager.subscribe() failure due to
transient network error) were silently swallowed, leaving the pusher in an
unknown state with no log trail.

Fixes:
- Call togglePusher once immediately on mount with the current visibility
  state so the endpoint is always current on app startup
- Wrap all togglePusher calls with .catch(debugLog.warn) so failures are
  surfaced in the debug log and Sentry

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… zero

FaviconUpdater's badge-clearing logic only runs inside the roomToUnread
useEffect. If roomToUnread reached highlightTotal=0 while the app was
hidden (background sync from another device), document.visibilityState
was not 'visible' so clearAppBadge was skipped. When the user then
foregrounds the app roomToUnread hasn't changed so the effect doesn't
re-run and the stale badge stays set.

Fix: add a visibilitychange listener that clears the badge on foreground
whenever highlightTotalRef is 0. A ref tracks the latest value so the
listener doesn't need to close over the roomToUnread effect.

Also suppress the pre-existing no-console lint warning on the push relay
error path with an inline disable comment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two fallback paths in getUnreadInfo were accidentally dropped during a
checkpoint commit (2eaefb19d):

1. PushProcessor fallback: when SDK total/highlight are both 0 but
   roomHaveUnread() confirms there ARE unread events (stale SDK counters),
   walk the live timeline with PushProcessor to compute real counts.

2. No-receipt sliding sync fallback: rooms that have never been visited
   in sliding sync have no read receipt, making SDK counts unreliable.
   If the timeline has activity from other users, return total=1 (or the
   existing SDK counts when present) so the dot badge appears.

These fallbacks are essential for non-DM rooms: DMs benefit from
shouldForceDMHighlight which forces highlight=total, but regular rooms
rely entirely on these paths when SDK counts are zero or stale.

Restores parity with the dev branch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove the 24h TTL check from loadPersistedSession(). Sessions cached
  before the persistedAt field was added (age=Infinity) were being
  incorrectly rejected, causing every push notification to fall back to
  the generic 'New Message' text until the user opened the app again.
  Matrix access tokens are long-lived; if a token is truly revoked the
  downstream 401 handling in handleMinimalPushPayload provides the
  graceful fallback already.

- Wrap event.data.json() in a try/catch with a 'New Message' fallback
  notification. A malformed push payload previously caused an unhandled
  rejection in onPushNotification, which could silently drop the push
  event on iOS.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On iOS PWA, visibilityState can get stuck at 'visible' after the user
opens the app via a notification tap and then backgrounds it again.
This caused all subsequent push notifications to be silently suppressed
("first notification works, second and beyond don't").

The fix: require BOTH visibilityState === 'visible' AND client.focused.
The focused property transitions to false immediately when the user
presses the home button (before visibilitychange fires), so it is the
reliable signal that the user is actually looking at the app.

Also log focused state in the existing debug output.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a user opens a room via notification tap and loadEventTimeline
encounters an error (e.g. network not yet ready after iOS app resume),
the onError handler resets to the live timeline but never called
setIsReady(true). This left the timeline permanently invisible
(opacity: 0), producing a blank white screen.

Add onJumpError callback to UseTimelineSyncOptions. RoomTimeline passes
handleJumpError which calls setIsReady(true). The error handler in
useTimelineSync now calls onJumpError() after restoring the live
timeline and scrolling to bottom, so the room becomes visible.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After iOS kills a backgrounded PWA and the app cold-starts, SyncStatus
was showing a Connecting... banner for up to 20 seconds after the loading
screen disappeared. This is because the banner fired on any transition
where previous !== SyncState.Syncing, including the initial
null→PREPARED→SYNCING cold-start sequence.

With sliding sync the second long-poll can take up to 20 s (the poll
timeout) when there are no new messages, keeping the banner visible the
entire time even though the app is fully loaded and usable.

Fix: only show Connecting... when recovering from an actual disconnect
(previous === Reconnecting or previous === Error). The loading screen
already covers the initial connection; the banner is only meaningful
when we are re-connecting after a network disruption.

State transitions:
  - null → PREPARED: no banner (loading screen handles it) ✓
  - RECONNECTING → PREPARED/SYNCING: shows banner ✓
  - ERROR → PREPARED/SYNCING: shows banner ✓
  - SYNCING → SYNCING: no banner ✓

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The SW has two paths where it suppresses OS notifications when the app
appears visible:

1. hasVisibleClient (already fixed): requires both visibilityState === 'visible'
   AND client.focused from clients.matchAll().

2. Decrypt relay result (this commit): after requestDecryptionFromClient(),
   the code checked only result?.visibilityState === 'visible' to decide
   whether the in-app UI was already showing the message.

On iOS PWA, document.visibilityState can get stuck at 'visible' after the
user backgrounds the app because visibilitychange doesn't always fire.
document.hasFocus() is updated by the browser process immediately when the
OS removes window focus (home button press), so it is the reliable signal.

Add appFocused: document.hasFocus() to both pushDecryptResult reply payloads
in ClientNonUIFeatures.tsx.  In sw.ts, require BOTH visibilityState ===
'visible' AND appFocused === true before skipping the OS notification,
mirroring the same dual-guard logic in hasVisibleClient.

This closes the second suppression path that was silently dropping background
push notifications on iOS after the user had previously opened the app from a
notification tap.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous change narrowed the 'Connecting...' banner to only show
on reconnect/error. Reverting to upstream's condition:

  stateData.previous !== SyncState.Syncing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant