Skip to content

1.154.0 by Claude#61

Closed
tulir wants to merge 44 commits into
upstream-1.154.0from
beeper-1.154.0-claude
Closed

1.154.0 by Claude#61
tulir wants to merge 44 commits into
upstream-1.154.0from
beeper-1.154.0-claude

Conversation

@tulir

@tulir tulir commented Jun 10, 2026

Copy link
Copy Markdown
Member

No description provided.

@indent

indent Bot commented Jun 10, 2026

Copy link
Copy Markdown
PR Summary

Rebases the long-running Beeper fork onto upstream Synapse v1.154.0. The latest force-push (a55d17f9e) is a CI-only change: it bumps stale GitHub Actions and runtime versions to unblock the test-sytest and test-complement jobs that were dying on infra issues against v1.154's PEP-621 pyproject.toml. All earlier rebase fixes (Tuple/Dict imports, run_as_background_process, FilteredEvent wrapping in sync previews, Clock.sleep/looping_call switching to Duration, BeeperStore self.clock, FrozenEvent test import, tests/server.py zeroing SLEEP_AFTER_TASK_S) are retained from ac821ab1c.

  • .github/workflows/beeper-ci.yaml: bump actions/checkout@v2\u2192v6, actions/setup-python@v2\u2192v6, actions/upload-artifact@v4\u2192v7, docker/setup-qemu-action@v2\u2192v4, docker/setup-buildx-action@v2\u2192v4, docker/login-action@v2\u2192v4 across all five jobs; drops the stale moby/buildkit:v0.10.6 driver-opts pin from build-python
  • .github/workflows/beeper-ci.yaml: bumps lint/test runners from Python 3.12 \u2192 3.13
  • .github/workflows/beeper-ci.yaml: switches the sytest container from matrixdotorg/sytest-synapse:bullseye to :bookworm so the in-container Poetry can parse v1.154's requires-poetry / [project] block
  • .github/workflows/beeper-ci.yaml: adds pip install poetry && poetry install to the test-complement "Install dependencies" step so scripts-dev/complement.sh finds poetry on the host runner

Issues

4 potential issues found:

  • In beeper_aggregate_notification_counts, max_stream_ordering = max(orders) returns a 1-tuple (rows from RETURNING event_stream_ordering are sequences), then the UPDATE parameter (max_stream_ordering,) becomes ((1234,),) and the driver tries to bind a tuple into a BIGINT column. This trigger only fires when experimental.beeper_user_notification_counts_enabled is turned on, but when it does the failure is silently swallowed by the surrounding except OperationalError and the stream-ordering watermark never advances. → Autofix
  • SetBeeperDone log line still reads body.get('at_order') at the top level after commit d493df2ed moved the actual lookup to body['done']['at_order'], so the log message always shows at_order=None regardless of what the client sent. → Autofix
  • content = content = {"event_id": event_id, **(extra_content or {})} in the read-marker handler is a double-assignment typo \u2014 functionally equivalent to a single content = ..., just messy. → Autofix
  • synapse/rest/client/receipts.py passes the entire request body as extra_content, which leaks bookkeeping fields (com.beeper.allow_backward, thread_id) into stored/federated receipt data and lets a client override the server-generated ts. The matching read-marker endpoint correctly uses the narrower body.get("com.beeper.read.extra", None) \u2014 these two should match. → Autofix
7 issues already resolved
  • BeeperStore uses self._clock.looping_call(...) and await self._clock.sleep(1.0), but SQLBaseStore exposes the clock as self.clock (set on line 59 of synapse/storage/_base.py). Mypy flags both spots as attr-defined; at runtime this only fires when experimental.beeper_user_notification_counts_enabled is on, but then every code path that touches the aggregation loop will AttributeError. (fixed by commit a97acf4)
  • tests/push/test_push_rule_evaluator.py uses FrozenEvent(...) at lines 1086, 1100, and 1180 but never imports it (the import dropped during rebase), so the entire test file fails to collect with NameError: name 'FrozenEvent' is not defined. (fixed by commit fa8d426)
  • synapse/handlers/device.py:195 calls bare run_as_background_process(...) but only wrap_as_background_process is imported in this file, so any worker with run_background_tasks=true and a configured delete_stale_devices_after will crash with NameError at DeviceHandler.__init__. (fixed by commit 7831797)
  • synapse/util/task_scheduler.py:494 calls await self._clock.sleep(TaskScheduler.SLEEP_AFTER_TASK_S) with the bare int 1, but the upstream Clock.sleep API was changed to require a Duration; the call will raise TypeError after every scheduled task once the post-task sleep is exercised. (fixed by commit ff062bf)
  • The Beeper preview de-duplication in synapse/handlers/sync.py:2965 does if ev.event_id == preview_event_id:, but batch.events is now list[FilteredEvent] (upstream wrapped events with per-user membership). FilteredEvent has no event_id attribute, so when beeper_previews=true is set and the preview event sits in the current batch this raises AttributeError. (fixed by commit 5abe91b)
  • synapse/rest/client/account_data.py uses Tuple[int, JsonDict] on the two new Beeper servlets but Tuple is no longer imported (upstream PR #19111 dropped it during the rebase), so importing the module raises NameError: name 'Tuple' is not defined at class-body evaluation and synapse cannot start. (fixed by commit f8487c5)
  • synapse/push/bulk_push_rule_evaluator.py annotates _should_count_as_unread's related_events parameter as Dict[str, Dict[str, Any]], but Dict is not imported and there is no from __future__ import annotations, so loading the module (via server.py) raises NameError: name 'Dict' is not defined and the homeserver cannot start. (fixed by commit a97acf4)

CI Checks

lint-style, lint-types, build-python, and test-trial are all green on a55d17f9e. The infra-fix workflow changes worked: both sytest and complement now actually run synapse and execute tests instead of dying in setup. Two individual tests fail, neither blocking and neither caused by the code changes in this branch:

  • test-sytest (27274635537): 814 passed, 1 failed \u2014 Test 281 "Real non-joined user can call /events on world_readable room" timed out waiting for a polling /events call. The previous run (27274632295) had this same test pass on the same SHA, so it's a flake. Not retried, the rest of sytest is healthy.
  • test-complement (both runs): one subtest fails: TestThreadsEndpoint \u2014 room_threads_test.go:89: HaveInOrder: index 0 got [thread1] want [thread2, thread1]. The test sends two thread roots + replies and immediately GETs /_matrix/client/v1/rooms/.../threads; synapse returns only the first thread. Most likely a race against the threads table being populated. Two Beeper-only changes in this PR can plausibly worsen the race (and are not present upstream): Disable bundled aggregations (synapse/handlers/relations.py:436 short-circuits get_bundled_aggregations to {}, which get_threads calls during serialization) and Temporarily reduce scheduled task load (MAX_CONCURRENT_RUNNING_TASKS = 2 plus SLEEP_AFTER_TASK = Duration(seconds=1) in synapse/util/task_scheduler.py). Worth a rerun to confirm flake vs. regression \u2014 if it reproduces, the thread root count gap points at the throttled scheduler, not the aggregations stub.
Failing test-sytest
  • Single test timeout (Test 281 Real non-joined user can call /events on world_readable room): the test polls /events for an event that never arrives within the timeout. The very same SHA passed test-sytest on the parallel run 27274632295, so this looks like a flake. Suggest re-running the job; nothing in the PR diff touches /events, world_readable rooms, or peeking.
Failing test-complement
  • TestThreadsEndpoint failed with HaveInOrder: index 0 got [thread1] want [thread2, thread1] — the /threads endpoint returned only 1 of 2 thread roots immediately after the events were sent. Same failure on the parallel run, so it's reproducible on this SHA. Most likely a propagation race against the threads table that the Beeper-only MAX_CONCURRENT_RUNNING_TASKS = 2 + SLEEP_AFTER_TASK = Duration(seconds=1) throttle in synapse/util/task_scheduler.py makes more likely to hit. The Disable bundled aggregations short-circuit (synapse/handlers/relations.py:436) is a less likely cause since it only affects per-event aggregation content, not the chunk count, but worth ruling out. Suggest a complement rerun to confirm reproducibility, then consider raising MAX_CONCURRENT_RUNNING_TASKS back to upstream defaults (5) for CI or temporarily setting TaskScheduler.SLEEP_AFTER_TASK = Duration(seconds=0) in the complement image, mirroring what tests/server.py already does for the in-process test reactor.

⚡ Autofix All Issues

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4205ef88-5b38-4011-b32f-55ca9b255718

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch beeper-1.154.0-claude

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread synapse/rest/client/account_data.py Outdated
Comment thread synapse/push/bulk_push_rule_evaluator.py Outdated
user_id=requester.user,
event_id=event_id,
thread_id=thread_id,
extra_content=body,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Receipt payload pollution + client-controllable ts. Passing the entire request body as extra_content means the receipt data computed in handlers/receipts.py:198-201 becomes {"ts": int(self.clock.time_msec()), **body} — so:

  1. thread_id (already pulled out and passed separately) ends up duplicated inside the receipt content.
  2. com.beeper.allow_backward (a protocol flag, not user-visible data) is stored and federated as part of the receipt.
  3. Any client can set ts in the body and override the authoritative server-generated timestamp, which other servers/clients will then see as truth.

The companion synapse/rest/client/read_marker.py:109,119 already handles this correctly by reading from a Beeper-namespaced sub-object:

extra_content=body.get("com.beeper.fully_read.extra", None),
...
extra_content=body.get("com.beeper.read.extra", None),

Mirror the same pattern on both branches here (line 134 for FULLY_READ, line 143 for the regular receipts path).

Suggested change
extra_content=body,
extra_content=body.get("com.beeper.read.extra", None),

user_id, room_id, "com.beeper.inbox.done", done
)
logger.info(
f"SetBeeperDone done_delta_ms={delta_ms} at_order={body.get('at_order')}"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log statement reads the wrong path. Commit d493df2ed ("Look for done.at_order in the correct place in inbox_state endpoint") moved the real lookup to body["done"]["at_order"] two lines above, but this log line was left reading body.get('at_order') at the top level. As a result it always prints at_order=None, which defeats the whole purpose of the debug line. Use body['done'].get('at_order') to match the lookup.

Suggested change
f"SetBeeperDone done_delta_ms={delta_ms} at_order={body.get('at_order')}"
logger.info(
f"SetBeeperDone done_delta_ms={delta_ms} at_order={body['done'].get('at_order')}"
)

logger.info("No user counts aggregated")
return 0

max_stream_ordering = max(orders)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latent: binds a tuple into a BIGINT column when the experimental flag is enabled. txn yields rows as sequences, so orders looks like [(123,), (456,), ...]. max(orders) therefore returns (456,), and the parameter list two lines below — (max_stream_ordering,) — ends up as ((456,),), binding a tuple where the SQL expects a scalar BIGINT. The resulting OperationalError is swallowed by the except block lower down, so the beeper_user_notification_counts_stream_ordering watermark would silently fail to advance and the aggregation would never make progress.

Gated on experimental.beeper_user_notification_counts_enabled (default False) and pre-existing since beeper-1.140, so production isn't hit today — flagging because the rebase keeps it in 1.154. Fix is one line:

max_stream_ordering = max(order[0] for order in orders)
Suggested change
max_stream_ordering = max(orders)
max_stream_ordering = max(order[0] for order in orders)


if should_update:
content = {"event_id": event_id}
content = content = {"event_id": event_id, **(extra_content or {})}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: content = content = {...}. Python evaluates the RHS once and assigns to content twice — no behavior change, just drop the duplicate assignment.

Suggested change
content = content = {"event_id": event_id, **(extra_content or {})}
content = {"event_id": event_id, **(extra_content or {})}

Comment thread synapse/handlers/device.py Outdated
Comment thread synapse/handlers/sync.py Outdated
Comment thread synapse/util/task_scheduler.py Outdated
Comment thread synapse/storage/databases/main/beeper.py Outdated
Fizzadar and others added 25 commits June 10, 2026 14:41
Annoying! Will lave that to upstream and we'll get updates as we rebase.
Signed-off-by: Sumner Evans <sumner.evans@automattic.com>
Still broken because the opentracing module upstream uses has long been
retried and setuptools broke it.
- reduce hardcoded task parallelism
- add hardcoded post task sleep
@tulir tulir force-pushed the beeper-1.154.0-claude branch from a22f8da to ac821ab Compare June 10, 2026 11:42
@tulir tulir force-pushed the beeper-1.154.0-claude branch from ac821ab to a55d17f Compare June 10, 2026 11:59
@tulir tulir closed this Jun 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

6 participants