Skip to content

fix(rethink): attach connection error handler so a dropped connection doesn't crash the process#1000

Open
JohnMcLear wants to merge 1 commit into
mainfrom
fix/rethink-connection-error-handler
Open

fix(rethink): attach connection error handler so a dropped connection doesn't crash the process#1000
JohnMcLear wants to merge 1 commit into
mainfrom
fix/rethink-connection-error-handler

Conversation

@JohnMcLear
Copy link
Copy Markdown
Member

Problem

The rethink analogue of ether/etherpad#7878 (postgres, #998) and the redis case (#999). The rethinkdb Connection is an EventEmitter that re-emits raw socket errors — idle drop, server restart, failover, or a proxy/firewall closing the connection — as an 'error' event (rethinkdb/net.js: this.rawSocket.on('error', err => this.emit('error', err))). With no listener, Node treats it as an uncaught exception and terminates the host process. rethink_db.ts attached none.

Fix

Attach connection.on('error', …) immediately after connect. This driver holds a single connection and does not auto-reconnect, so the fix is crash-prevention only: a dropped connection is logged instead of taking down Etherpad (subsequent queries fail until re-init). Reconnection would be a larger, separate change.

Test (real, not a wiring assertion)

test/rethink/connection-drop.spec.ts starts a rethinkdb container, warms the connection, then destroys the underlying socket — exactly what a failover / proxy idle-timeout surfaces to the client — and asserts the process survives and the handler logged it.

Verified locally that without the handler the test fails with Uncaught Exception: Error: simulated network drop, and passes with it. rethink isn't in the container test matrix, so it runs via a dedicated rethink-resilience CI job (mirrors the postgres one in #998).

Verification

  • pnpm run lint / format:check / ts-check / build — clean
  • vitest run test/rethink/connection-drop.spec.ts — passes with fix, fails (uncaught) without

Audit context (per-driver sweep from #998)

All ueberDB network backends were checked for this "missing error handler → crash on idle/dropped connection" class. One PR per affected driver:

🤖 Generated with Claude Code

… doesn't crash the process

The rethinkdb Connection is an EventEmitter that re-emits raw socket
errors (idle drop, server restart, failover, a proxy closing the
connection) as an 'error' event. With no listener Node treats it as
uncaught and terminates the host process. The rethink driver never
attached one. This is the rethink analogue of the postgres bug in
ether/etherpad#7878.

This driver holds a single connection and does not auto-reconnect, so
the fix is crash-prevention only: a dropped connection is logged instead
of taking down the application (subsequent queries fail until re-init).

- Attach `connection.on('error', …)` right after connect.
- Add test/rethink/connection-drop.spec.ts: warm the connection, destroy
  the underlying socket (as a failover / proxy idle-timeout would), and
  assert the process survives and the handler logged it. Verified locally
  that this FAILS without the handler ("Uncaught Exception: simulated
  network drop") and PASSES with it.
- rethink is not in the container test matrix, so run it via a dedicated
  `rethink-resilience` CI job (mirrors the postgres one).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@qodo-free-for-open-source-projects
Copy link
Copy Markdown

qodo-free-for-open-source-projects Bot commented Jun 1, 2026

Review Summary by Qodo

(Agentic_describe updated until commit d56a8af)

Prevent RethinkDB connection errors from crashing process

🐞 Bug fix

Grey Divider

Walkthroughs

Description
• Attach error handler to RethinkDB connection to prevent process crash
• Handler logs dropped connections instead of terminating application
• Add behavioral test simulating network drop and verifying survival
• Add dedicated CI job for RethinkDB resilience testing
Diagram
flowchart LR
  A["RethinkDB Connection"] -->|"raw socket error"| B["'error' event emitted"]
  B -->|"without handler"| C["Uncaught exception"]
  C -->|"Node behavior"| D["Process crash"]
  B -->|"with handler"| E["Error logged"]
  E -->|"application survives"| F["Graceful degradation"]

Loading

Grey Divider

File Changes

1. databases/rethink_db.ts 🐞 Bug fix +12/-0

Add error handler to RethinkDB connection

• Attach connection.on('error', …) listener immediately after connection established
• Handler logs connection errors with stack trace to prevent uncaught exceptions
• Includes detailed comments explaining the EventEmitter behavior and crash prevention

databases/rethink_db.ts


2. test/rethink/connection-drop.spec.ts 🧪 Tests +66/-0

Add connection drop survival test

• New behavioral regression test for connection error handler
• Starts RethinkDB container, warms connection, then destroys underlying socket
• Verifies process survives and error is logged by handler
• Test fails without handler (uncaught exception) and passes with it

test/rethink/connection-drop.spec.ts


3. .github/workflows/ci.yml ⚙️ Configuration changes +28/-0

Add RethinkDB resilience CI job

• Add dedicated rethink-resilience CI job for RethinkDB connection testing
• Job runs connection-drop test against RethinkDB container
• Mirrors postgres resilience testing approach from related audit

.github/workflows/ci.yml


Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects
Copy link
Copy Markdown

qodo-free-for-open-source-projects Bot commented Jun 1, 2026

Code Review by Qodo

Grey Divider

New Review Started

This review has been superseded by a new analysis

Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented Jun 1, 2026

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0)

Grey Divider


Remediation recommended

1. Test leaves connection open 🐞 Bug ☼ Reliability
Description
test/rethink/connection-drop.spec.ts creates and initializes a Rethink_db driver but never calls
close(), which can leave open handles/listeners and cause vitest to hang or be flaky. Other
integration tests in this repo consistently close DB connections in teardown hooks.
Code

test/rethink/connection-drop.spec.ts[R34-41]

Evidence
The new test stops the container but never closes the database driver, unlike other tests that
always close after each test/run; the driver itself has a close() method that should be used for
cleanup.

test/rethink/connection-drop.spec.ts[34-36]
test/rethink/connection-drop.spec.ts[38-58]
databases/rethink_db.ts[140-144]
test/lib/test_lib.ts[78-80]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`test/rethink/connection-drop.spec.ts` starts a DB driver connection but never closes it. Even though the socket is forcibly destroyed, leaving the driver unclosed can keep event listeners/handles around and cause flakiness or hangs in CI.

### Issue Context
The driver provides a `close(callback)` method, and existing integration tests close their DBs during teardown.

### Fix Focus Areas
- test/rethink/connection-drop.spec.ts[34-65]

### Proposed fix
- Hoist `driver` to outer scope so it’s accessible in `afterAll`.
- In `afterAll`, call `driver.close(...)` (wrapped in a Promise) before stopping the container.
- Ensure teardown runs even if the test fails (Vitest `afterAll` is fine once `driver` is in scope).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Advisory comments

2. No pnpm cache in job 🐞 Bug ➹ Performance
Description
The new rethink-resilience CI job installs dependencies without reusing the pnpm store cache steps
used by the existing build job, increasing CI time and network usage. This makes the new job slower
and inconsistent with the repo’s other CI configuration.
Code

.github/workflows/ci.yml[R58-78]

Evidence
Within the same workflow, the build job sets up pnpm store caching, but the newly added
rethink-resilience job does not, so it will always reinstall dependencies from scratch.

.github/workflows/ci.yml[25-36]
.github/workflows/ci.yml[58-78]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The new `rethink-resilience` job runs `pnpm install` without using the pnpm store cache setup that the `build` job already uses, making the new job slower and more expensive.

### Issue Context
`build` job in the same workflow uses `pnpm store path` + `actions/cache` keyed by `pnpm-lock.yaml`.

### Fix Focus Areas
- .github/workflows/ci.yml[25-36]
- .github/workflows/ci.yml[58-78]

### Proposed fix
- Copy the existing pnpm cache steps from `build` into `rethink-resilience`:
 - Get pnpm store path (id: pnpm-cache)
 - Cache pnpm store via `actions/cache`
- Keep the cache key consistent (`hashFiles('**/pnpm-lock.yaml')`).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment thread .github/workflows/ci.yml
Comment on lines +59 to +78
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v6

- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 10.33.0

- name: Setup Node.js
uses: actions/setup-node@v6
with:
node-version: 24

- name: Install dependencies
run: pnpm install --frozen-lockfile

- name: Connection-drop survival test
run: pnpm exec vitest run test/rethink/connection-drop.spec.ts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants