Engine API stalls, DB read tx timeout + engine::tree SendError, op‑node forkchoice timeouts #863

@artemrootman

We observe recurring stalls in the Engine API (reth) that cause op-node forkchoiceUpdated/newPayload calls to time out and l2_unsafe to grow. CPU, RAM, and IO look normal. The stall persists until we restart the containers; after the restart, op-node re-syncs.

Version: ghcr.io/base/node-reth:v0.14.3

Symptoms (op-node log):

  • Repeated Post "http://reth:8551": context deadline exceeded while inserting payloads.
  • Failed to share forkchoice-updated signal and Engine temporary error.
  • op-node eventually starts EL sync again after restart.

Reth log evidence

  • engine::tree::payload_processor::multiproof → "read transaction has been timed out" (DatabaseErrorInfo, code -96000).
  • Followed by engine::tree "Failed to send internal event: SendError" spam.
  • Also Invalid block on new payload with blob gas used mismatch: got 0, expected …
  • After restart, reth logs: waiting for first Flashblock and could not process Flashblock ... recently restarted or syncing.
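
To show how often each of these signatures appears and in what order they first show up, a small triage script along the following lines may help. This is only a sketch: the substrings are copied from the log excerpts quoted above, while the label names and the functions `classify` and `tally` are hypothetical helpers made up for illustration, not part of reth or op-node.

```python
import re
from collections import Counter

# Substrings taken from the log excerpts quoted in this report.
# The label names on the left are made up for this sketch.
SIGNATURES = {
    "engine_api_timeout": re.compile(r"context deadline exceeded"),
    "db_read_tx_timeout": re.compile(r"read transaction has been timed out"),
    "internal_send_error": re.compile(r"Failed to send internal event"),
    "blob_gas_mismatch": re.compile(r"blob gas used mismatch"),
    "flashblock_wait": re.compile(r"waiting for first Flashblock"),
}


def classify(line):
    """Return the label of the first signature found in the line, or None."""
    for label, pattern in SIGNATURES.items():
        if pattern.search(line):
            return label
    return None


def tally(lines):
    """Count matches per label and record the line index of each first match."""
    counts = Counter()
    first_seen = {}
    for index, line in enumerate(lines):
        label = classify(line)
        if label is None:
            continue
        counts[label] += 1
        first_seen.setdefault(label, index)
    return counts, first_seen
```

Feeding the combined op-node and reth logs through `tally` (for example, lines read from a saved log file) gives a count per signature plus the position of its first occurrence, which makes it easier to check whether the read transaction timeout really precedes the SendError spam in every incident.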

Impact

  • op_node_default_refs_time alert fires (>200).
  • l2_unsafe grows until we restart containers.
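
The stall condition behind this alert can be sketched as below. It assumes, without confirmation from this report, that the alert compares wall-clock time against the timestamp of the unsafe L2 head and that the 200 threshold is in seconds; the function names and the sample format are hypothetical.

```python
def head_lag_seconds(now_unix, unsafe_head_unix):
    """Lag between wall-clock time and the unsafe L2 head timestamp (assumed seconds)."""
    return max(0.0, float(now_unix) - float(unsafe_head_unix))


def is_stalled(samples, threshold=200.0):
    """Decide whether the unsafe head looks stalled.

    samples: time-ordered list of (wall_clock_unix, unsafe_head_unix) pairs.
    Reports a stall only when the latest lag exceeds the threshold AND the
    head timestamp did not advance between the last two samples, so a node
    that is behind but still catching up is not flagged.
    """
    if len(samples) < 2:
        return False
    (_, prev_head), (now, head) = samples[-2], samples[-1]
    return head_lag_seconds(now, head) > threshold and head <= prev_head
```

Requiring both conditions separates the situation described here (head frozen until restart) from the post-restart phase, where the lag is still large but the head is moving again.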

Expected

  • Engine API remains responsive; no DB read tx timeout or internal event channel failures under normal load.
