Problem
Today `txob` tracks per-handler completion state in `handler_results`, but retry/backoff is event-scoped via a single `events.backoff_until` timestamp.
In `src/processor.ts`, when any handler errors, the processor computes a list of backoffs:
- per-handler custom backoff via `TxOBError.backoffUntil`
- plus the default `backoff(event.errors)`

…and then sets `lockedEvent.backoff_until = max(backoffs)`.
Because this is a single column, one slow/rate-limited handler can delay reprocessing for all other remaining handlers (even if those other handlers could run sooner with a different backoff policy).
We want to decouple handler processing / fanout scheduling so we can configure:
- handler-specific backoff strategies (e.g. webhook handler vs email handler)
- handler-specific retry counters / max errors
- (optionally) handler-specific concurrency/priority in the future
Current behavior (references)
- `TxOBError` explicitly documents that the processor uses the latest (maximum) backoff among handlers: `src/error.ts`.
- `processEvent()` collects backoffs and sets the event-level `backoff_until`: `src/processor.ts`.
- The canonical schema is a single `events` table with `handler_results JSONB` + `backoff_until TIMESTAMPTZ`: `README.md`.
Goals / success criteria
- Per-handler backoff without forcing unrelated handlers to wait.
- Keep at-least-once semantics and existing handler idempotency story.
- Preserve good query performance (indexable “due work” query).
- Prefer additive/backwards-compatible migration paths where possible.
Design options
Option A (recommended): Separate table for handler work items
Introduce a new table that materializes handler fanout and scheduling:
`event_handlers` (or `event_handler_jobs`):
- `event_id` (FK)
- `event_type`
- `handler_name`
- `status` (`pending` | `processed` | `unprocessable`)
- `attempts` (or `errors`)
- `backoff_until TIMESTAMPTZ NULL`
- `processed_at TIMESTAMPTZ NULL`
- `unprocessable_at TIMESTAMPTZ NULL`
- `last_error` (optional) / `error_history` (optional)
- timestamps
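A minimal DDL sketch of the columns above; names and types are illustrative, not final, and it assumes the existing `events` table has a UUID primary key named `id`:

```sql
-- Sketch of the proposed table (names/types illustrative, not final).
-- Assumes the existing events table has a UUID primary key named id.
CREATE TABLE event_handlers (
  event_id         UUID NOT NULL REFERENCES events (id),
  event_type       TEXT NOT NULL,
  handler_name     TEXT NOT NULL,
  status           TEXT NOT NULL DEFAULT 'pending'
                   CHECK (status IN ('pending', 'processed', 'unprocessable')),
  attempts         INT NOT NULL DEFAULT 0,
  backoff_until    TIMESTAMPTZ NULL,
  processed_at     TIMESTAMPTZ NULL,
  unprocessable_at TIMESTAMPTZ NULL,
  last_error       TEXT NULL,
  created_at       TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at       TIMESTAMPTZ NOT NULL DEFAULT now(),
  PRIMARY KEY (event_id, handler_name)
);
```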
Processing model:
- Poll/query due handler rows:
  - `processed_at IS NULL`
  - `unprocessable_at IS NULL`
  - `backoff_until IS NULL OR backoff_until < now()`
  - `attempts < maxAttempts(handler)`
- Lock a row with `FOR UPDATE SKIP LOCKED` and execute one handler.
- Update only that handler row (its backoff, attempts, status).
- Mark the parent `events.processed_at` when all handler rows are done (processed or unprocessable), OR when a policy says to stop.
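The poll/lock steps above could be one claim query, sketched here; the `attempts < 5` literal stands in for per-handler `maxAttempts`, which would more likely be enforced in application code or a stored column:

```sql
-- Sketch: claim one due handler row; skips rows locked by other workers.
-- attempts < 5 stands in for a per-handler maxAttempts, which would more
-- likely be enforced in application code (or stored as a column).
SELECT event_id, handler_name
FROM event_handlers
WHERE processed_at IS NULL
  AND unprocessable_at IS NULL
  AND (backoff_until IS NULL OR backoff_until < now())
  AND attempts < 5
ORDER BY backoff_until NULLS FIRST
LIMIT 1
FOR UPDATE SKIP LOCKED;
```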
Indexes:
- `(processed_at, unprocessable_at, backoff_until, attempts)`, with a partial index where `processed_at IS NULL AND unprocessable_at IS NULL`.
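As a sketch, the partial index could look like this (index name and table name follow the Option A naming, which is not final):

```sql
-- Partial index matching the due-work predicate; only unfinished rows are indexed.
CREATE INDEX event_handlers_due_idx
  ON event_handlers (backoff_until, attempts)
  WHERE processed_at IS NULL AND unprocessable_at IS NULL;
```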
Pros:
- True per-handler scheduling/backoff with an index-friendly due-work query.
- Clean foundation for future features (priorities, per-handler concurrency, dead-lettering per handler).
Cons:
- Requires schema changes + migration story.
- Requires defining how/when to create handler rows (on enqueue vs on first processing attempt).
Option B: Keep single `events` table, add per-handler scheduling inside `handler_results`
Store `backoff_until`, `attempts`, etc. per handler in the JSONB.
Processing model:
- When an event is locked, run only handlers whose JSONB indicates they are due.
- Compute event-level “next wakeup” as the minimum next handler backoff (so the event row remains queryable by a single timestamp).
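A sketch of the per-handler JSONB shape and the "next wakeup" computation described above; field names are illustrative, not the current `txob` `handler_results` shape:

```typescript
// Hypothetical per-handler scheduling state stored inside handler_results (JSONB).
// Field names are illustrative; they are not the current txob shape.
interface HandlerState {
  processed_at?: string;  // ISO timestamp once the handler succeeded
  backoff_until?: string; // ISO timestamp before which the handler must not run
  attempts: number;
}

type HandlerResults = Record<string, HandlerState>;

// Event-level "next wakeup" = the earliest backoff among handlers still pending,
// so the event row stays queryable by a single timestamp column.
function nextWakeup(results: HandlerResults): Date | null {
  const pending = Object.values(results).filter((h) => !h.processed_at);
  if (pending.length === 0) return null; // all handlers done; nothing to schedule
  const times = pending.map((h) =>
    h.backoff_until ? new Date(h.backoff_until).getTime() : 0 // no backoff → due now
  );
  return new Date(Math.min(...times));
}
```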
Pros:
- No new table or migration; reuses the existing `events` row and JSONB column.
Cons:
- Hard to query/index “events with at least one handler due” without expensive JSONB scans.
- Hard to evolve cleanly; JSON shape becomes part of the storage contract.
Option C: Split by handler into separate events (“fanout events”)
When processing an event, create child events like `UserCreated.sendWelcomeEmail`.
Pros:
- Reuses existing event queueing/backoff.
Cons:
- Amplifies event volume; complicates correlation/observability.
- Harder to treat the original event as “done” only when all children are done.
Open questions
- When to materialize handler rows?
- On insert (requires knowing the handler map at enqueue time) vs on first processing (requires deriving it from the handler map at runtime).
- Potential approach: materialize on first lock of the parent event.
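The materialize-on-first-lock approach could be a single idempotent insert, sketched here; it assumes `events` has `id`/`type` columns, that handler names are passed in as a text array, and that `(event_id, handler_name)` is unique on the new table:

```sql
-- Sketch: idempotently create handler rows on first lock of the parent event.
-- $1 = event id; $2 = handler names derived from the runtime handler map.
-- Assumes events has id/type columns and (event_id, handler_name) is unique.
INSERT INTO event_handlers (event_id, event_type, handler_name)
SELECT e.id, e.type, h.name
FROM events e
CROSS JOIN unnest($2::text[]) AS h(name)
WHERE e.id = $1
ON CONFLICT (event_id, handler_name) DO NOTHING;
```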
- Where do handler-specific configs live?
- API: the `handlerMap` value could become `{ handler, backoff?, maxErrors?, ... }`.
- Back-compat: accept a bare function, as today.
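One possible API shape for the `handlerMap` question above, keeping back-compat with a bare function; this is a sketch — the `backoff`/`maxErrors` names follow this issue, but the exact shape is undecided:

```typescript
// Sketch of a back-compatible handlerMap value: either a bare handler function
// (today's shape) or an object carrying handler-specific policy.
// The backoff/maxErrors names follow this issue; the exact shape is undecided.
type Handler = (event: unknown) => Promise<void>;

interface HandlerEntry {
  handler: Handler;
  backoff?: (attempts: number) => Date; // per-handler backoff strategy
  maxErrors?: number;                   // per-handler retry cap
}

type HandlerMapValue = Handler | HandlerEntry;

// Normalize both shapes so the processor only deals with HandlerEntry.
function toEntry(value: HandlerMapValue): HandlerEntry {
  return typeof value === "function" ? { handler: value } : value;
}
```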
- How do event-level `errors` / `maxErrors` change?
  - With per-handler attempts, `events.errors` may become less meaningful; it could become "processor-level attempts" or be deprecated.
Proposed next steps
- Spike Option A with a minimal Postgres client implementation + migration SQL.
- Decide API shape for handler-specific policy (backoff/maxErrors).
- Add docs for migration from JSONB-only tracking.