feat(active_job): support for tracing#2947

Open: solnic wants to merge 39 commits into master from 2933-active-job-tracing-specs

Conversation

@solnic

@solnic solnic commented May 7, 2026

Collaborator

This branch extends Sentry's ActiveJob instrumentation so distributed tracing works across Resque, DelayedJob, and Sidekiq. Instead of per-adapter integrations, tracing is now driven through a single common ActiveJob extension (Sentry::Rails::ActiveJobExtensions), which hooks serialize / deserialize / perform_now to:

  • emit a producer span on enqueue,
  • propagate trace context (and, with PII enabled, an allowlisted user context) through the job payload under a single _sentry key,
  • record messaging span data on the consumer transaction,
  • isolate the Sentry hub per worker thread.
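A minimal sketch of the payload propagation listed above, written in plain Ruby with neither Rails nor Sentry loaded. `FakeJob`, `TracePropagation`, and `current_trace_headers` are hypothetical stand-ins; only the `_sentry` key, the `sentry-trace` and `baggage` header names, and the idea of wrapping `serialize` / `deserialize` come from this PR.

```ruby
# Hypothetical stand-in for ActiveJob::Base: serialize returns a Hash payload,
# deserialize restores state from it.
class FakeJob
  attr_reader :job_id, :trace_headers

  def initialize(job_id = nil)
    @job_id = job_id
  end

  def serialize
    { "job_id" => job_id }
  end

  def deserialize(payload)
    @job_id = payload["job_id"]
  end
end

# Sketch of the extension. It is prepended so that `super` reaches the
# original serialize / deserialize.
module TracePropagation
  SENTRY_KEY = "_sentry"

  def serialize
    payload = super
    begin
      headers = current_trace_headers
      payload[SENTRY_KEY] = headers if headers
    rescue StandardError
      # A tracing failure must never break enqueueing.
    end
    payload
  end

  def deserialize(payload)
    super
    begin
      @trace_headers = payload[SENTRY_KEY]
    rescue StandardError
      @trace_headers = nil
    end
  end

  private

  # Assumption: the real extension asks the SDK for these values; they are
  # hard-coded here to keep the sketch self-contained.
  def current_trace_headers
    {
      "sentry-trace" => "771a43a4192642f0b136d5159a501700-b0e6f15b45c36b12-1",
      "baggage" => "sentry-trace_id=771a43a4192642f0b136d5159a501700"
    }
  end
end

FakeJob.prepend(TracePropagation)

# Producer side: the payload now carries the headers under "_sentry".
payload = FakeJob.new("job-42").serialize

# Consumer side: the headers are recovered alongside the job's own data.
consumer = FakeJob.new
consumer.deserialize(payload)
```

Keeping everything under one namespaced key means a consumer without the extension can ignore the extra entry, and the rescue blocks keep instrumentation errors away from the job itself.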

Because the behavior lives in one extension, every adapter is verified against the same set of shared AJ specs — adding a new adapter is just a context + a one-line it_behaves_like.

New config

config.rails.active_job_propagate_traces (default true) gates trace propagation through the job payload.
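For illustration, opting out might look like the following initializer. This assumes a standard sentry-rails setup; the file location and the DSN environment variable are assumptions, and only the `active_job_propagate_traces` option name comes from this PR.

```ruby
# config/initializers/sentry.rb (assumed location)
Sentry.init do |config|
  config.dsn = ENV["SENTRY_DSN"]

  # Stop injecting trace headers into serialized job payloads; per this PR,
  # the consumer then starts a new, unconnected transaction.
  config.rails.active_job_propagate_traces = false
end
```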

New shared specs

A spec wires up the shared backend harness, an adapter-specific context, and the shared example groups.

Resque:

RSpec.describe "Sentry + ActiveJob on the resque adapter", type: :job do
  include ActiveSupport::Testing::TimeHelpers
  include_context "active_job backend harness", adapter: :resque
  include_context "resque adapter"

  it_behaves_like "a Sentry-instrumented ActiveJob backend"
  it_behaves_like "an ActiveJob backend that supports distributed tracing"
end

Sidekiq:

RSpec.describe "Sentry + ActiveJob on the sidekiq adapter", type: :job do
  include_context "active_job backend harness", adapter: :sidekiq
  include_context "sidekiq adapter"

  it_behaves_like "a Sentry-instrumented ActiveJob backend"
  it_behaves_like "an ActiveJob backend that supports distributed tracing"
end

The "...supports distributed tracing" shared example is itself a meta-group that pulls in the individual checks, so each adapter gets full coverage for free:

RSpec.shared_examples "an ActiveJob backend that supports distributed tracing" do
  it_behaves_like "an ActiveJob backend that emits a producer span on enqueue"
  it_behaves_like "an ActiveJob backend that propagates trace context through the job payload"
  it_behaves_like "an ActiveJob backend that records messaging span data on the consumer transaction"
  it_behaves_like "an ActiveJob backend that propagates Sentry user context through job payloads"
  it_behaves_like "an ActiveJob backend that isolates Sentry context per worker thread"
end

E2E coverage

This branch adds more E2E specs (spec/features/active_job_tracing_spec.rb) that exercise the full path: the Svelte mini app's "Trigger Job" button fetches POST /jobs on the Rails mini app, the browser SDK propagates sentry-trace + baggage, and we assert that the browser fetch, the controller http.server transaction, the queue.publish producer span, and the queue.active_job consumer transaction all share one trace (consumer parented on the publish span, matching messaging.message.id).
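The linkage being asserted can be sketched in plain Ruby as follows. The hash shapes are hypothetical and heavily simplified (the real spec reads captured envelopes), and `same_trace?` is an illustrative helper, not something from this branch.

```ruby
# Hypothetical, simplified event shapes; real envelopes carry many more fields.
def same_trace?(browser_fetch, server_tx, consumer_tx)
  publish = server_tx[:spans].find { |span| span[:op] == "queue.publish" }
  return false unless publish

  trace_ids = [browser_fetch[:trace_id], server_tx[:trace_id],
               publish[:trace_id], consumer_tx[:trace_id]]

  # One trace end to end, consumer parented on the publish span, and both
  # ends agreeing on the message id.
  trace_ids.uniq.size == 1 &&
    consumer_tx[:parent_span_id] == publish[:span_id] &&
    consumer_tx[:data]["messaging.message.id"] == publish[:data]["messaging.message.id"]
end

trace_id = "771a43a4192642f0b136d5159a501700"
publish_span = { op: "queue.publish", trace_id: trace_id, span_id: "b0e6f15b45c36b12",
                 data: { "messaging.message.id" => "job-42" } }
browser_fetch = { op: "http.client", trace_id: trace_id }
server_tx = { op: "http.server", trace_id: trace_id, spans: [publish_span] }
consumer_tx = { op: "queue.active_job", trace_id: trace_id,
                parent_span_id: "b0e6f15b45c36b12",
                data: { "messaging.message.id" => "job-42" } }

linked = same_trace?(browser_fetch, server_tx, consumer_tx)
unlinked = same_trace?(browser_fetch, server_tx,
                       consumer_tx.merge(parent_span_id: "0000000000000000"))
```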

To support worker-based adapters, spec/apps/rails-mini/worker.rb boots the same Rails + Sentry app in a separate process and runs the job through whichever adapter SENTRY_E2E_ACTIVE_JOB_ADAPTER selects (:async/:inline in-process, or a real worker for :sidekiq, :resque, :delayed_job) — so the same e2e assertions run across adapters via the CI matrix.

Screenshots

Distributed tracing works OOTB now with any Active Job adapter 🎉

[Two screenshots: "Arc 2026-06-18 12 57 37" and "Arc 2026-06-18 12 57 14"]

@solnic solnic changed the title 2933 active job tracing specs feat(tests): active job tracing specs May 8, 2026
@solnic solnic force-pushed the 2933-active-job-tracing-specs branch 3 times, most recently from 6309f45 to 03ad132 Compare May 12, 2026 11:33
@solnic solnic force-pushed the 2933-active-job-tracing-specs branch 11 times, most recently from 826f69d to 1c0f001 Compare May 22, 2026 14:42
@solnic solnic changed the title feat(tests): active job tracing specs feat(active_job): support for tracing Jun 1, 2026
@solnic solnic force-pushed the 2933-active-job-tracing-specs branch 2 times, most recently from c9da666 to d072fd5 Compare June 12, 2026 09:36
@solnic solnic marked this pull request as ready for review June 16, 2026 11:14
@solnic solnic force-pushed the 2933-active-job-tracing-specs branch from d65c086 to 49aa99c Compare June 22, 2026 08:02
@solnic solnic requested review from dingsdax and sl0thentr0py June 22, 2026 09:50

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 332bf25.

Comment thread spec/apps/rails-mini/worker.rb
Comment on lines +108 to +113
rescue StandardError => e
raise if enqueued

log_producer_span_error(e)

run_enqueue.call

Bug: The exception handling in record_producer_span may re-raise Sentry-internal errors that occur after a job is successfully enqueued, potentially crashing the enqueue process.
Severity: LOW

Suggested Fix

Refine the exception handling to be more specific. Instead of a broad raise if enqueued, only re-raise exceptions that occur before the enqueue.call is executed. This can be achieved by moving run_enqueue.call outside the begin/rescue block for Sentry operations or by using a more granular flagging system to distinguish between pre-enqueue and post-enqueue Sentry errors.
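One possible shape for such a fix, sketched in plain Ruby. This is not the code in the branch: `span_wrapper` is a hypothetical stand-in for the Sentry call that wraps the enqueue in a span, and `logger` is any object that responds to `<<`. The idea is to remember which exception came from the enqueue itself, so that only that one is re-raised.

```ruby
# Sketch only: distinguishes a failure of the enqueue itself from a failure of
# the instrumentation around it.
def record_producer_span(span_wrapper, logger, &enqueue)
  enqueued = false
  enqueue_error = nil
  result = nil

  run_enqueue = lambda do
    begin
      result = enqueue.call
      enqueued = true
    rescue StandardError => e
      enqueue_error = e # remember it so the outer rescue can tell it apart
      raise
    end
    result
  end

  begin
    span_wrapper.call(run_enqueue)
  rescue StandardError => e
    raise if e.equal?(enqueue_error) # the job's own failure: propagate

    logger << e.message # instrumentation failure: log and carry on
    # Enqueue without a span if it has not happened yet; never enqueue twice.
    enqueued ? result : run_enqueue.call
  end
end

log = []
calls = 0
enqueue = -> { calls += 1; :enqueued }

# The wrapper fails after the job was enqueued (for example while finishing
# the span): the error is logged and the enqueue result is still returned.
after_result = record_producer_span(->(run) { run.call; raise "finish failed" }, log, &enqueue)
calls_after = calls

# The wrapper fails before the job was enqueued: fall back to a plain enqueue.
before_result = record_producer_span(->(_run) { raise "start failed" }, log, &enqueue)
calls_before = calls
```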

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: sentry-rails/lib/sentry/rails/active_job.rb#L108-L113

Potential issue: In `record_producer_span`, the exception handling logic is too broad.
If an error occurs within Sentry's internal `with_child_span` method *after* the job has
been successfully enqueued (for example, during span finalization), the `rescue` block
will catch it. However, because the `enqueued` flag is already true, the condition
`raise if enqueued` will re-raise the exception. This violates the intended behavior of
preventing Sentry-internal errors from crashing the job enqueue process, potentially
causing the application to fail to enqueue a job that was otherwise successfully
processed.

@solnic solnic force-pushed the 2933-active-job-tracing-specs branch 2 times, most recently from 8ed5c5b to 5360bcf Compare June 25, 2026 14:03
solnic and others added 3 commits June 26, 2026 11:26
Sets messaging.message.id, messaging.destination.name,
messaging.message.retry.count, and messaging.message.receive.latency
on the consumer transaction, mirroring sentry-sidekiq's middleware.

Adds an opt-in shared example that adapters can include to verify
the data fields are populated correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
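
A rough sketch of how those four fields could be derived, in plain Ruby. `JobInfo` and `messaging_data` are hypothetical; the field names come from the commit message above. Two details are assumptions rather than facts about the branch: the retry count is taken as `executions - 1`, and latency is reported in milliseconds.

```ruby
require "time"

# Hypothetical stand-in for the bits of an ActiveJob instance used here.
JobInfo = Struct.new(:job_id, :queue_name, :executions, :enqueued_at, keyword_init: true)

def messaging_data(job, now: Time.now.utc)
  data = {
    "messaging.message.id" => job.job_id,
    "messaging.destination.name" => job.queue_name,
    # Assumption: executions counts the current run, so the first run is retry 0.
    "messaging.message.retry.count" => [job.executions.to_i - 1, 0].max
  }

  if job.enqueued_at
    enqueued = Time.iso8601(job.enqueued_at)
    # Assumption: latency is expressed in milliseconds.
    data["messaging.message.receive.latency"] = ((now - enqueued) * 1000).round
  end

  data
end

job = JobInfo.new(job_id: "job-42", queue_name: "default", executions: 1,
                  enqueued_at: "2026-06-26T11:26:00Z")
data = messaging_data(job, now: Time.utc(2026, 6, 26, 11, 26, 2))
```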
Wraps ActiveJob enqueue with a `queue.publish` child span when an
active parent transaction exists, mirroring sentry-sidekiq's client
middleware. Uses the public `around_enqueue` callback so no new
ActiveJob monkey-patching is introduced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the OpenTelemetry pattern (the only documented way to add
metadata to an ActiveJob payload — Rails has no public extension hook
for serialize/deserialize): prepends the existing ActiveJobExtensions
module with serialize/deserialize overrides that inject and recover
sentry-trace and baggage headers under a namespaced "_sentry" key,
wrapped in rescue blocks so a Sentry bug never breaks job execution.

Threads the deserialized headers into SentryReporter.record, which now
uses Sentry.continue_trace when present so the consumer transaction
shares the producer's trace_id and chains under the producer
queue.publish span.

Guards the around_enqueue producer-span registration against duplicate
registration (each Test::Application.define re-runs the railtie and
without idempotency this stacks dozens of nested queue.publish spans).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
solnic and others added 29 commits June 26, 2026 11:26
Set messaging.message.retry.count unconditionally, derived from
job.executions, to mirror sentry-sidekiq which pulls the count
directly from the job hash. The previous gate of rescue_handlers.any?
was imprecise: that list also includes plain rescue_from declarations,
which do not trigger retries. A job declaring only rescue_from would
still emit retry.count, while the absence of any handler would suppress
it even though executions is still a meaningful signal. Removing the
gate makes the attribute consistent and unambiguous across adapters.

Co-Authored-By: github-copilot <noreply@example.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mirrors Sidekiq's propagate_traces flag. When set to false, trace
propagation headers are not injected into the serialized job payload
and the consumer starts a new unconnected transaction.

Defaults to true (existing behaviour is preserved).

Co-Authored-By: github-copilot <noreply@example.com>
Mirror Sidekiq's scope enrichment: set a 'queue' scope tag and an
'active_job' context block (job_class, job_id, queue, provider_job_id)
on every event captured within the consumer scope, including the
transaction and any captured errors.

Co-Authored-By: github-copilot <noreply@example.com>
Co-Authored-By: github-copilot <noreply@example.com>
This correctly handles all execution modes:
- Dedicated async workers (new thread, nil hub): clone -> restore nil
- Inline inside a Rack request (rack hub on thread): clone -> restore
  rack hub so the HTTP response completes normally
- Thread-pool workers (recycled thread, stale hub): clone -> restore
  stale hub (irrelevant; next job will clone again)

Co-Authored-By: github-copilot <noreply@example.com>
Co-Authored-By: github-copilot <noreply@example.com>
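
The clone-then-restore pattern from the list above can be sketched without the SDK. `FakeHub`, `HUB_KEY`, `MAIN_HUB`, and `with_isolated_hub` are hypothetical stand-ins for the SDK's per-thread hub handling; the sketch only shows that whatever hub was on the thread before the job (including none) is put back afterwards.

```ruby
# Hypothetical stand-ins for the SDK's hub and its thread-local slot.
HUB_KEY = :sketch_sentry_hub

FakeHub = Struct.new(:tags) do
  def clone_hub
    FakeHub.new(tags.dup)
  end
end

MAIN_HUB = FakeHub.new({})

def with_isolated_hub
  previous = Thread.current.thread_variable_get(HUB_KEY)
  # The job gets its own copy, so scope changes cannot leak out of it.
  Thread.current.thread_variable_set(HUB_KEY, (previous || MAIN_HUB).clone_hub)
  yield Thread.current.thread_variable_get(HUB_KEY)
ensure
  # Restore exactly what was there before: nil, a rack hub, or a stale hub.
  Thread.current.thread_variable_set(HUB_KEY, previous)
end

# Inline inside a request: the rack hub is back in place after the job.
rack_hub = FakeHub.new({ request: "GET /" })
Thread.current.thread_variable_set(HUB_KEY, rack_hub)
with_isolated_hub { |hub| hub.tags[:queue] = "default" }
restored = Thread.current.thread_variable_get(HUB_KEY)

# Dedicated worker thread: no hub before the job, none afterwards.
worker_hub_after = Thread.new do
  with_isolated_hub { |hub| hub.tags[:queue] = "mailers" }
  Thread.current.thread_variable_get(HUB_KEY)
end.value
```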
The harness embedded :test-adapter specifics — the Rails 5.2 payload-
preservation shim, the drain loop, and the enqueued-payload accessor.
It also reached past ActiveJob::TestHelper to set
ActiveJob::Base.queue_adapter directly, which conflicts with TestHelper's
own _test_adapter slot (TestHelper's before_setup runs outside our around
hook, so any direct assignment is silently shadowed).

Switch the harness to ActiveJob's official queue_adapter_for_test hook
and a small set of abstract methods (queue_adapter_for_test,
with_adapter_active, drain, last_enqueued_payload, boot_adapter,
reset_adapter) that adapter contexts implement. The :test-adapter
shared context now owns everything specific to TestAdapter — including
the Rails52FullPayloadTestAdapter shim and the drain loop. Subsequent
adapter backends (e.g. Sidekiq) can compose with the harness without
fighting it.

Generalises the one shared-example line that reached into the
TestAdapter shape (trace_propagation) via last_enqueued_payload.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…adapter

Runs the common ActiveJob spec suite end-to-end against
ActiveJob::QueueAdapters::SidekiqAdapter, driven by
Sidekiq::Testing.fake! (block form, public API) and Sidekiq::Job.drain_all.
Validates that the AJ tracing extension works as a generic, adapter-
agnostic instrumentation — independent of sentry-sidekiq's native
middleware.

The :sidekiq context plugs into the harness via queue_adapter_for_test
(installing a SidekiqAdapter instance through ActiveJob::TestHelper) and
with_adapter_active (wrapping example.run in Sidekiq::Testing.fake! so
fake mode is scoped per-example without touching global state).

The context deliberately does not load sentry-sidekiq: loading it would
install Sidekiq's client/server middleware globally and register
SidekiqAdapter in skippable_job_adapters, both of which would
short-circuit the AJ extension we're exercising.

Sidekiq becomes a sentry-rails dev dependency, gated on Rails version
(Sidekiq 7+ doesn't support Rails 5.2). The spec file and support file
no-op cleanly on older matrices where the gem isn't bundled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drives the svelte-mini app to click a new "Trigger Job" button, which
fetches POST /jobs/sample on the rails-mini app. The browser SDK
propagates sentry-trace + baggage to the Rails request; the AJ
extension this branch adds emits a queue.publish span on the
http.server transaction at enqueue, and a queue.active_job consumer
transaction when the :async pool runs the job. The spec asserts all
three rails-side artifacts share one trace and are correctly linked
(sentry-trace header on the controller request, parent_span_id on the
consumer transaction, and matching messaging.* data on the producer
and consumer ends).

Polls the shared envelope log because :async runs the job on a
separate thread, so the HTTP response returns before the consumer
transaction is recorded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The harness was calling make_basic_app in its around-each block,
which creates a fresh Rails::Application subclass and runs every
initializer on each example. With 98 AJ examples that overhead dwarfed
the actual test work — and worse, it left behind state (Sidekiq's
@config_blocks list, accumulated routes, lingering Rails::Application
subclasses) that made each subsequent make_basic_app a little
slower. Under Ruby 3.4 + Rails 8.1.3 the per-example time grew 3× over
the run, pushing the full sentry-rails CI past the 15-min timeout.

Hoist make_basic_app to before(:all) and replicate the per-example
bits of Sentry::Rails::Railtie's after_initialize hook in the around
block — re-init Sentry, re-activate tracing / structured logging,
re-register the AJ event handlers. The one-time extensions
(controller methods, streaming reporter, backtrace cleanup, etc.) were
already installed by the initial make_basic_app and persist for the
group.

Also memoize the SidekiqAdapter instance in the :sidekiq context.
Each SidekiqAdapter.new appended to Sidekiq's internal @config_blocks
list and added an on(:quiet) callback; creating a fresh adapter per
example was unnecessary global churn.

Result: spec/active_job goes from 33s → 0.8s, and the full
sentry-rails spec task (Ruby 3.4 + Rails 8.1.3) goes from 9:22 to
2:31 — well under the CI limit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Scope reads allowed user keys as symbols, so we gotta symbolize users that
we receive as plain data from the job payload.
@solnic solnic force-pushed the 2933-active-job-tracing-specs branch from 5360bcf to 036a44d Compare June 26, 2026 09:30