feat(transport): add HTTP retry with exponential backoff#1520
feat(transport): add HTTP retry with exponential backoff#1520
Conversation
|
b083a57 to
a264f66
Compare
|
@sentry review |
|
@cursor review |
|
@sentry review |
|
@cursor review |
|
@cursor review |
|
@cursor review |
243c880 to
d6aa792
Compare
|
@cursor review |
fbbffb2 to
abd5815
Compare
|
@cursor review |
f030cf9 to
22e3fc4
Compare
|
@sentry review |
|
@cursor review |
22e3fc4 to
ffce486
Compare
|
@cursor review |
…nd_task The local `HINTERNET request = client->request` snapshot was only needed for the cancel_client approach. Since shutdown_client only fires at the timeout point, mid-function reads of client->request are safe. Keep only the InterlockedExchangePointer in the exit block to prevent double-close. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Move sentry__atomic_store(&bgw->running, 0) from before the on_timeout callback to the else branch (detach path). This lets the worker's shutdown_task set running=0 naturally after finishing in-flight work, making the dump_queue safety-net reachable if the callback fails to unblock the worker within another 250ms cycle. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
…iness Tests that directly call sentry__retry_send were racing with the transport's background retry worker polling the same cache directory. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
…e data loss After retry_enqueue writes an envelope and stores its address in sealed_envelope, the envelope is freed when the bgworker task completes. If a subsequent envelope is allocated at the same address and is still pending during shutdown, retry_dump_cb would incorrectly skip it, losing the envelope. Clear sealed_envelope after the first match so later iterations cannot false-match a reused address. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Move cache_keep logic into http_send_task so envelopes are cached only when sending actually fails, rather than unconditionally in process_old_runs. For non-HTTP transports, process_old_runs still handles caching since there is no http_send_task. - Add cache_keep/run fields to http_transport_state_t - Cache in http_send_task when send fails and retry is unavailable - Simplify can_cache in process_old_runs to check transport capability - Fix NULL transport dereference in sentry__transport_flush - Update sentry_options_set_cache_keep/http_retry docs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Move envelope caching from process_old_runs (synchronous, before send) into http_send_task (on HTTP failure). This ensures envelopes are only cached when the send actually fails, rather than unconditionally. Add transport cleanup_func to submit cache cleanup as a FIFO task on the bg worker, guaranteeing it runs after pending send tasks from process_old_runs. Add explicit flush argument to sentry_example for deterministic test behavior across platforms with varying HTTP timeout characteristics. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Its only production use was the can_cache check in process_old_runs, which was removed in the previous commit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
When bgworker_shutdown_cb times out, the thread is detached and sentry_close frees options while the cleanup task may still be queued. Incref options when submitting and use sentry_options_free as the task cleanup function to ensure safe lifetime. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Add HTTP retry with exponential backoff for network failures, modeled after Crashpad's upload retry behavior.
Note
Adds an explicit 15s connect timeout to both curl and WinHTTP, matching the value used by Crashpad. This reduces the time the transport worker is blocked on unreachable hosts, previously ~75s on curl (OS TCP SYN retry limit) and 60s on WinHTTP.
Failed envelopes are stored as
<db>/cache/<ts>-<n>-<uuid>.envelopeand retried on startup after a 100ms throttle, and then with exponential backoff (15min, 30min, 1h, 2h, 4h, 8h). When retries are exhausted, and offline caching is enabled, envelopes are stored as<db>/cache/<uuid>.envelopeinstead of being discarded.flowchart TD startup --> sretry{retry?} sretry -->|yes| throttle throttle -. 100ms .-> send send -->|success| discard send -->|fail| retry{retry?} retry -->|no| cache{cache?} cache -->|no| discard cache -->|yes| cache-dump[<db>/cache/<br/><uuid>.envelope] retry -->|yes| cache-retry[<db>/cache/<br/><ts>-<n>-<uuid>.envelope] cache-retry --> backoff backoff -. 2ⁿ×15min .-> sendBuilds upon:
See also: