fix(pool): tolerate closing-connection race in checkout (#850) #852

Merged
benoitc merged 1 commit into master from fix/850-pool-set-owner-race on May 25, 2026
@benoitc benoitc commented May 25, 2026

Fixes #850.

This fixes an intermittent pool GenServer crash that occurs when a server closes a pooled keep-alive connection during checkout. find_available returns a connection once is_ready replies {ok, connected}; checkout then calls set_owner as a separate gen_statem call. If a tcp_closed is processed between the two calls, set_owner replies {error, invalid_state} during the closed grace window. That reply broke the hard ok = match and crashed the pool. The pool restarts, so the practical impact is a failed checkout plus log noise.

Checkout now handles {error, _} from set_owner and falls through to a fresh connection. The async checkin/prewarm path (set_owner_async) silently had the same race, which left an already-closed connection briefly in the pool's available map; a pooled connection that has already closed now stops on the cast, so the pool's monitor drops it. The spec for set_owner/2 is corrected to ok | {error, invalid_state}.
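
The checkout change can be pictured with a minimal sketch. This is not the code from the commit: the module name `checkout_sketch`, the function `checkout/3`, and both fun arguments are hypothetical. `SetOwner` stands in for a call such as `hackney_conn:set_owner(Pid, Requester)`, and `StartFresh` for whatever the pool does to open a new connection.

```erlang
-module(checkout_sketch).
-export([checkout/3]).

%% Hypothetical helper, not hackney's actual checkout code.
%% SetOwner stands in for hackney_conn:set_owner(Pid, Requester);
%% StartFresh is an assumed fun that opens a new connection.
-spec checkout(pid(),
               fun((pid()) -> ok | {error, term()}),
               fun(() -> {ok, pid()} | {error, term()})) ->
          {ok, pid()} | {error, term()}.
checkout(Pid, SetOwner, StartFresh) ->
    case SetOwner(Pid) of
        ok ->
            %% Ownership transferred: hand out the pooled connection.
            {ok, Pid};
        {error, _Reason} ->
            %% The connection closed between is_ready and set_owner.
            %% Fall through to a fresh connection instead of crashing
            %% on a hard `ok = ...` match.
            StartFresh()
    end.
```

The point of the `case` is that a stale pooled connection becomes an ordinary branch rather than a badmatch in the pool process.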

Thanks to @ashutoshrishi for the report and root-cause analysis.

A checkout that races a server-side close crashed the pool GenServer:
find_available calls hackney_conn:is_ready and returns {ok, connected},
then the checkout did ok = hackney_conn:set_owner(Pid, Requester). The
two are separate gen_statem calls, so a tcp_closed can be processed in
between, leaving set_owner to reply {error, invalid_state} during the
closed grace window and failing the hard match.

Checkout now handles {error, _} from set_owner and starts a fresh
connection. The async checkin/prewarm path (set_owner_async) had the
same race silently: a pooled connection that already closed now stops on
the cast so the pool's monitor drops it instead of handing it out. Spec
for set_owner/2 corrected to ok | {error, invalid_state}.
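
The async half of the fix, where a closed connection stops on the cast so that a monitor sees it go down, might look roughly like the toy gen_statem below. Everything here is an assumption made for illustration: the module `conn_sketch`, its state names, the `simulate_tcp_closed` event, and the data layout are not hackney's, and the real `hackney_conn` state machine is more involved.

```erlang
-module(conn_sketch).
-behaviour(gen_statem).

-export([start/0, set_owner_async/2, close/1]).
-export([init/1, callback_mode/0, handle_event/4]).

%% Hypothetical API, loosely modelled on the behaviour described above.
start() -> gen_statem:start(?MODULE, [], []).
set_owner_async(Conn, Owner) -> gen_statem:cast(Conn, {set_owner, Owner}).
close(Conn) -> gen_statem:cast(Conn, simulate_tcp_closed).

callback_mode() -> handle_event_function.

init([]) -> {ok, connected, #{owner => undefined}}.

%% Simulates the server closing the socket while the connection is pooled.
handle_event(cast, simulate_tcp_closed, connected, Data) ->
    {next_state, closed, Data};
%% Normal case: record the new owner.
handle_event(cast, {set_owner, Owner}, connected, Data) ->
    {keep_state, Data#{owner := Owner}};
%% Race case: the connection already closed, so stop; any process
%% monitoring this one receives 'DOWN' and can drop its reference.
handle_event(cast, {set_owner, _Owner}, closed, _Data) ->
    {stop, normal};
handle_event(_Type, _Event, _State, _Data) ->
    keep_state_and_data.
```

A process that calls `erlang:monitor(process, Conn)`, then `conn_sketch:close(Conn)` and `conn_sketch:set_owner_async(Conn, self())`, should receive a `'DOWN'` message with reason `normal`, which is the signal a pool could use to remove the entry.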
@benoitc benoitc merged commit b955a7e into master May 25, 2026
9 of 10 checks passed
@benoitc benoitc deleted the fix/850-pool-set-owner-race branch May 25, 2026 16:22
Development

Successfully merging this pull request may close these issues.

Intermittent Pool GenServer crash on hackney_conn:set_owner/2 failing to match {error, invalid_state} if the connection is closed