Skip to content

fix: async setup() and predict() run on different event loops#2927

Merged
michaeldwan merged 1 commit intomainfrom
md/fix-async-event-loops
Apr 10, 2026
Merged

fix: async setup() and predict() run on different event loops#2927
michaeldwan merged 1 commit intomainfrom
md/fix-async-event-loops

Conversation

@michaeldwan
Copy link
Copy Markdown
Member

Async predictors that create event-loop-bound resources in setup() -- httpx.AsyncClient, aiohttp.ClientSession, asyncio.Queue, anything that binds to the running loop -- crash on 0.17 because async setup() ran via asyncio.run() (ephemeral loop, destroyed after setup finishes) while async predict() runs on a separate persistent shared loop via run_coroutine_threadsafe.

The fix submits async setup coroutines to the same shared event loop that predict uses, so both phases share one loop. Sync setup is unchanged. The non-worker path (cog predict CLI) keeps using asyncio.run() since it doesn't persist resources across calls.

# before
same_loop=False item=from-setup

# after
same_loop=True item=from-setup

Fixes #2926

Async predictors that create event-loop-bound resources in setup()
(httpx.AsyncClient, aiohttp.ClientSession, etc.) crash because setup()
ran via asyncio.run() on an ephemeral loop that's destroyed, while
predict() runs on a separate persistent shared loop.

Submit async setup coroutines to the shared event loop via
run_coroutine_threadsafe instead, so both phases share the same loop.
The non-worker path (cog predict CLI) keeps using asyncio.run().

Fixes #2926
@michaeldwan michaeldwan requested a review from a team as a code owner April 9, 2026 23:31
Copy link
Copy Markdown

@ask-bonk ask-bonk bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - the fix correctly ensures async setup() and predict() share the same event loop by passing the shared loop to setup and using run_coroutine_threadsafe. The test verifies event-loop-bound resources created in setup work correctly in predict.

@ask-bonk
Copy link
Copy Markdown

ask-bonk bot commented Apr 9, 2026

I reviewed PR #2927 which fixes a bug where async setup() and predict() ran on different event loops. I read all three changed files (predictor.rs, worker_bridge.rs, test_coglet.py), verified the fix correctly passes the shared event loop to setup for async coroutines, and confirmed the test validates event-loop-bound resources work correctly across both phases. I initially flagged a potential cancellation handling issue but realized setup can't be cancelled via the slot mechanism, then submitted an LGTM review via gh api.

github run

@michaeldwan michaeldwan added this pull request to the merge queue Apr 10, 2026
Merged via the queue into main with commit 61a04a2 Apr 10, 2026
39 checks passed
@michaeldwan michaeldwan deleted the md/fix-async-event-loops branch April 10, 2026 19:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Async predictors crash on 0.17 -- setup() and predict() run on different event loops

2 participants