Skip to content

Event-Driven Architecture Refactoring#227

Open
teith wants to merge 14 commits intomainfrom
event-driven
Open

Event-Driven Architecture Refactoring#227
teith wants to merge 14 commits intomainfrom
event-driven

Conversation

@teith
Copy link
Contributor

@teith teith commented Jan 20, 2026

Problem: SDK fails with 50+ concurrent requests due to O(N) listener dispatch, polling loops starving event loop, and false disconnects from missed heartbeat pongs.

Solution:

  • Replace polling with asyncio.Future for O(1) message routing via _pending_operations HashMap
  • Add semaphore to limit concurrent requests (default 15, env RUNWARE_MAX_CONCURRENT_REQUESTS )
  • Add jitter on reconnect to prevent thundering herd
  • Increase PING_TIMEOUT_DURATION from 10s to 30s for stability
  • Graceful cancellation of pending operations on disconnect

Result: Stable operation at high concurrency, no more false "Connection lost" errors.

@teith teith requested a review from Sirsho1997 January 20, 2026 04:28
@teith teith requested a review from Sirsho1997 January 21, 2026 23:32
teith and others added 2 commits January 22, 2026 03:49
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the SDK’s WebSocket message handling toward a more event-driven model by routing responses to per-request asyncio.Futures, aiming to improve high-concurrency stability and reduce polling/listener overhead.

Changes:

  • Introduces _pending_operations with an OperationState state machine to route inbound messages to waiting Futures.
  • Adds a global request semaphore (MAX_CONCURRENT_REQUESTS) and reconnect jitter/backoff adjustments.
  • Tweaks connection stability settings (e.g., increased ping/pong timeout) and adds disconnect-time cancellation of pending operations.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 8 comments.

File Description
runware/utils.py Increases ping timeout; adds MAX_CONCURRENT_REQUESTS env-configured constant.
runware/types.py Adds OperationState and request wrapper dataclasses used by retry/routing logic.
runware/server.py Routes messages to pending-operation futures and cancels pending operations on close.
runware/base.py Implements pending-operation registry, retry/reconnect behavior updates, and semaphore-based concurrency limiting.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated 10 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Sirsho1997 Sirsho1997 requested a review from Copilot February 24, 2026 03:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Collaborator

@Sirsho1997 Sirsho1997 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
I made some additional changes. Please review these

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants