Open
Description
Problem
`isRetryableNetworkError` treats every network error as retryable, so it cannot distinguish a peer that is "not running now" from one that "will never be running again" or one at the "wrong address". As a result, retries are wasted on permanently unreachable peers.
Expected Behavior
- Track error patterns per peer over time (error codes, frequency, success rate)
- Classify persistent failures as permanently non-retryable after threshold
- Stop retrying when pattern indicates permanent failure (wrong address, dead peer)
- Continue retrying for transient failures (temporary network issues)
Implementation
Files to Modify
| File | Changes |
|---|---|
| `platform/reconnection.ts` | Add error history tracking to `ReconnectionManager` |
| `platform/reconnection-lifecycle.ts` | Check for permanent failure before attempting reconnection |
| `@metamask/kernel-errors` | Update `isRetryableNetworkError` or add `isPermanentlyFailed` check |
Approach
- Add error tracking to `ReconnectionManager` (`platform/reconnection.ts`)
  - Track error history per peer (error codes, timestamps)
  - Track consecutive identical errors
  - Track success rate over time window
- Implement heuristics for permanent failure detection
  - Same error code N times consecutively without success = permanent
  - Specific error patterns: persistent `ECONNREFUSED`, `EHOSTUNREACH`, DNS failures
  - Configurable thresholds
- Add permanent failure state
  - New state in `ReconnectionManager`: `isPermanentlyFailed(peerId)`
  - Clear permanent failure on explicit reconnect request
- Integrate with reconnection lifecycle (`platform/reconnection-lifecycle.ts`)
  - Check `isPermanentlyFailed` before attempting reconnection
  - Call `onRemoteGiveUp` when permanent failure detected
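
The approach above could be sketched roughly as follows. This is a minimal illustration, not the existing `ReconnectionManager` code: `PeerErrorTracker`, `TrackerOptions`, `recordError`, `recordSuccess`, `clearPermanentFailure`, and `shouldAttemptReconnect` are hypothetical names, the threshold of 5 is arbitrary, and the signature of `onRemoteGiveUp` is assumed. It also assumes DNS failures surface with the Node.js code `ENOTFOUND`, and it combines the two heuristics by requiring a known "unreachable" code to repeat N times in a row. Success-rate tracking over a time window is left out for brevity.

```typescript
// Hypothetical sketch: all names, codes, and thresholds below are assumptions.
type ErrorRecord = { code: string; timestamp: number };

type PeerErrorState = {
  history: ErrorRecord[]; // most recent errors, oldest first
  lastCode: string | null; // code of the current consecutive run
  consecutiveCount: number; // length of the current consecutive run
  permanentlyFailed: boolean;
};

type TrackerOptions = {
  consecutiveThreshold: number; // N identical errors in a row
  permanentCodes: ReadonlySet<string>; // codes that can indicate a dead peer
  maxHistory: number; // cap on stored records per peer
};

const DEFAULT_OPTIONS: TrackerOptions = {
  consecutiveThreshold: 5,
  permanentCodes: new Set(['ECONNREFUSED', 'EHOSTUNREACH', 'ENOTFOUND']),
  maxHistory: 50,
};

class PeerErrorTracker {
  private readonly options: TrackerOptions;

  private readonly peers = new Map<string, PeerErrorState>();

  constructor(options: Partial<TrackerOptions> = {}) {
    this.options = { ...DEFAULT_OPTIONS, ...options };
  }

  // Record a failed attempt and re-evaluate the permanent-failure heuristic.
  recordError(peerId: string, code: string, timestamp: number = Date.now()): void {
    const state = this.getState(peerId);
    state.history.push({ code, timestamp });
    if (state.history.length > this.options.maxHistory) {
      state.history.shift();
    }
    state.consecutiveCount = state.lastCode === code ? state.consecutiveCount + 1 : 1;
    state.lastCode = code;
    if (
      this.options.permanentCodes.has(code) &&
      state.consecutiveCount >= this.options.consecutiveThreshold
    ) {
      state.permanentlyFailed = true;
    }
  }

  // A success ends the current run of identical errors.
  recordSuccess(peerId: string): void {
    const state = this.getState(peerId);
    state.lastCode = null;
    state.consecutiveCount = 0;
    state.permanentlyFailed = false;
  }

  isPermanentlyFailed(peerId: string): boolean {
    return this.peers.get(peerId)?.permanentlyFailed ?? false;
  }

  // Called on an explicit reconnect request so the peer gets a fresh start.
  clearPermanentFailure(peerId: string): void {
    this.peers.delete(peerId);
  }

  private getState(peerId: string): PeerErrorState {
    let state = this.peers.get(peerId);
    if (!state) {
      state = { history: [], lastCode: null, consecutiveCount: 0, permanentlyFailed: false };
      this.peers.set(peerId, state);
    }
    return state;
  }
}

// Lifecycle guard: give up instead of retrying once a peer is marked permanent.
function shouldAttemptReconnect(
  tracker: PeerErrorTracker,
  peerId: string,
  onRemoteGiveUp: (peerId: string) => void,
): boolean {
  if (tracker.isPermanentlyFailed(peerId)) {
    onRemoteGiveUp(peerId);
    return false;
  }
  return true;
}
```

In this sketch a transient code such as `ETIMEDOUT` never trips the permanent state regardless of how often it repeats, and any recorded success resets the run, so only an uninterrupted streak of "unreachable" errors stops retries.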
Acceptance Criteria
- Error patterns tracked per peer in `ReconnectionManager`
- Persistent failures classified as permanent after threshold
- Permanent failures stop retry attempts
- Transient failures continue to retry normally
- Permanent failure state can be cleared for manual reconnection
- Unit tests verify pattern detection and permanent failure classification
- E2E test for permanent failure scenario