
Remote comms: Kernel incarnation detection #689

@sirtimid

Problem

When a peer reconnects with the same peer ID but has lost its state (e.g., external peer ID config + DB loss), the kernel currently has no way to detect it. Promises waiting for resolution from that peer will hang forever, since the remote no longer knows about them.

Solution

Add an incarnation ID (UUID), generated at kernel start, that is exchanged via a handshake on each connection. If a peer's incarnation ID has changed on reconnection, reject the promises that were pending on the old incarnation.

Key insight: Kernel state normally persists across restarts. Incarnation detection is for the edge case where peer ID is preserved but state is lost.

Implementation Plan

1. Add Handshake Message Type

File: packages/ocap-kernel/src/remotes/kernel/RemoteHandle.ts

Extend RemoteMessageBase to include:

  • { method: 'handshake'; params: { incarnationId: string } }
  • { method: 'handshakeAck'; params: { incarnationId: string } }
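The two message shapes above could be written as a discriminated union with a runtime guard. This is only a sketch: the type names and `isHandshakeProtocolMessage` are hypothetical, and the actual definition of RemoteMessageBase in RemoteHandle.ts may be structured differently.

```typescript
// Hypothetical shapes mirroring the two bullets above; not the real RemoteMessageBase.
type HandshakeMessage = {
  method: 'handshake';
  params: { incarnationId: string };
};

type HandshakeAckMessage = {
  method: 'handshakeAck';
  params: { incarnationId: string };
};

type HandshakeProtocolMessage = HandshakeMessage | HandshakeAckMessage;

// Narrows an unknown, already-parsed payload to one of the handshake messages.
function isHandshakeProtocolMessage(
  value: unknown,
): value is HandshakeProtocolMessage {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const { method, params } = value as { method?: unknown; params?: unknown };
  if (method !== 'handshake' && method !== 'handshakeAck') {
    return false;
  }
  return (
    typeof params === 'object' &&
    params !== null &&
    typeof (params as { incarnationId?: unknown }).incarnationId === 'string'
  );
}
```

Keeping the guard separate from regular message handling would let the transport intercept handshake traffic before it reaches kernel-level delivery.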

2. Generate Incarnation ID in Kernel

File: packages/ocap-kernel/src/Kernel.ts

  • Generate incarnationId = crypto.randomUUID() in constructor
  • Store in memory only (intentionally NOT persisted)
  • Pass to RemoteManager / transport layer
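A minimal sketch of the in-memory ID, assuming a Node.js runtime (hence the `node:crypto` import; browsers expose the same function as `crypto.randomUUID()`). `KernelSketch` is a hypothetical stand-in, not the real Kernel class.

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical stand-in for Kernel; only the incarnation ID is modelled.
class KernelSketch {
  // Generated once per construction and kept in memory only. It is deliberately
  // never written to the kernel store, so a process that has lost its state
  // cannot come back with the old value.
  readonly incarnationId: string;

  constructor() {
    this.incarnationId = randomUUID();
  }
}
```

Since the value is created in the constructor, every kernel start yields a new ID even when the persisted state survives; the detection on the remote side only matters when that state is actually gone.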

3. Extend PeerStateManager for Incarnation Tracking

File: packages/ocap-kernel/src/remotes/platform/peer-state-manager.ts

Add remoteIncarnationId field to peer state and methods:

  • setRemoteIncarnation(peerId, id) - returns true if incarnation changed
  • getRemoteIncarnation(peerId) - getter
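One way the two methods could behave, as a sketch. `PeerIncarnationTracker` and `PeerStateSketch` are hypothetical names; the real peer state in peer-state-manager.ts holds more fields. The assumption here is that "changed" means a previously known ID was replaced by a different one, so a first sighting returns false.

```typescript
// Hypothetical minimal peer state; the real one carries additional fields.
type PeerStateSketch = { remoteIncarnationId?: string };

class PeerIncarnationTracker {
  private readonly peers = new Map<string, PeerStateSketch>();

  // Stores the ID and returns true only when an already-known incarnation
  // was replaced by a different one. A first sighting returns false.
  setRemoteIncarnation(peerId: string, incarnationId: string): boolean {
    const state = this.peers.get(peerId) ?? {};
    const previous = state.remoteIncarnationId;
    state.remoteIncarnationId = incarnationId;
    this.peers.set(peerId, state);
    return previous !== undefined && previous !== incarnationId;
  }

  // Returns the last stored ID, or undefined if the peer was never seen.
  getRemoteIncarnation(peerId: string): string | undefined {
    return this.peers.get(peerId)?.remoteIncarnationId;
  }
}
```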

4. Handle Handshake in Transport

File: packages/ocap-kernel/src/remotes/platform/transport.ts

  • Add localIncarnationId parameter to initTransport()
  • Handle incoming handshake → store incarnation, reply with handshakeAck
  • Handle incoming handshakeAck → store incarnation
  • On incarnation change → trigger promise rejection via callback
  • Send handshake when connection established (outbound initiates)
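The handling described above might look roughly like the following. Everything here is illustrative: `makeHandshakeHandler`, `send`, and `onIncarnationChange` are hypothetical names, and the real initTransport() signature is not shown in this issue.

```typescript
type HandshakeMsg =
  | { method: 'handshake'; params: { incarnationId: string } }
  | { method: 'handshakeAck'; params: { incarnationId: string } };

// Hypothetical dependencies the transport would supply.
type HandshakeDeps = {
  localIncarnationId: string;
  send: (peerId: string, message: HandshakeMsg) => void;
  onIncarnationChange: (peerId: string) => void;
};

function makeHandshakeHandler(deps: HandshakeDeps) {
  // Last incarnation ID seen per peer.
  const known = new Map<string, string>();

  const record = (peerId: string, incarnationId: string): void => {
    const previous = known.get(peerId);
    known.set(peerId, incarnationId);
    // Fire the callback only when a known incarnation was replaced.
    if (previous !== undefined && previous !== incarnationId) {
      deps.onIncarnationChange(peerId);
    }
  };

  return {
    // Called by the outbound side once the connection is established.
    initiate(peerId: string): void {
      deps.send(peerId, {
        method: 'handshake',
        params: { incarnationId: deps.localIncarnationId },
      });
    },
    // Called for every incoming handshake or handshakeAck.
    receive(peerId: string, message: HandshakeMsg): void {
      record(peerId, message.params.incarnationId);
      if (message.method === 'handshake') {
        deps.send(peerId, {
          method: 'handshakeAck',
          params: { incarnationId: deps.localIncarnationId },
        });
      }
    },
  };
}
```

Only `handshake` triggers a reply, so an ack never produces another ack and the exchange ends after one round trip.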

5. Wire Up Incarnation ID

  • kernel/remote-comms.ts - Pass incarnation ID from Kernel to transport layer
  • kernel/RemoteManager.ts - Reject kernel promises via existing #handleRemoteGiveUp pattern
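The existing #handleRemoteGiveUp code is not reproduced in this issue, so the following is only a generic illustration of rejecting everything pending on one peer. `PendingByPeer`, `track`, and `rejectAll` are hypothetical names, and the normal resolution path is omitted.

```typescript
type PendingEntry = { reject: (reason: Error) => void };

// Hypothetical registry of promises awaiting an answer from a given peer.
class PendingByPeer {
  private readonly pending = new Map<string, Set<PendingEntry>>();

  // Registers a promise that stays pending until the peer's entries are rejected.
  track(peerId: string): Promise<unknown> {
    return new Promise((_resolve, reject) => {
      const entries = this.pending.get(peerId) ?? new Set<PendingEntry>();
      entries.add({ reject });
      this.pending.set(peerId, entries);
    });
  }

  // Rejects every promise tracked for the peer and returns how many there were.
  rejectAll(peerId: string, reason: string): number {
    const entries = this.pending.get(peerId);
    if (entries === undefined) {
      return 0;
    }
    // Remove first so that a repeated call is a no-op.
    this.pending.delete(peerId);
    entries.forEach((entry) => entry.reject(new Error(reason)));
    return entries.size;
  }
}
```

An incarnation-change callback from the transport could then call something like `rejectAll(peerId, 'remote incarnation changed')`, leaving promises for other peers untouched.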

Files to Modify

File                              Changes
Kernel.ts                         Generate incarnation ID
platform/peer-state-manager.ts    Add incarnation tracking to peer state
platform/transport.ts             Handshake protocol, incarnation change detection
kernel/remote-comms.ts            Pass incarnation ID
kernel/RemoteHandle.ts            Add handshake message types
kernel/RemoteManager.ts           Reject kernel promises on incarnation change

Edge Cases

  1. Both sides reconnecting - Outbound initiates handshake, inbound responds
  2. Messages before handshake - Queue until handshake completes
  3. First connection - No previous incarnation, just store (no rejection)
  4. Same incarnation - No rejection, normal operation
  5. Handshake timeout - Treat as connection failure, trigger reconnection
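Edge cases 2 and 5 could be covered by a small gate in front of message delivery, sketched below. `HandshakeGate` is a hypothetical helper, and it assumes the queue sits on the path where messages are handed to the connection; the timeout timer itself is left out, but on expiry the caller would invoke `reset()` and start reconnection.

```typescript
// Hypothetical gate: holds messages until the handshake has completed.
class HandshakeGate<T> {
  private ready = false;
  private readonly queue: T[] = [];
  private readonly deliver: (message: T) => void;

  constructor(deliver: (message: T) => void) {
    this.deliver = deliver;
  }

  // Delivers immediately after the handshake, otherwise queues.
  submit(message: T): void {
    if (this.ready) {
      this.deliver(message);
    } else {
      this.queue.push(message);
    }
  }

  // Marks the handshake as done and flushes queued messages in arrival order.
  complete(): void {
    this.ready = true;
    for (const message of this.queue.splice(0)) {
      this.deliver(message);
    }
  }

  // For a handshake timeout or connection loss: closes the gate again and
  // hands back whatever was still queued so the caller can retry or reject it.
  reset(): T[] {
    this.ready = false;
    return this.queue.splice(0);
  }
}
```

Returning the undelivered messages from `reset()` leaves the retry-or-reject decision to the caller, which fits the plan of rejecting pending messages only when the incarnation actually changed.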

Acceptance Criteria

  • Incarnation ID generated at kernel start
  • Handshake exchanged on connection establishment
  • Incarnation changes detected on reconnection
  • Pending messages rejected on incarnation change
  • Kernel promises rejected via #handleRemoteGiveUp pattern
  • New connections work normally after incarnation change
  • Unit tests for incarnation tracking and handshake
  • E2E test for incarnation change detection
