Skip to content

Use event instead of termination callback#40767

Open
kvega005 wants to merge 10 commits into
microsoft:masterfrom
kvega005:terminateEvent
Open

Use event instead of termination callback#40767
kvega005 wants to merge 10 commits into
microsoft:masterfrom
kvega005:terminateEvent

Conversation

@kvega005

Copy link
Copy Markdown
Contributor

Summary of the Pull Request

Reworks how WSLC session termination is surfaced to clients. The previous design used a push-style ITerminationCallback that could only be subscribed at session-creation time (via WSLCSessionSettings), so it was unavailable when opening an existing session. This replaces it with a pull model that mirrors the process-exit-event pattern: a one-off "terminated" event plus a cached termination reason that any holder of an IWSLCSession can query — whether the session was created or opened.

PR Checklist

  • Closes: Link to issue #xxx
  • Communication: I've discussed this with core contributors already.
  • Tests: Added/updated if needed and all pass
  • Localization: All end user facing strings can be localized (no new user-facing strings)
  • Dev docs: Added/updated if needed
  • Documentation updated:

Detailed Description of the Pull Request / Additional comments

Two distinct lifecycle events on the session:

  • m_sessionTerminatingEvent (existing) — signaled at the start of teardown to cancel in-flight operations.
  • m_sessionTerminatedEvent (new) — signaled once teardown is complete ("the session is done"), via either an explicit Terminate() or an unexpected VM exit. This is the event handed to clients.

Runtime (service / per-user session):

  • IWSLCSession gains GetTerminationEvent (returns a duplicated, caller-owned handle to the terminated event that stays valid after the session is released) and GetTerminationReason (returns the cached reason + details; fails with ERROR_INVALID_STATE until the session has actually terminated, so callers can distinguish "still running" from "terminated for an unknown reason").
  • HcsVirtualMachine (service) computes the termination reason from the HCS exit status in OnExit and caches it; IWSLCVirtualMachine gains GetTerminationReason.
  • WSLCSession::Terminate() captures the reason once: it defaults to Shutdown for an explicit terminate (VM still alive = we shut it down), and pulls the real crash/shutdown reason from the VM when the VM exited on its own. Capture happens under the exclusive lock before the VM reference is released, and the terminated event is signaled last so any observer sees a fully torn-down session.
  • m_terminated bool removed — m_sessionTerminatedEvent.is_signaled() is now the single source of truth for GetState/GetTerminationReason.

SDK (wslcsdk):

  • Removed WslcSetSessionSettingsTerminationCallback and the ITerminationCallback/TerminationCallback plumbing (files, CMake, .def, WiX proxy/stub registration).
  • Added WslcGetSessionTerminationEvent and WslcGetSessionTerminationReason.
  • WslcSessionOptionsInternal shrank (removed callback + context fields); opaque WSLC_SESSION_OPTIONS_SIZE updated accordingly.

IDL: removed the ITerminationCallback interface and the TerminationCallback field from WSLCSessionSettings; added the new methods to IWSLCSession and IWSLCVirtualMachine.

Validation Steps Performed

  • Updated runtime tests: WSLCTests.cpp::TerminationEvent waits on IWSLCSession::GetTerminationEvent, verifies GetTerminationReason returns ERROR_INVALID_STATE before termination and Shutdown after an explicit Terminate().
  • Updated SDK tests: WslcSdkTests.cpp::TerminationEventViaTerminate and TerminationEventViaRelease wait on the SDK-returned event (including verifying the handle stays valid after the session is released) and check the reason via WslcGetSessionTerminationReason.

Copilot AI review requested due to automatic review settings June 10, 2026 23:00

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reworks how WSLC session termination is exposed to consumers by replacing the session-creation-time push callback (ITerminationCallback) with a pull model: a one-off termination event plus a cached termination reason that can be queried by any holder of an IWSLCSession (including when opening existing sessions).

Changes:

  • Adds termination event + termination reason APIs to the runtime/session and service VM layers, and updates session state tracking to use the terminated event.
  • Updates the public SDK surface to expose WslcGetSessionTerminationEvent / WslcGetSessionTerminationReason, and removes termination-callback plumbing (code, exports, and MSI proxy/stub registration).
  • Updates WSLC runtime + SDK tests to validate event signaling, handle lifetime, and reason availability semantics.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
test/windows/WSLCTests.cpp Replaces callback-based termination test with event + reason polling/verification.
test/windows/WslcSdkTests.cpp Updates SDK tests to wait on the returned termination event and query termination reason.
src/windows/wslcsession/WSLCVirtualMachine.h Adds a helper to fetch cached termination reason/details from the underlying VM.
src/windows/wslcsession/WSLCSession.h Adds terminated event + cached termination reason/details; removes m_terminated.
src/windows/wslcsession/WSLCSession.cpp Implements GetTerminationEvent/GetTerminationReason, populates cached reason during teardown, and signals terminated event at completion.
src/windows/WslcSDK/WslcsdkPrivate.h Shrinks internal session options; removes stored termination callback COM pointer.
src/windows/WslcSDK/wslcsdk.h Updates public SDK API: removes termination callback API, adds termination event + reason APIs, adjusts opaque options size.
src/windows/WslcSDK/wslcsdk.def Updates exports to remove callback setter and add termination event/reason getters.
src/windows/WslcSDK/wslcsdk.cpp Removes callback wiring during session creation; implements WslcGetSessionTerminationEvent and WslcGetSessionTerminationReason.
src/windows/WslcSDK/TerminationCallback.h Removed (callback mechanism deleted).
src/windows/WslcSDK/TerminationCallback.cpp Removed (callback mechanism deleted).
src/windows/WslcSDK/CMakeLists.txt Removes TerminationCallback sources/headers from the SDK build.
src/windows/service/inc/wslc.idl Removes ITerminationCallback and adds new termination event/reason methods to session + VM interfaces.
src/windows/service/exe/HcsVirtualMachine.h Adds cached termination reason/details members and declares GetTerminationReason.
src/windows/service/exe/HcsVirtualMachine.cpp Caches termination reason/details on exit before signaling the exit event; implements GetTerminationReason.
msipackage/package.wix.in Removes COM registration for ITerminationCallback interface.

Comment thread src/windows/wslcsession/WSLCSession.cpp
Comment thread src/windows/wslcsession/WSLCSession.cpp
Comment thread src/windows/service/inc/wslc.idl
Comment thread src/windows/service/inc/wslc.idl
Copilot AI review requested due to automatic review settings June 10, 2026 23:43

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.

Comment thread src/windows/service/inc/wslc.idl
Copilot AI review requested due to automatic review settings June 11, 2026 18:55

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.

Comment thread src/windows/service/inc/wslc.idl
Comment thread src/windows/service/inc/wslc.idl
Comment thread src/windows/service/inc/wslc.idl
@kvega005 kvega005 marked this pull request as ready for review June 11, 2026 21:02
@kvega005 kvega005 requested a review from a team as a code owner June 11, 2026 21:02

@OneBlue OneBlue left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there are some build errors, but overall design LGTM

Comment thread src/windows/WslcSDK/winrt/Session.cpp Outdated
// Stop watching the termination handle and drain any in-flight callback before the rest of the
// session is torn down. Releasing the session handle below can signal the termination event, so the
// wait must be cancelled first to avoid the callback running against a half-destructed object.
if (m_terminationWait)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we explicitly need to do this since m_terminationWait will do that for us and it always gets destroyed before the session gets released

Comment thread src/windows/wslcsession/WSLCSession.cpp Outdated
*Event = nullptr;

// Duplicate the "terminated" event. The caller owns the returned handle, which stays valid even after the session is released.
*Event = wsl::windows::common::wslutil::DuplicateHandle(m_sessionTerminatedEvent.get());

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend limiting the handle's permission to "SYNCHRONIZE" here, so a rogue caller can't signal the termination event

// Returns the cached termination reason and details. The values are only meaningful
// after the termination event has been signaled; before that the reason is
// WSLCVirtualMachineTerminationReasonUnknown and Details is an empty string.
HRESULT GetTerminationReason([out] WSLCVirtualMachineTerminationReason* Reason, [out] LPWSTR* Details);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long term I think we should parse the Details into fields so the caller doesn't have to deal with HCS's json format.

But I think this is fine for now, since this is already a major improvement on the previous design

Copilot AI review requested due to automatic review settings June 11, 2026 23:05

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.

Comment on lines +785 to +789
// Returns a one-off event that is signaled when the session terminates, whether due to an
// explicit Terminate() call or an unexpected VM exit. The returned handle is owned by the
// caller and remains valid (and observes the signaled state) even after the session is released.
HRESULT GetTerminationEvent([out, system_handle(sh_event)] HANDLE* Event);

Comment on lines +790 to +792
// Returns the cached termination reason and details. The values are only meaningful after the
// termination event has been signaled; before that the reason is
// WSLCVirtualMachineTerminationReasonUnknown and Details is an empty string.
Comment on lines 571 to 576
ULONG MemoryMb;
ULONG BootTimeoutMs;
WSLCNetworkingMode NetworkingMode;
[unique] ITerminationCallback* TerminationCallback;
WSLCFeatureFlags FeatureFlags;
WSLCHandle DmesgOutput;
WSLCSessionStorageFlags StorageFlags;
benhillis pushed a commit that referenced this pull request Jun 12, 2026
PR #40767 (terminateEvent) removed ITerminationCallback / WSLCSessionSettings.TerminationCallback from the IDL but left the WSLCVirtualMachineFactory (introduced later by factory PR #40770, which its branch merged from master) still referencing them, so the branch did not compile. Drop the dead member and assignments; the VM now caches the reason for WSLCSession to pull.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
benhillis pushed a commit that referenced this pull request Jun 12, 2026
Kevin's PR #40767 (event-model termination) moved m_vmExitEvent.SetEvent()
in OnExit() to after a new std::lock_guard(m_lock) that caches the
termination reason. ~HcsVirtualMachine already holds m_lock across the 5s
exit-event wait and HcsCloseComputeSystem (which drains in-flight HCS
callbacks). An in-flight OnExit() therefore blocks acquiring m_lock, so it
never signals the exit event nor drains, and the close never completes:
a hard deadlock that StuckVmTermination reliably reproduces.

Drop the broad lock from the dtor. By the time the compute system is
closed no further callbacks can run, so the remaining teardown is safe
unguarded. Flag to Kevin to fold into #40767.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants