Use event instead of termination callback#40767
Conversation
There was a problem hiding this comment.
Pull request overview
This PR reworks how WSLC session termination is exposed to consumers by replacing the session-creation-time push callback (ITerminationCallback) with a pull model: a one-off termination event plus a cached termination reason that can be queried by any holder of an IWSLCSession (including when opening existing sessions).
Changes:
- Adds termination event + termination reason APIs to the runtime/session and service VM layers, and updates session state tracking to use the terminated event.
- Updates the public SDK surface to expose
WslcGetSessionTerminationEvent/WslcGetSessionTerminationReason, and removes termination-callback plumbing (code, exports, and MSI proxy/stub registration). - Updates WSLC runtime + SDK tests to validate event signaling, handle lifetime, and reason availability semantics.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| test/windows/WSLCTests.cpp | Replaces callback-based termination test with event + reason polling/verification. |
| test/windows/WslcSdkTests.cpp | Updates SDK tests to wait on the returned termination event and query termination reason. |
| src/windows/wslcsession/WSLCVirtualMachine.h | Adds a helper to fetch cached termination reason/details from the underlying VM. |
| src/windows/wslcsession/WSLCSession.h | Adds terminated event + cached termination reason/details; removes m_terminated. |
| src/windows/wslcsession/WSLCSession.cpp | Implements GetTerminationEvent/GetTerminationReason, populates cached reason during teardown, and signals terminated event at completion. |
| src/windows/WslcSDK/WslcsdkPrivate.h | Shrinks internal session options; removes stored termination callback COM pointer. |
| src/windows/WslcSDK/wslcsdk.h | Updates public SDK API: removes termination callback API, adds termination event + reason APIs, adjusts opaque options size. |
| src/windows/WslcSDK/wslcsdk.def | Updates exports to remove callback setter and add termination event/reason getters. |
| src/windows/WslcSDK/wslcsdk.cpp | Removes callback wiring during session creation; implements WslcGetSessionTerminationEvent and WslcGetSessionTerminationReason. |
| src/windows/WslcSDK/TerminationCallback.h | Removed (callback mechanism deleted). |
| src/windows/WslcSDK/TerminationCallback.cpp | Removed (callback mechanism deleted). |
| src/windows/WslcSDK/CMakeLists.txt | Removes TerminationCallback sources/headers from the SDK build. |
| src/windows/service/inc/wslc.idl | Removes ITerminationCallback and adds new termination event/reason methods to session + VM interfaces. |
| src/windows/service/exe/HcsVirtualMachine.h | Adds cached termination reason/details members and declares GetTerminationReason. |
| src/windows/service/exe/HcsVirtualMachine.cpp | Caches termination reason/details on exit before signaling the exit event; implements GetTerminationReason. |
| msipackage/package.wix.in | Removes COM registration for ITerminationCallback interface. |
OneBlue
left a comment
There was a problem hiding this comment.
Looks like there are some build errors, but overall design LGTM
| // Stop watching the termination handle and drain any in-flight callback before the rest of the | ||
| // session is torn down. Releasing the session handle below can signal the termination event, so the | ||
| // wait must be cancelled first to avoid the callback running against a half-destructed object. | ||
| if (m_terminationWait) |
There was a problem hiding this comment.
I don't think we explicitly need to do this since m_terminationWait will do that for us and it always gets destroyed before the session gets released
| *Event = nullptr; | ||
|
|
||
| // Duplicate the "terminated" event. The caller owns the returned handle, which stays valid even after the session is released. | ||
| *Event = wsl::windows::common::wslutil::DuplicateHandle(m_sessionTerminatedEvent.get()); |
There was a problem hiding this comment.
I recommend limiting the handle's permission to "SYNCHRONIZE" here, so a rogue caller can't signal the termination event
| // Returns the cached termination reason and details. The values are only meaningful | ||
| // after the termination event has been signaled; before that the reason is | ||
| // WSLCVirtualMachineTerminationReasonUnknown and Details is an empty string. | ||
| HRESULT GetTerminationReason([out] WSLCVirtualMachineTerminationReason* Reason, [out] LPWSTR* Details); |
There was a problem hiding this comment.
Long term I think we should parse the Details into fields so the caller doesn't have to deal with HCS's json format.
But I think this is fine for now, since this is already a major improvement on the previous design
| // Returns a one-off event that is signaled when the session terminates, whether due to an | ||
| // explicit Terminate() call or an unexpected VM exit. The returned handle is owned by the | ||
| // caller and remains valid (and observes the signaled state) even after the session is released. | ||
| HRESULT GetTerminationEvent([out, system_handle(sh_event)] HANDLE* Event); | ||
|
|
| // Returns the cached termination reason and details. The values are only meaningful after the | ||
| // termination event has been signaled; before that the reason is | ||
| // WSLCVirtualMachineTerminationReasonUnknown and Details is an empty string. |
| ULONG MemoryMb; | ||
| ULONG BootTimeoutMs; | ||
| WSLCNetworkingMode NetworkingMode; | ||
| [unique] ITerminationCallback* TerminationCallback; | ||
| WSLCFeatureFlags FeatureFlags; | ||
| WSLCHandle DmesgOutput; | ||
| WSLCSessionStorageFlags StorageFlags; |
PR #40767 (terminateEvent) removed ITerminationCallback / WSLCSessionSettings.TerminationCallback from the IDL but left the WSLCVirtualMachineFactory (introduced later by factory PR #40770, which its branch merged from master) still referencing them, so the branch did not compile. Drop the dead member and assignments; the VM now caches the reason for WSLCSession to pull. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Kevin's PR #40767 (event-model termination) moved m_vmExitEvent.SetEvent() in OnExit() to after a new std::lock_guard(m_lock) that caches the termination reason. ~HcsVirtualMachine already holds m_lock across the 5s exit-event wait and HcsCloseComputeSystem (which drains in-flight HCS callbacks). An in-flight OnExit() therefore blocks acquiring m_lock, so it never signals the exit event nor drains, and the close never completes: a hard deadlock that StuckVmTermination reliably reproduces. Drop the broad lock from the dtor. By the time the compute system is closed no further callbacks can run, so the remaining teardown is safe unguarded. Flag to Kevin to fold into #40767. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary of the Pull Request
Reworks how WSLC session termination is surfaced to clients. The previous design used a push-style
ITerminationCallbackthat could only be subscribed at session-creation time (viaWSLCSessionSettings), so it was unavailable when opening an existing session. This replaces it with a pull model that mirrors the process-exit-event pattern: a one-off "terminated" event plus a cached termination reason that any holder of anIWSLCSessioncan query — whether the session was created or opened.PR Checklist
Detailed Description of the Pull Request / Additional comments
Two distinct lifecycle events on the session:
m_sessionTerminatingEvent(existing) — signaled at the start of teardown to cancel in-flight operations.m_sessionTerminatedEvent(new) — signaled once teardown is complete ("the session is done"), via either an explicitTerminate()or an unexpected VM exit. This is the event handed to clients.Runtime (service / per-user session):
IWSLCSessiongainsGetTerminationEvent(returns a duplicated, caller-owned handle to the terminated event that stays valid after the session is released) andGetTerminationReason(returns the cached reason + details; fails withERROR_INVALID_STATEuntil the session has actually terminated, so callers can distinguish "still running" from "terminated for an unknown reason").HcsVirtualMachine(service) computes the termination reason from the HCS exit status inOnExitand caches it;IWSLCVirtualMachinegainsGetTerminationReason.WSLCSession::Terminate()captures the reason once: it defaults toShutdownfor an explicit terminate (VM still alive = we shut it down), and pulls the real crash/shutdown reason from the VM when the VM exited on its own. Capture happens under the exclusive lock before the VM reference is released, and the terminated event is signaled last so any observer sees a fully torn-down session.m_terminatedbool removed —m_sessionTerminatedEvent.is_signaled()is now the single source of truth forGetState/GetTerminationReason.SDK (
wslcsdk):WslcSetSessionSettingsTerminationCallbackand theITerminationCallback/TerminationCallbackplumbing (files, CMake,.def, WiX proxy/stub registration).WslcGetSessionTerminationEventandWslcGetSessionTerminationReason.WslcSessionOptionsInternalshrank (removed callback + context fields); opaqueWSLC_SESSION_OPTIONS_SIZEupdated accordingly.IDL: removed the
ITerminationCallbackinterface and theTerminationCallbackfield fromWSLCSessionSettings; added the new methods toIWSLCSessionandIWSLCVirtualMachine.Validation Steps Performed
WSLCTests.cpp::TerminationEventwaits onIWSLCSession::GetTerminationEvent, verifiesGetTerminationReasonreturnsERROR_INVALID_STATEbefore termination andShutdownafter an explicitTerminate().WslcSdkTests.cpp::TerminationEventViaTerminateandTerminationEventViaReleasewait on the SDK-returned event (including verifying the handle stays valid after the session is released) and check the reason viaWslcGetSessionTerminationReason.