Skip to content

Out-of-process plugin host via threaded callback pump (alternative to #40120)#40769

Draft
benhillis wants to merge 3 commits into
masterfrom
user/benhill/plugin_host_threaded
Draft

Out-of-process plugin host via threaded callback pump (alternative to #40120)#40769
benhillis wants to merge 3 commits into
masterfrom
user/benhill/plugin_host_threaded

Conversation

@benhillis

Copy link
Copy Markdown
Member

Draft / alternative design. This is an alternative to #40120 for evaluation, built on the same out-of-process plugin host work. It replaces that PR's m_callbackLock synchronization with a threaded callback pump (per @OneBlue's review suggestion). Opened as a draft to compare the two approaches.

Summary

WSL plugin DLLs are moved out of wslservice.exe into a separate wslpluginhost.exe COM server (same as #40120). The difference is how plugin callback re-entrancy is handled while the service is blocked in an outbound notification:

  • Isolate plugins in an out-of-process COM host #40120 adds a second lock (std::shared_mutex m_callbackLock) plus a session-ref registry, with a dual-lock discipline (m_instanceLock + m_callbackLock) spread across callback-reachable session state.
  • This PR runs each outbound notification (host->On...) on a worker thread and pumps the plugin's service-side API calls back onto the original notifying thread, which already holds the session's recursive m_instanceLock. This reproduces the in-process re-entrancy model with a single lock — no m_callbackLock, no new dual-lock annotations.

Trade-off: the pump is ~+195 net code lines (a new PluginCallPump primitive + per-notification wiring) but keeps the single-lock model the rest of the session already assumes.

PR Checklist

Detailed Description

PluginCallPump (src/windows/service/exe/PluginCallPump.{h,cpp})

Run(notification) spawns a worker thread that makes the outbound COM call; the calling (notifying) thread pumps queued Invoke(work) calls and runs them under m_instanceLock, so recursive locks re-enter exactly as in-process. Invoke reports ran-vs-stopped out-of-band (no HRESULT sentinel), is exception-safe, and cannot strand an RPC thread.

Routing (PluginManager.cpp)

  • WSL notifications (OnVm*, OnDistribution*) route via RunHostNotification, which registers a per-session pump for the duration of the call.
  • WSL callbacks (MountFolder, ExecuteBinary[InDistribution]) route via InvokeOnWslPump: pump when a hook is in flight, direct (own m_instanceLock) otherwise — preserving async out-of-hook callbacks.

Deadlock fix

A callback racing the pre-notification window (lock held, pump not yet registered) must not block on m_instanceLock. InvokeOnWslPump uses a timed acquire (TryInvokeUnderInstanceLock) and re-checks for a pump, routing the racing callback onto the pump instead of dead-blocking.

Scope note

The WSLC session-ref registry is intentionally left as-is (its lock is non-recursive, notifications already fire outside it, and it is entangled with the create-veto protocol). Documented as a follow-up.

Validation Steps

cmake --build . -- -m
bin\x64\Debug\test.bat /name:PluginTests::*

Result: Total=31, Passed=30, Failed=0, Skipped=1 (skip = signed-build-only signature test). Includes the previously-deadlocking CallbacksDuringTerminationDoNotCrash.

Copilot AI review requested due to automatic review settings June 11, 2026 05:03

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements an alternative callback re-entrancy strategy for the out-of-process WSL plugin host design: instead of adding a second callback lock, it introduces a threaded notification + callback “pump” so plugin callbacks are executed back on the original notifying thread (preserving the existing single m_instanceLock re-entrancy model).

Changes:

  • Add a new out-of-process COM local server wslpluginhost.exe implementing IWslPluginHost and forwarding plugin API callbacks back to the service via IWslPluginHostCallback.
  • Refactor PluginManager to activate per-plugin hosts via COM (GIT marshaling), run outbound notifications through PluginCallPump, and route inbound callbacks via the pump or a timed direct-lock path to avoid deadlocks.
  • Expand/adjust Windows tests and installer/packaging to include wslpluginhost.exe and validate host-crash isolation + concurrency/callback scenarios.

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
test/windows/testplugin/Plugin.cpp Extends test plugin to exercise host-crash, concurrent callbacks, async worker callbacks, and teardown races.
test/windows/PluginTests.h Adds new PluginTestType cases for out-of-proc host scenarios.
test/windows/PluginTests.cpp Adds/updates tests to validate host crash behavior, concurrent callbacks, async worker callbacks, and teardown-race stability.
test/windows/InstallerTests.cpp Ensures installed executable set includes wslpluginhost.exe.
test/windows/Common.h Adds wslpluginhost.exe to test “BinaryUnderTest” list; declares service PID helpers.
test/windows/Common.cpp Implements GetWslServicePid() and GetWslServiceRunningPid() to validate service survival without silent restarts.
src/windows/wslpluginhost/exe/resource.h Adds resource declarations for the new plugin host executable.
src/windows/wslpluginhost/exe/PluginHost.h Declares the COM host class that loads plugin DLLs and forwards callbacks to the service.
src/windows/wslpluginhost/exe/PluginHost.cpp Implements plugin host initialization, hook dispatch, and service-callback stubs (including COM init on worker threads).
src/windows/wslpluginhost/exe/main.rc Adds version/icon resources for wslpluginhost.exe.
src/windows/wslpluginhost/exe/main.cpp Implements COM local server lifecycle (factory registration, mitigation policies, startup timeout).
src/windows/wslpluginhost/exe/CMakeLists.txt Builds the new wslpluginhost target and links required libraries/headers.
src/windows/wslpluginhost/CMakeLists.txt Adds plugin host subdirectory to the build.
src/windows/wslinstall/DllMain.cpp Registers wslpluginhost.exe in LSP categories list.
src/windows/service/stub/CMakeLists.txt Adds generated proxy/stub sources for WslPluginHost.idl into wslserviceproxystub.dll.
src/windows/service/inc/WslPluginHost.idl Introduces COM interfaces IWslPluginHost and IWslPluginHostCallback plus CLSID.
src/windows/service/inc/CMakeLists.txt Adds wslpluginhostidl generation.
src/windows/service/exe/WSLCSessionManager.h Reworks session iteration helpers to avoid holding session lock across out-of-proc plugin notifications.
src/windows/service/exe/WSLCSessionManager.cpp Moves stopping notifications out from under session lock; adds rollback/race handling around OnWslcSessionCreated.
src/windows/service/exe/ServiceMain.cpp Calls g_pluginManager.Shutdown() before CoUninitialize to avoid proxy teardown after COM shutdown.
src/windows/service/exe/PluginManager.h Major refactor for out-of-proc hosts, GIT proxying, job object ownership, WSLC session ref-map, and pump routing.
src/windows/service/exe/PluginManager.cpp Implements out-of-proc activation, host-crash latching, pump-driven notifications, and callback routing logic.
src/windows/service/exe/PluginCallPump.h Adds the pump primitive for running notifications on a worker while executing callbacks on the notifying thread.
src/windows/service/exe/PluginCallPump.cpp Implements pump queueing, worker coordination, and safe shutdown semantics.
src/windows/service/exe/LxssUserSession.h Adds TryInvokeUnderInstanceLock to support timed direct-callback execution without deadlocking.
src/windows/service/exe/LxssUserSession.cpp Implements TryInvokeUnderInstanceLock.
src/windows/service/exe/CMakeLists.txt Adds PluginCallPump and wslpluginhostidl dependency to wslservice.
src/windows/common/precomp.h Adds <shared_mutex> include to shared PCH.
msipackage/package.wix.in Installs wslpluginhost.exe and registers its COM/AppID/Interface entries.
msipackage/CMakeLists.txt Adds wslpluginhost.exe to MSI packaging inputs and dependencies.
CMakeLists.txt Adds src/windows/wslpluginhost to the top-level build.
.pipelines/build-stage.yml Adds wslpluginhost to CI build targets/artifact patterns.

Comment thread src/windows/service/exe/PluginManager.cpp Outdated
@benhillis benhillis force-pushed the user/benhill/plugin_host_threaded branch from deedac2 to 67d56d6 Compare June 11, 2026 05:15
WSL plugin DLLs are moved out of wslservice.exe into a separate
wslpluginhost.exe COM server so plugin code can no longer crash or
destabilize the service. Each plugin is activated in its own host
process (CLSCTX_LOCAL_SERVER, SYSTEM-only via AppID) and reached
through a versioned COM interface defined in WslPluginHost.idl. All
hosts are tied to a service-owned job object and terminate when
wslservice exits. The plugin API is unchanged; existing plugins run
unmodified.

A crashing or disconnected host is classified by IsHostCrash
(RPC_E_DISCONNECTED, RPC_E_SERVER_DIED[_DNE], CO_E_OBJNOTCONNECTED,
RPC_S_SERVER_UNAVAILABLE, RPC_S_CALL_FAILED[_DNE]); the service logs
it and continues instead of treating it as a fatal plugin error.
RPC_E_CALL_REJECTED is intentionally excluded as a transient busy
state rather than a dead host.

Plugin->service callbacks (MountFolder, ExecuteBinary, and the WSLC
session APIs) arrive on a different COM thread than the outbound hook,
so they cannot re-enter the lock held during the hook:
- VM path: LxssUserSessionImpl guards callbacks with a shared_mutex
  (shared for callbacks, exclusive in _VmTerminate after OnVmStopping
  drains in-flight callbacks before the utility VM is destroyed).
- WSLC path: PluginManager resolves sessions through its own
  reference map under a dedicated lock, and WSLCSessionManager
  releases its session lock before any plugin notification fires, so
  callbacks never re-enter the session lock. A session is registered
  in the reference map but not published until OnWslcSessionCreated
  succeeds, so a vetoed or race-lost session is never handed out.

Proxy/stub is consolidated into wslserviceproxystub.dll. One new exe,
no new DLLs.

Tests
- HostCrashIsolation: kills wslpluginhost.exe mid-OnVmStarted and
  verifies the service survives and m_initOnce stays sticky.
- ConcurrentCallbacks: four plugin threads hammer MountFolder and
  ExecuteBinary, exercising the shared callback lock.
- AsyncApiCallFromWorker: a plugin worker thread calls into the
  service post-hook (cross-apartment, non-COM-initialized).
- CallbacksDuringTerminationDoNotCrash: worker threads race
  _VmTerminate's exclusive lock and VM teardown, then wind down
  deterministically after OnVmStopping signals them and are joined on
  the next session start.
- Existing WSL1 plugin tests broadened alongside the refactor.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 11, 2026 06:28
@benhillis benhillis force-pushed the user/benhill/plugin_host_threaded branch from 67d56d6 to 22ab719 Compare June 11, 2026 06:28

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated 2 comments.

Comment thread src/windows/service/exe/WSLCSessionManager.h
Comment thread src/windows/service/exe/PluginManager.cpp
Ben Hillis and others added 2 commits June 10, 2026 23:44
- Pass the kill-on-close job object into IWslPluginHost::Initialize and drop
  GetProcessHandle so the host assigns itself to the job before running any
  plugin code.
- Return an IWSLCProcess reference from WslcCreateProcess instead of opaque
  process cookies; remove the cookie bookkeeping methods.
- Make plugin host crashes fatal: a crash during a start/veto hook (or at
  load) blocks the WSL operation with a recorded, plugin-named error, matching
  pre-refactor behavior. Latch the first crash so subsequent operations fail
  fast with one consistent error instead of repeatedly driving a dead host
  (m_pluginError guarded by m_pluginErrorLock). Teardown hooks latch but stay
  non-fatal. Re-activation/non-fatal resilience is a follow-up.
- Drop the sidData empty-guards so WSL and WSLC hooks pass the serialized SID
  consistently.
- Rework the HostCrash test to assert the new fatal behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rocess plugin host

Alternative to the m_callbackLock design in PR #40120, per @OneBlue's review.

Instead of guarding plugin callbacks with a new shared_mutex (m_callbackLock) and
a session-ref registry, this runs each outbound notification (host->On...) on a
worker thread and pumps the plugin's service-side API calls back onto the original
notifying thread, which already holds the session's recursive m_instanceLock. This
reproduces in-process re-entrancy and lets us delete the synchronization the PR added.

- New PluginCallPump primitive (Run/Invoke): the worker makes the COM notification
  call; the notifying thread pumps queued callback work and runs it under m_instanceLock.
- PluginManager: WSL notifications (OnVm*/OnDistribution*) route via RunHostNotification;
  WSL callbacks (MountFolder/ExecuteBinary[InDistribution]) route via a hybrid
  InvokeOnWslPump - pump when a hook is in flight, direct (own m_instanceLock) when not,
  preserving async out-of-hook callbacks.
- Revert m_callbackLock in LxssUserSessionImpl; MountRootNamespaceFolder/CreateLinuxProcess
  go back to the recursive m_instanceLock. Add LxssUserSessionImpl::TryInvokeUnderInstanceLock
  (timed try_lock_for on m_instanceLock) used by the direct path.
- Deadlock fix: a callback racing the pre-notification window (lock held, pump not yet
  registered) must not block on m_instanceLock - InvokeOnWslPump uses a timed acquire and
  re-checks for a pump, so the racing callback is routed onto the pump instead of dead-blocking.
- Hardening (from multi-model review): PluginCallPump::Invoke reports ran-vs-stopped
  out-of-band (no RPC_E_DISCONNECTED sentinel overload, avoids double-execution); the pump
  loop polls workerDone each iteration so callback traffic cannot starve the stop path;
  Run is exception-safe so std::thread creation failure never escapes into teardown hooks.
- WSLC session-ref registry left as-is (documented follow-up): its lock is non-recursive,
  notifications already fire outside it, and it is entangled with the create-veto protocol.
- Update test/plugin comments to describe the pump model (assertions unchanged and still pass).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@benhillis benhillis force-pushed the user/benhill/plugin_host_threaded branch from 22ab719 to cdb7b18 Compare June 11, 2026 06:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants