Add native WSL2 VM backend for Windows#665

Closed
qwatts-dev wants to merge 8 commits into roots:master from qwatts-dev:master

Conversation

@qwatts-dev

Summary

Adds a wsl VM backend so Windows users get a native trellis vm experience using WSL2 — no nested VMs, no Vagrant, no VirtualBox.

This mirrors the Lima backend's role on macOS/Linux: each Trellis project gets its own isolated Ubuntu 24.04 environment with project files on ext4, Ansible running locally inside the distro, and ports accessible directly from Windows.

Discourse thread: https://discourse.roots.io/t/native-wsl2-vm-backend-for-trellis-on-windows-looking-for-testers/30281/2

Motivation

Windows users currently have two options, both painful:

  1. Lima inside WSL2 — QEMU nested virtualization is slow (~14s TTFB on DrvFS), VMs break when WSL sleeps, and port forwarding is fragile.
  2. Manual WSL setup — No trellis vm integration, no auto-provisioning, no config sync.

WSL2 is already a VM — this backend uses it directly instead of nesting another VM inside it.

What's new

New files

| File | Purpose |
| --- | --- |
| pkg/wsl/manager.go | Core vm.Manager implementation (~1150 lines) |
| pkg/wsl/hosts.go | Windows hosts file management with UAC elevation |
| pkg/wsl/ubuntu.go | Ubuntu rootfs URL registry (22.04, 24.04) |
| cmd/vm_open.go | Opens VS Code via --folder-uri vscode-remote://wsl+<distro>/path |
| cmd/vm_sync.go | Manual WSL→Windows rsync sync |
| cmd/vm_trust.go | Re-imports SSL certs into Windows trust store |
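
As an illustration of the vm open row, here is a minimal sketch of how the `--folder-uri` value could be assembled. The function name `wslFolderURI` and the example distro and path are hypothetical; only the `vscode-remote://wsl+<distro>/path` shape comes from the table above.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// wslFolderURI builds a VS Code remote folder URI for a WSL distro.
// Hypothetical helper: the PR only states the URI shape, not how it is built.
func wslFolderURI(distro, projectPath string) string {
	// Normalize to a single leading slash and collapse duplicate separators.
	clean := path.Clean("/" + projectPath)
	// Percent-encode characters such as spaces that are not valid in a URI path.
	escaped := (&url.URL{Path: clean}).EscapedPath()
	return "vscode-remote://wsl+" + distro + escaped
}

func main() {
	uri := wslFolderURI("trellis-example.com", "/home/admin/example.com")
	// The resulting string would be passed as: code --folder-uri <uri>
	fmt.Println("code --folder-uri", uri)
}
```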

Modified files

| File | Change |
| --- | --- |
| cmd/vm.go | case "wsl" in newVmManager() + two guard functions |
| cmd/vm_start.go | WSL bootstrap/provision flow, unprovisioned distro cleanup |
| cmd/vm_stop.go | Auto SyncBack before stop |
| cmd/vm_delete.go | windowsHostRequired() guard |
| cmd/vm_shell.go | windowsHostRequired() guard |
| trellis/trellis.go | VmManagerType() returns "wsl" on Windows, WSL auto-detection, CheckVirtualenv skip |
| cmd/db_open.go | Direct mysql:// URI for WSL (no SSH tunnel needed) |
| pkg/db_opener/tableplus.go | rundll32.exe URI opening for Windows |
| github/main.go | Retry loop for os.Rename (Windows antivirus file locks) |
| cmd/provision.go, cmd/deploy.go, etc. | wslTerminalRequired() guard on Ansible commands |
| main.go | Register new vm open, vm sync, vm trust commands |
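
The os.Rename retry mentioned for github/main.go can be sketched roughly as follows. This is not the PR's code: the helper name `renameWithRetry`, the attempt count, and the delay are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// renameWithRetry retries os.Rename a fixed number of times, pausing between
// attempts so a transient lock (for example from an antivirus scan) can clear.
// Hypothetical helper; attempts and delay are illustrative values.
func renameWithRetry(oldPath, newPath string, attempts int, delay time.Duration) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = os.Rename(oldPath, newPath); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("rename failed after %d attempts: %w", attempts, err)
}

// demo renames a temporary file and confirms the destination exists.
func demo() error {
	dir, err := os.MkdirTemp("", "rename-retry")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "download.tmp")
	dst := filepath.Join(dir, "release.zip")
	if err := os.WriteFile(src, []byte("payload"), 0o644); err != nil {
		return err
	}
	if err := renameWithRetry(src, dst, 5, 200*time.Millisecond); err != nil {
		return err
	}
	_, err = os.Stat(dst)
	return err
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("error:", err)
		os.Exit(1)
	}
	fmt.Println("renamed")
}
```

A bounded loop keeps the failure mode explicit: if the lock never clears, the last os.Rename error is returned wrapped rather than retried forever.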

Design decisions

  • Follows the Lima pattern. The wsl.Manager implements vm.Manager identically to lima.Manager. All WSL-specific code lives in pkg/wsl/ — no Windows logic scattered elsewhere.
  • Auto-detected. VmManagerType() returns "wsl" when runtime.GOOS == "windows" and the manager is "auto". No user configuration needed.
  • Project isolation. Each project gets its own WSL2 distro (named trellis-<site>). Projects are rsync'd to ext4 at /home/admin/<project>/.
  • No SSH. Uses ansible_connection=local with ansible_user=admin. No SSH keys, no tunnels.
  • Two guard functions keep users on the right track:
    • wslTerminalRequired() — redirects Ansible commands from Windows → "run trellis vm open first"
    • windowsHostRequired() — redirects VM management from WSL → "run this from Windows"
  • One project at a time. WSL2 distros share a network namespace (by design, per Microsoft). StartInstance prompts to sync and stop other running trellis-* distros.
  • Config sync. syncConfigFromWSL() rsyncs group_vars/ from ext4→Windows on manager init, keeping the Windows-side repo current.
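
The auto-detection and guard decisions above could look something like the sketch below. The function names (`resolveManagerType`, `needsWSLTerminal`, `needsWindowsHost`), their signatures, and the message wording are hypothetical stand-ins for VmManagerType(), wslTerminalRequired(), and windowsHostRequired(); the "lima" fallback on non-Windows platforms is also an assumption.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// resolveManagerType: an explicit setting wins, and "auto" resolves to "wsl"
// on Windows. The "lima" fallback elsewhere is assumed, not taken from the PR.
func resolveManagerType(configured, goos string) string {
	if configured != "auto" {
		return configured
	}
	if goos == "windows" {
		return "wsl"
	}
	return "lima"
}

// needsWSLTerminal reports whether an Ansible-backed command was started on
// the Windows host and should be run inside the distro instead.
func needsWSLTerminal(goos string) (string, bool) {
	if goos != "windows" {
		return "", false
	}
	return "Run 'trellis vm open' first, then use the WSL terminal.", true
}

// needsWindowsHost reports whether a VM management command was started from
// inside WSL, detected here by a non-empty WSL_DISTRO_NAME value.
func needsWindowsHost(command, wslDistroName string) (string, bool) {
	if wslDistroName == "" {
		return "", false
	}
	return fmt.Sprintf("Run '%s' from Windows PowerShell, not from inside WSL.", command), true
}

func main() {
	fmt.Println("manager:", resolveManagerType("auto", runtime.GOOS))
	if msg, blocked := needsWSLTerminal(runtime.GOOS); blocked {
		fmt.Println(msg)
	}
	if msg, blocked := needsWindowsHost("trellis vm start", os.Getenv("WSL_DISTRO_NAME")); blocked {
		fmt.Println(msg)
	}
}
```

Passing the OS name and distro name as parameters, rather than reading them inside each function, keeps the rules testable on any platform.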

Testing

Tested on Windows 11 with WSL2:

  • Fresh project (trellis new) — full bootstrap + provision + site loads
  • Existing production site (Sage theme, ACF Pro, restored database)
  • vm start / stop / shell / open / delete lifecycle
  • provision, deploy, db open commands
  • SSL cert trust import
  • Config sync (group_vars roundtrip)
  • Multiple distro switching (auto stop/sync of other projects)

Pre-built binaries available for hands-on testing: https://github.com/qwatts-dev/trellis-cli/releases/tag/v1.18.0-wsl2.1

Checklist

  • go vet ./... passes
  • Follows existing code patterns (command package for exec, color for output, promptui for prompts)
  • No changes to Lima backend behavior
  • macOS/Linux codepaths unaffected (WSL code gated behind runtime.GOOS == "windows" or WSL_DISTRO_NAME checks)

The upstream trellis-cli supports Lima for macOS/Linux local development.
This adds a WSL2 backend so Windows developers get the same first-class
experience via `trellis vm start`.

New WSL2 backend (pkg/wsl/):
- Manager implementing vm.Manager using wsl.exe commands
- WindowsHostsResolver for hosts file management with UAC elevation
- Ubuntu rootfs registry (22.04, 24.04)
- Bootstrap installs Python, Ansible, Node.js LTS, Corepack
- Project files copied to ext4 for native performance (~80ms vs ~14s TTFB)
- Auto-stops other trellis distros (shared network namespace)
- SyncBack prompt before stopping other running distros
- Breadcrumb file for cross-distro SyncBack support

New commands:
- vm open: Launch VS Code connected to WSL distro
- vm sync: Manual WSL-to-Windows file sync
- vm trust: Re-import self-signed SSL certs into Windows trust store

Enhanced existing commands:
- vm start/stop/delete/shell: WSL2 backend support
- db open: Works from both Windows and WSL terminals
- provision, deploy, vault, galaxy, xdebug-tunnel: Windows host
  detection with redirect to WSL terminal

Other changes:
- Windows os.Rename retry loop for antivirus file locks
- rundll32 URI handler (fixes cmd.exe & parsing in URIs)
- UTF-16LE decoder for wsl.exe output

feat: Add native WSL2 virtual machine backend for Windows
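
For the UTF-16LE decoder listed under "Other changes", a minimal standard-library sketch might look like this. The function name `decodeUTF16LE` is hypothetical, and the handling of a leading byte-order mark and of a trailing odd byte are assumptions about what such a decoder would need.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// decodeUTF16LE converts little-endian UTF-16 bytes into a Go (UTF-8) string.
// Hypothetical helper: a leading BOM is skipped and a trailing odd byte is dropped.
func decodeUTF16LE(b []byte) string {
	if len(b) >= 2 && b[0] == 0xFF && b[1] == 0xFE {
		b = b[2:] // skip the byte-order mark
	}
	units := make([]uint16, 0, len(b)/2)
	for i := 0; i+1 < len(b); i += 2 {
		units = append(units, binary.LittleEndian.Uint16(b[i:]))
	}
	return string(utf16.Decode(units))
}

func main() {
	// "OK\r\n" encoded as UTF-16LE: each ASCII byte is followed by a zero byte.
	raw := []byte{'O', 0, 'K', 0, '\r', 0, '\n', 0}
	fmt.Printf("%q\n", decodeUTF16LE(raw))
}
```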
The isProvisioned check relied solely on an external .provisioned marker file. Distros provisioned before the marker system was introduced (or whose marker was lost) were incorrectly identified as unprovisioned and silently deleted on the next vm start.

Changes:

- isProvisioned() now has a two-tier check: marker file first, then falls back to checking /etc/trellis-project-root (breadcrumb written during bootstrap) inside the distro. Self-heals the marker on success.

- vm start now prompts for confirmation before deleting a distro that appears unprovisioned, instead of silently deleting it.

Fix vm start silently deleting provisioned WSL distros
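
The two-tier check described in this commit can be sketched as below. The type `provisionChecks` and the function `provisioned` are hypothetical; the real isProvisioned() presumably inspects the marker file and /etc/trellis-project-root directly, whereas this sketch injects those checks as functions so the control flow is visible.

```go
package main

import "fmt"

// provisionChecks bundles the three operations the two-tier check relies on.
// All fields are hypothetical stand-ins for filesystem and wsl.exe calls.
type provisionChecks struct {
	markerExists     func() bool  // external .provisioned marker file
	breadcrumbExists func() bool  // /etc/trellis-project-root inside the distro
	writeMarker      func() error // restores the marker (self-heal)
}

// provisioned checks the marker first, then falls back to the breadcrumb and
// restores the marker when the fallback succeeds.
func provisioned(c provisionChecks) bool {
	if c.markerExists() {
		return true
	}
	if c.breadcrumbExists() {
		if err := c.writeMarker(); err != nil {
			fmt.Println("warning: could not restore marker:", err)
		}
		return true
	}
	return false
}

func main() {
	healed := false
	checks := provisionChecks{
		markerExists:     func() bool { return false },
		breadcrumbExists: func() bool { return true },
		writeMarker:      func() error { healed = true; return nil },
	}
	fmt.Println("provisioned:", provisioned(checks), "marker restored:", healed)
}
```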
@retlehs
Member

retlehs commented Apr 8, 2026

@qwatts-dev Great work and thanks for the PR! Steps I took to test on a PC:

I had to install Go for Windows along with Python

  1. I cloned your fork
  2. Ran the go build from PowerShell
  3. Confirmed trellis-cli.exe exists
  4. Ran trellis-cli.exe new
  5. Tried to run trellis-cli.exe init from PowerShell: ❌
PS C:\Users\Howdy\Projects\trellis-cli\windows.com\trellis> ..\..\trellis-cli.exe init
Initializing project...

[✓] Created virtualenv (C:\Users\Howdy\Projects\trellis-cli\windows.com\trellis\.trellis\virtualenv)
[✓] Ensure pip is up to date
[✘] Error installing dependencies

Switched over to WSL:

  1. Installed Go and Python with apt
  2. Ran trellis-linux init: the first attempt got stuck after trying to install deps, the second attempt worked
ben@DESKTOP-MKKS5TI:/mnt/c/Users/Howdy/Projects/trellis-cli/windows.com$ ../trellis-linux vm start
'trellis vm start' manages the WSL distro from the Windows host.
Run this command from your Windows PowerShell or Command Prompt, not from inside WSL.

Shouldn't this output explain to run "trellis-cli.exe"?

Switched back over to PowerShell:

PS C:\Users\Howdy\Projects\trellis-cli\windows.com> ..\trellis-cli.exe vm start

Provision completed without issues, local site loads as expected ✅

Why is there a separate trellis-linux binary instead of using the trellis-cli binary?

@qwatts-dev
Author

@retlehs Thanks for testing this, and great to hear provision and site loading worked!

Addressing your questions:

trellis init failing on Windows:

This is expected. init installs Python/Ansible on the host, which isn't needed with the WSL backend since those are installed inside the distro during vm start. I intentionally skip init during trellis new for this reason, but I didn't add a guard for running init standalone. Whoops. I'll add one that gives a clear message explaining that the WSL backend handles dependencies inside the VM automatically.

Guard message ("Shouldn't this output explain to run trellis-cli.exe?"):

Good catch on the UX. The message is technically accurate ("Run this command from Windows PowerShell, not from inside WSL"), but I agree it could be clearer about what to do. I'll improve the wording to echo back the actual command to run.
Note: I intentionally did not hardcode a specific binary name, since users will just have trellis (or optionally trellis.exe on Windows, I believe) in their PATH once this is merged upstream.

"Why is there a separate trellis-linux binary?"

This is purely a dev/fork convenience. The WSL distro needs a Linux build of trellis-cli so we can run commands like trellis provision and trellis db open from VS Code's WSL terminal and pick up the new WSL features added to those commands. Since the upstream install script can't fetch a binary with my fork's changes, I cross-compile a Linux binary and copy it in during bootstrap.

For upstream, this would be replaced by the official install script, which already supports Linux. Bootstrap would just run it inside the distro to install the correct platform binary. No separate trellis-linux needed.

@retlehs
Member

retlehs commented Apr 8, 2026

👍 no need to tweak the messaging to account for “trellis.exe”

@retlehs
Member

retlehs commented Apr 8, 2026

Do you want to make a new PR based off a branch on your fork so that @swalkinshaw and I can make direct edits without affecting your fork for now? (Make sure to allow maintainers to make edits when opening the PR)

- Add WSL guard to 'trellis init': detects WSL backend and prints a
  message explaining that dependencies are managed inside the VM
  automatically, instead of failing with a virtualenv error.

- Improve windowsHostRequired() guard message: echoes back the actual
  command to run (e.g. "Run 'trellis vm start' from Windows PowerShell")
  instead of the generic "Run this command from PowerShell".

- Add windowsHostRequired() guard to 'vm open' and 'vm sync': users
  running these from inside WSL now get the correct "run from PowerShell"
  message instead of a confusing "only supported on Windows (WSL2)" error.

- Make CLI install in bootstrap upstream-ready: checks for a cross-compiled
  trellis-linux sidecar first (dev/fork builds), falls back to the official
  install script (scripts/get) for upstream releases. Previously, if no
  sidecar was found, the distro silently had no CLI binary.
Improve WSL UX: init guard, guard messages, and upstream-ready CLI install
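
The sidecar-first, script-second install order from this commit could be expressed as a small decision function. Everything named here (`installPlan`, `planCLIInstall`, the method labels) is hypothetical; only the trellis-linux sidecar and the scripts/get fallback come from the commit message.

```go
package main

import (
	"fmt"
	"os"
)

// installPlan describes how the CLI would be installed inside the distro.
type installPlan struct {
	method string // "sidecar" or "script" (illustrative labels)
	source string // path of the sidecar binary or of the install script
}

// planCLIInstall prefers a cross-compiled sidecar binary when one exists and
// otherwise falls back to the official install script.
func planCLIInstall(sidecarPath string, fileExists func(string) bool) installPlan {
	if sidecarPath != "" && fileExists(sidecarPath) {
		return installPlan{method: "sidecar", source: sidecarPath}
	}
	return installPlan{method: "script", source: "scripts/get"}
}

func main() {
	exists := func(p string) bool {
		_, err := os.Stat(p)
		return err == nil
	}
	plan := planCLIInstall("trellis-linux", exists)
	fmt.Printf("install via %s (%s)\n", plan.method, plan.source)
}
```

Returning a plan instead of performing the install means the "no sidecar found" case always resolves to an explicit fallback, which is the gap the commit describes closing.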
@qwatts-dev
Author

Oh, I'd already pushed a few small UX tweaks before seeing your latest message. They are:

  • Added a guard to trellis init so it explains the WSL backend handles dependencies automatically instead of failing with a virtualenv error
  • Improved the windowsHostRequired() guard message to echo back the actual command (now it reads like: "Run trellis vm start from Windows PowerShell" or "Run trellis vm shell from PowerShell", etc.)
  • Added the same guard to vm open and vm sync (they were showing a confusing "only supported on Windows" message when run from inside WSL)
  • Made the CLI install in bootstrap upstream-ready: it now checks for the dev trellis-linux binary first, then falls back to the official install script (scripts/get) for upstream releases

I'll close this PR and open a new one from a branch on my fork with maintainer edits enabled. One sec...
