Add native WSL2 VM backend for Windows#667
The upstream trellis-cli supports Lima for macOS/Linux local development. This adds a WSL2 backend so Windows developers get the same first-class experience via `trellis vm start`.

New WSL2 backend (pkg/wsl/):
- Manager implementing vm.Manager using wsl.exe commands
- WindowsHostsResolver for hosts file management with UAC elevation
- Ubuntu rootfs registry (22.04, 24.04)
- Bootstrap installs Python, Ansible, Node.js LTS, Corepack
- Project files copied to ext4 for native performance (~80ms vs ~14s TTFB)
- Auto-stops other trellis distros (shared network namespace)
- SyncBack prompt before stopping other running distros
- Breadcrumb file for cross-distro SyncBack support

New commands:
- vm open: Launch VS Code connected to WSL distro
- vm sync: Manual WSL-to-Windows file sync
- vm trust: Re-import self-signed SSL certs into Windows trust store

Enhanced existing commands:
- vm start/stop/delete/shell: WSL2 backend support
- db open: Works from both Windows and WSL terminals
- provision, deploy, vault, galaxy, xdebug-tunnel: Windows host detection with redirect to WSL terminal

Other changes:
- Windows os.Rename retry loop for antivirus file locks
- rundll32 URI handler (fixes cmd.exe & parsing in URIs)
- UTF-16LE decoder for wsl.exe output
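For readers unfamiliar with the UTF-16LE item above, here is a rough sketch of what such a decoder can look like. It is not the code in pkg/wsl/: `decodeUTF16LE` and `splitLines` are hypothetical names, and the sketch assumes the wsl.exe output arrives as raw UTF-16LE bytes, possibly prefixed with a byte order mark.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strings"
	"unicode/utf16"
)

// decodeUTF16LE (hypothetical helper) converts raw UTF-16LE bytes into a Go string.
func decodeUTF16LE(b []byte) string {
	if len(b)%2 != 0 {
		b = b[:len(b)-1] // drop a dangling odd byte rather than misread it
	}
	units := make([]uint16, 0, len(b)/2)
	for i := 0; i < len(b); i += 2 {
		units = append(units, binary.LittleEndian.Uint16(b[i:i+2]))
	}
	if len(units) > 0 && units[0] == 0xFEFF {
		units = units[1:] // strip the byte order mark if present
	}
	return string(utf16.Decode(units))
}

// splitLines (hypothetical helper) returns the non-empty lines of s,
// with carriage returns and surrounding spaces trimmed.
func splitLines(s string) []string {
	var lines []string
	for _, l := range strings.Split(s, "\n") {
		if l = strings.TrimSpace(l); l != "" {
			lines = append(lines, l)
		}
	}
	return lines
}

func main() {
	// Simulated output: BOM, "a", CRLF, "b", all encoded as UTF-16LE.
	raw := []byte{0xFF, 0xFE, 'a', 0, '\r', 0, '\n', 0, 'b', 0}
	fmt.Println(splitLines(decodeUTF16LE(raw)))
}
```

Without a step like this, every other byte of the output is a NUL, so string comparisons against distro names fail.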
…backend feat: Add native WSL2 virtual machine backend for Windows
# Conflicts: # README.md
The isProvisioned check relied solely on an external .provisioned marker file. Distros provisioned before the marker system was introduced (or whose marker was lost) were incorrectly identified as unprovisioned and silently deleted on the next vm start.

Changes:
- isProvisioned() now has a two-tier check: marker file first, then falls back to checking /etc/trellis-project-root (breadcrumb written during bootstrap) inside the distro. Self-heals the marker on success.
- vm start now prompts for confirmation before deleting a distro that appears unprovisioned, instead of silently deleting it.
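The two-tier check described above can be sketched as follows. This is only an illustration of the control flow, not the PR's implementation: the `provisionProbe` type and its fields are hypothetical, and the real checks (reading the marker file, probing the breadcrumb inside the distro) are replaced by injected functions so the sketch runs anywhere.

```go
package main

import "fmt"

// provisionProbe (hypothetical) bundles the three operations the check needs.
type provisionProbe struct {
	markerExists     func() bool  // tier 1: external .provisioned marker file
	breadcrumbExists func() bool  // tier 2: /etc/trellis-project-root inside the distro
	writeMarker      func() error // recreates the marker (self-heal)
}

// isProvisioned checks the marker first and falls back to the breadcrumb.
// When only the breadcrumb is found, it tries to restore the marker.
func isProvisioned(p provisionProbe) bool {
	if p.markerExists() {
		return true
	}
	if !p.breadcrumbExists() {
		return false
	}
	// Self-heal is best effort: a failed write does not change the answer.
	_ = p.writeMarker()
	return true
}

func main() {
	healed := false
	p := provisionProbe{
		markerExists:     func() bool { return false },
		breadcrumbExists: func() bool { return true },
		writeMarker:      func() error { healed = true; return nil },
	}
	fmt.Println(isProvisioned(p), healed)
}
```

Treating the marker write as best effort keeps a read-only or locked marker location from turning a provisioned distro back into a deletion candidate.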
…fety Fix vm start silently deleting provisioned WSL distros
…stall

- Add WSL guard to 'trellis init': detects WSL backend and prints a message explaining that dependencies are managed inside the VM automatically, instead of failing with a virtualenv error.
- Improve windowsHostRequired() guard message: echoes back the actual command to run (e.g. "Run 'trellis vm start' from Windows PowerShell") instead of the generic "Run this command from PowerShell".
- Add windowsHostRequired() guard to 'vm open' and 'vm sync': users running these from inside WSL now get the correct "run from PowerShell" message instead of a confusing "only supported on Windows (WSL2)" error.
- Make CLI install in bootstrap upstream-ready: checks for a cross-compiled trellis-linux sidecar first (dev/fork builds), falls back to the official install script (scripts/get) for upstream releases. Previously, if no sidecar was found, the distro silently had no CLI binary.
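To make the improved guard message concrete, here is a minimal sketch of a guard that echoes back the command. `windowsHostGuard` is a hypothetical stand-in for windowsHostRequired(); its signature and the way the caller learns it is inside WSL are assumptions, and only the message text comes from the commit above.

```go
package main

import "fmt"

// windowsHostGuard (hypothetical) returns the redirect message for a command
// that must run on the Windows host, or "" when the command may proceed.
func windowsHostGuard(insideWSL bool, command string) string {
	if !insideWSL {
		return ""
	}
	// Echo the exact command so the user can copy it into PowerShell.
	return fmt.Sprintf("Run 'trellis %s' from Windows PowerShell", command)
}

func main() {
	if msg := windowsHostGuard(true, "vm start"); msg != "" {
		fmt.Println(msg)
	}
}
```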
Improve WSL UX: init guard, guard messages, and upstream-ready CLI install
@qwatts-dev thanks for your contribution here; this is pretty cool to see. However, it's not something we can officially support and integrate into trellis-cli.

The problem with this model of Windows support is that it effectively creates an entirely separate mode of execution compared to what exists now, with Ansible running on the host. Supporting it would require either a complete refactor or conditionals in almost every single command, as this PR has now. That would make the codebase much more complex and harder to maintain and test.

We'd be interested in exploring QEMU nested virtualization more to see if there's any way to improve performance there. If that were usable, it would only require minimal changes to officially support it.
Thanks for the thorough explanation @swalkinshaw! I genuinely appreciate it, and the reasoning makes total sense. As I was setting the conditionals, I was thinking "man this will change the experience for Windows devs a LOT", haha. So I knew that would be the awkward part of this PR.

For the QEMU nested virtualization direction: funny timing! I actually built a working QEMU-on-WSL bash shim for my team before attempting this native WSL path (talked about it a bit in my Roots Discourse post).

Just a heads up as you guys explore that route for Windows users: the biggest hurdle we ran into was that Windows browsers can't natively route to the internal network that QEMU sets up inside WSL2. My shim (about 1,500 lines of bash) hacked around this using SSH tunnels and port forwarding to expose the VM to Windows, plus some UAC prompts to manage the Windows hosts file and SSL cert trust (similar to how this fork/PR handled SSL). It works, but that heavy complexity is actually what pushed me toward trying this native WSL path instead.

All in all, it's been an awesome experience contributing! As y'all explore QEMU nested virtualization, I'm happy to dump my existing bash scripts and WSL network notes here or in an issue.
Replaces #665 (moved to a branch per @retlehs' request so maintainers can make direct edits).
Summary
Adds a `wsl` VM backend so Windows users get a native `trellis vm` experience using WSL2: no nested VMs, no Vagrant, no VirtualBox.

This mirrors the Lima backend's role on macOS/Linux: each Trellis project gets its own isolated Ubuntu 24.04 environment with project files on ext4, Ansible running locally inside the distro, and ports accessible directly from Windows.
Discourse thread: https://discourse.roots.io/t/native-wsl2-vm-backend-for-trellis-on-windows-looking-for-testers/30281
Motivation
Windows users currently have two options, both painful:
… `trellis vm` integration, no auto-provisioning, no config sync.

WSL2 is already a VM; this backend uses it directly instead of nesting another VM inside it.
What's new
New files
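- `pkg/wsl/manager.go`: `vm.Manager` implementation (~1150 lines)
- `pkg/wsl/hosts.go`: `WindowsHostsResolver` for hosts file management with UAC elevation
- `pkg/wsl/ubuntu.go`: Ubuntu rootfs registry (22.04, 24.04)
- `cmd/vm_open.go`: `vm open`, launches VS Code connected to the WSL distro (`--folder-uri vscode-remote://wsl+<distro>/path`)
- `cmd/vm_sync.go`: `vm sync`, manual WSL-to-Windows file sync
- `cmd/vm_trust.go`: `vm trust`, re-imports self-signed SSL certs into the Windows trust store

Modified files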
- `cmd/vm.go`: `case "wsl"` in `newVmManager()` + two guard functions
- `cmd/vm_start.go`: WSL2 backend support
- `cmd/vm_stop.go`: WSL2 backend support
- `cmd/vm_delete.go`: `windowsHostRequired()` guard
- `cmd/vm_shell.go`: `windowsHostRequired()` guard
- `cmd/init.go`: WSL guard for `trellis init`
- `trellis/trellis.go`: `VmManagerType()` returns `"wsl"` on Windows, WSL auto-detection, `CheckVirtualenv` skip
- `cmd/db_open.go`: `mysql://` URI for WSL (no SSH tunnel needed)
- `pkg/db_opener/tableplus.go`: `rundll32.exe` URI opening for Windows
- `github/main.go`: `os.Rename` retry loop (Windows antivirus file locks)
- `cmd/provision.go`, `cmd/deploy.go`, etc.: `wslTerminalRequired()` guard on Ansible commands
- `main.go`: `vm open`, `vm sync`, `vm trust` commands

Design decisions
- `wsl.Manager` implements `vm.Manager` identically to `lima.Manager`. All WSL-specific code lives in `pkg/wsl/`; no Windows logic is scattered elsewhere.
- `VmManagerType()` returns `"wsl"` when `runtime.GOOS == "windows"` and the manager is `"auto"`. No user configuration needed.
- Each project gets its own distro (`trellis-<site>`). Projects are rsync'd to ext4 at `/home/admin/<project>/`.
- `ansible_connection=local` with `ansible_user=admin`. No SSH keys, no tunnels.
- `wslTerminalRequired()`: redirects Ansible commands from Windows → "run `trellis vm open` first"
- `windowsHostRequired()`: redirects VM management from WSL → "run `trellis <command>` from Windows PowerShell"
- `StartInstance` prompts to sync and stop other running `trellis-*` distros.
- `syncConfigFromWSL()` rsyncs `group_vars/` from ext4→Windows on manager init, keeping the Windows-side repo current.
- Bootstrap installs `trellis` inside the distro via the official install script (`scripts/get`). A dev override checks for a local cross-compiled binary first (for testing from source before a release exists).

Changes since #665
- Add WSL guard to `trellis init`: prints a clear message instead of failing with a virtualenv error
- Improve `windowsHostRequired()` guard to echo back the actual command (e.g. "Run 'trellis vm start' from Windows PowerShell")
- Add `windowsHostRequired()` guard to `vm open` and `vm sync` (were showing confusing "only supported on Windows" when run from WSL)
- Make CLI install in bootstrap upstream-ready: cross-compiled sidecar first (dev/fork builds), then `scripts/get` (upstream releases)
- Fix `vm start` silently deleting provisioned distros that predate the marker file system (two-tier provisioned check with confirmation prompt)

Testing
Tested on Windows 11 with WSL2:
- … (`trellis new`): full bootstrap + provision + site loads ✅ (also verified by @retlehs in Add native WSL2 VM backend for Windows #665)
- `vm start/stop/shell/open/delete` lifecycle
- `provision`, `deploy`, `db open` commands
- `trellis init` guard message on Windows ✅

Checklist
- `go vet ./...` passes
- `golangci-lint run` passes (0 issues)
- … (`command` package for exec, `color` for output, `promptui` for prompts)
- … `runtime.GOOS == "windows"` or `WSL_DISTRO_NAME` checks)
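As a companion to the `runtime.GOOS == "windows"` and `WSL_DISTRO_NAME` checks mentioned throughout, here is a rough sketch of how the platform detection and `"auto"` resolution could fit together. It is not the PR's code: `detectHost`, `hostKind`, and `resolveManagerType` are hypothetical names, and falling back to `"lima"` on non-Windows hosts is an assumption based on the Lima backend being the existing macOS/Linux option.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// hostKind (hypothetical) classifies where the CLI is running.
type hostKind int

const (
	hostOther   hostKind = iota // macOS or plain Linux
	hostWindows                 // native Windows, e.g. PowerShell
	hostWSL                     // a terminal inside a WSL distro
)

// detectHost takes the OS name and an environment lookup as parameters
// so it can be exercised without running on Windows or WSL.
func detectHost(goos string, getenv func(string) string) hostKind {
	switch {
	case goos == "windows":
		return hostWindows
	case goos == "linux" && getenv("WSL_DISTRO_NAME") != "":
		return hostWSL
	default:
		return hostOther
	}
}

// resolveManagerType mirrors the described behavior: "auto" becomes "wsl"
// on Windows. The "lima" fallback elsewhere is an assumption.
func resolveManagerType(configured, goos string) string {
	if configured != "auto" {
		return configured
	}
	if goos == "windows" {
		return "wsl"
	}
	return "lima"
}

func main() {
	fmt.Println(detectHost(runtime.GOOS, os.Getenv))
	fmt.Println(resolveManagerType("auto", runtime.GOOS))
}
```

Passing `goos` and `getenv` in as arguments keeps both functions testable on any platform, which matters for logic that otherwise only runs on Windows.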