diagnostics: improve collect-wsl-logs for analysis (summary.json, README, profile info, WSL/guest state)#40776
Open
benhillis wants to merge 5 commits into
collect-wsl-logs.ps1 did not record which WPR profile was used for a capture. When analyzing an archive (e.g. a networking-only capture that lacks the WSL core trace providers), there was no way to tell which profile produced it without inferring it from the provider mix. Write a collection-info.txt into the log folder capturing the selected LogProfile, the mapped WPRP profile and file, the Dump and RestartWslReproMode switches, and the collection timestamp. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
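A minimal sketch of how such a metadata file could be written. The function name `Write-CollectionInfo`, its parameters, and the key names in the output are assumptions for illustration; the PR's actual code is not shown on this page.

```powershell
# Hypothetical helper: records the capture settings in collection-info.txt.
function Write-CollectionInfo {
    param(
        [Parameter(Mandatory)] [string] $Folder,
        [string] $LogProfile,            # empty is treated as the default profile
        [string] $WprpFile,
        [string] $WprpProfile,
        [bool]   $Dump,
        [bool]   $RestartWslReproMode
    )

    $selected = if ([string]::IsNullOrEmpty($LogProfile)) { '(default)' } else { $LogProfile }

    # One "Key: value" pair per line keeps the file easy to read and to grep.
    $lines = @(
        "LogProfile: $selected"
        "WprpFile: $WprpFile"
        "WprpProfile: $WprpProfile"
        "Dump: $Dump"
        "RestartWslReproMode: $RestartWslReproMode"
        "CollectedAt: $(Get-Date -Format 'o')"
    )

    $path = Join-Path $Folder 'collection-info.txt'
    Set-Content -LiteralPath $path -Value $lines -Encoding UTF8
    return $path
}
```

With a file like this in the archive, an analyzer can read the `LogProfile` line directly instead of guessing the profile from the provider mix.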
Pull request overview
This PR improves diagnostic trace archives by writing a collection-info.txt file into the collected log folder, capturing which capture/profile settings were used so downstream analysis doesn’t have to infer the profile from provider mix.
Changes:
- Add a `collection-info.txt` metadata block to the log folder describing the selected `LogProfile`, resolved WPRP profile/file, relevant switches, and timestamp.
Add three further debugging improvements to collect-wsl-logs.ps1:
- Collect wsl --version / --status / --list --verbose into wsl-info.txt instead of forcing analyzers to infer the version and distro layout from the appx package and registry.
- Collect guest-side state (dmesg, free, uptime, ulimit, pid_max, threads-max, process/thread counts, top RSS) into linux_diagnostics.log after the repro. This is the data needed to diagnose in-distro failures such as 'Resource temporarily unavailable' (EAGAIN) from resource limits.
- Remove 0-byte dump files left behind when MiniDumpWriteDump fails, so the archive only contains real dumps.

Also set WSL_UTF8 and the console output encoding so wsl.exe output is captured to the log files readably.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
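The empty-dump cleanup can be illustrated with a small sweep over the log folder. This is only a sketch: the commit describes removing the file at the point where the dump write fails, whereas the hypothetical `Remove-EmptyDump` below is a post-pass that gives the same end state.

```powershell
# Hypothetical cleanup pass: delete zero-length .dmp files from a folder
# and return the names of the files that were removed.
function Remove-EmptyDump {
    param([Parameter(Mandatory)] [string] $Folder)

    $removed = @()
    foreach ($file in Get-ChildItem -LiteralPath $Folder -Filter '*.dmp' -File) {
        if ($file.Length -eq 0) {
            Remove-Item -LiteralPath $file.FullName -Force
            $removed += $file.Name
        }
    }
    # The leading comma keeps the result an array even when it has 0 or 1 items.
    return ,$removed
}
```

Returning the removed names lets the caller note in the log that a dump was attempted and failed, rather than silently losing that fact.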
Make collected log archives easier to analyze (by a human or an agent) without having to run tools or infer state from individual artifacts:
- summary.json: machine-readable overview of the capture - profile, WSL/Windows versions, networking mode, installed distributions and their state, .wslconfig presence, and an inventory of non-empty dumps.
- README.md: an index of the archive contents describing each file, plus a note on how to decode logs.etl and how to tell when a non-default log profile was used.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
OneBlue
reviewed
Jun 11, 2026
…immer summary.json)
- Write wsl-info.txt as UTF-8 instead of the PS5.1 default UTF-16LE.
- Guard every newly-added wsl.exe call (wsl-info and guest diagnostics) with a timeout via a background job so a deadlocked service or bad VM state cannot hang log collection.
- Drop the duplicated distro-registry enumeration and .wslconfig networkingMode parsing from summary.json; that state is readily derived from HKCU.txt and the archived .wslconfig.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
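A rough sketch of the "timeout via a background job" idea. The helper name `Invoke-WithTimeout` and its parameters are hypothetical; the helper this commit added is not shown on this page. Note also that stopping the job may not terminate a child process such as wsl.exe that the job started.

```powershell
# Hypothetical helper: run a script block in a background job and give up
# after a timeout. Returns the job output, or $null if the timeout elapsed.
function Invoke-WithTimeout {
    param(
        [Parameter(Mandatory)] [scriptblock] $ScriptBlock,
        [int] $TimeoutSeconds = 30
    )

    $job = Start-Job -ScriptBlock $ScriptBlock
    try {
        # Wait-Job returns nothing when the timeout elapses before the job ends.
        $finished = Wait-Job -Job $job -Timeout $TimeoutSeconds
        if ($null -eq $finished) {
            Stop-Job -Job $job
            return $null
        }
        return Receive-Job -Job $job
    }
    finally {
        Remove-Job -Job $job -Force
    }
}
```

A guarded call might look like `Invoke-WithTimeout -ScriptBlock { wsl.exe --version } -TimeoutSeconds 20`, with the result piped to `Out-File -Encoding utf8` so that Windows PowerShell 5.1 does not fall back to its UTF-16LE default.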
Remove the unconditional wsl.exe calls this PR introduced (wsl-info.txt and linux_diagnostics.log) along with the now-unused timeout helper, per review feedback that the log collection script should not call wsl.exe (which can hang if the service is deadlocked or the VM is in a bad state). Pre-existing networking-profile wsl.exe calls are left untouched. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment on lines +427 to +445
    # Keep this summary basic: profile, versions and the dump inventory. State that
    # can be readily derived from the rest of the archive (the installed
    # distributions in HKCU.txt, the networking mode in .wslconfig) is intentionally
    # not duplicated here.
    $summary = [ordered]@{
        collectedAt = (Get-Date -Format "o")
        logProfile = $logProfileDisplay
        wprpProfile = $wprpProfileDisplay
        dump = [bool]$Dump
        restartWslReproMode = [bool]$RestartWslReproMode
        wslVersion = (Get-Prop $appx "Version")
        windows = [ordered]@{
            build = "$(Get-Prop $winCV 'CurrentBuild').$(Get-Prop $winCV 'UBR')"
            displayVersion = Get-Prop $winCV "DisplayVersion"
            edition = Get-Prop $winCV "EditionID"
        }
        wslConfigPresent = [bool](Test-Path $wslconfig)
        dumps = $dumps
    }
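The snippet above relies on a `Get-Prop` helper whose definition is not part of the quoted lines. The sketch below shows one way such a helper could behave safely under `Set-StrictMode -Version Latest`, and how an ordered hashtable serializes to JSON. The helper body and the `$winCV` values are assumptions and placeholders, not the PR's code or real capture data.

```powershell
Set-StrictMode -Version Latest

# Assumed behavior of Get-Prop: return the property value, or $null when the
# object or the property is missing, without raising a strict-mode error.
function Get-Prop {
    param($Object, [string] $Name)
    if ($null -eq $Object) { return $null }
    $prop = $Object.PSObject.Properties[$Name]
    if ($null -eq $prop) { return $null }
    return $prop.Value
}

# Placeholder input standing in for the registry values read by the script.
$winCV = [pscustomobject]@{ CurrentBuild = '26100'; UBR = 1 }

$summary = [ordered]@{
    collectedAt = (Get-Date -Format 'o')
    windows = [ordered]@{
        build = "$(Get-Prop $winCV 'CurrentBuild').$(Get-Prop $winCV 'UBR')"
        displayVersion = Get-Prop $winCV 'DisplayVersion'   # absent, so null
    }
    dumps = @()
}

# -Depth must cover the nested "windows" table; [ordered] preserves key order.
$json = $summary | ConvertTo-Json -Depth 4
```

Looking properties up through `PSObject.Properties` rather than with plain dot access is what keeps a missing registry value from aborting the summary under strict mode.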
Summary
Several improvements to `collect-wsl-logs.ps1` to make collected log archives more useful for debugging (by a human or an agent), motivated by a real investigation where the captured logs were in formats that were hard to analyze and lacked the context needed to diagnose the reported error.

Changes

1. Record the capture profile (`collection-info.txt`)
Previously there was no record of which WPR profile a capture used, so it had to be inferred from the trace provider mix, which is easy to get wrong. For example, a `-LogProfile networking` capture only contains VfpExt/HNS events and none of the WSL core trace providers, which can send an investigation down the wrong path. We now write a `collection-info.txt` recording the selected `LogProfile`, the mapped WPRP profile and file, the `Dump`/`RestartWslReproMode` switches, and the collection timestamp.

2. Machine-readable `summary.json`
A single structured file giving an immediate overview without running any tools: profile, WSL and Windows versions, networking mode, installed distributions and their state, `.wslconfig` presence, and an inventory of (non-empty) dumps.

3. Archive index (`README.md`)
Describes each artifact in the archive, how to decode `logs.etl`, and how to recognize when a non-default log profile was used.

4. Collect WSL version and distro state (`wsl-info.txt`)
Capture `wsl --version`, `wsl --status` and `wsl --list --verbose`, instead of forcing analyzers to infer the version and distro layout from the appx package and registry.

5. Collect guest-side diagnostics (`linux_diagnostics.log`)
After the repro, collect `dmesg`, `free -m`, `uptime`, `ulimit -a`, `pid_max`, `threads-max`, process/thread counts and top processes by RSS. This is the data needed to diagnose in-distro failures such as `Resource temporarily unavailable` (EAGAIN) caused by hitting process/thread/fd/memory limits, which the default capture never collected. Best-effort: skipped when WSL isn't installed or no distro is available.

6. Drop empty dump files
When `MiniDumpWriteDump` fails it left a 0-byte `.dmp` behind, so archives could contain many misleading empty dumps. Remove the file on failure.

Also sets `WSL_UTF8` and the console output encoding so `wsl.exe` output is captured to the log files readably.

Testing
- … `[Parser]::ParseFile`).
- … `collection-info.txt` and `summary.json` render as expected with real data (and that the summary logic is safe under `Set-StrictMode -Version Latest`).
- … `wsl --version` and the guest-diagnostics commands run and produce readable UTF-8 output.
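The Testing notes mention `[Parser]::ParseFile`. Assuming that refers to `System.Management.Automation.Language.Parser`, a parse-only check of a script might look like the sketch below; the function name `Get-ScriptParseError` is hypothetical.

```powershell
# Hypothetical check: parse a script without running it and return the
# parse errors (an empty array when the file parses cleanly).
function Get-ScriptParseError {
    param([Parameter(Mandatory)] [string] $Path)

    $tokens = $null
    $errors = $null
    # ParseFile expects a file-system path, so resolve relative paths first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    [void][System.Management.Automation.Language.Parser]::ParseFile(
        $fullPath, [ref]$tokens, [ref]$errors)
    return ,@($errors)
}
```

A check like this only catches syntax errors; it says nothing about whether the script behaves correctly at run time.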