diagnostics: improve collect-wsl-logs for analysis (summary.json, README, profile info, WSL/guest state) #40776

Open
benhillis wants to merge 5 commits into microsoft:master from benhillis:benhillis/log-collection-profile-info

@benhillis benhillis commented Jun 11, 2026

Member

Summary

Several improvements to collect-wsl-logs.ps1 to make collected log archives more useful for debugging (by a human or an agent), motivated by a real investigation where the captured logs were in formats that were hard to analyze and lacked the context needed to diagnose the reported error.

Changes

1. Record the capture profile (collection-info.txt)
Previously there was no record of which WPR profile a capture used, so it had to be inferred from the trace provider mix, which is easy to get wrong. For example, a -LogProfile networking capture contains only VfpExt/HNS events and none of the WSL core trace providers, which can send an investigation down the wrong path. The script now writes a collection-info.txt recording the selected LogProfile, the mapped WPRP profile and file, the Dump and RestartWslReproMode switches, and the collection timestamp.
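
As a rough illustration of the metadata file described above, here is a minimal sketch. The function name `Write-CollectionInfo`, its parameter names, and the key/value layout are assumptions made for this example; the actual script may format the file differently.

```powershell
# Hypothetical sketch: the function name and file layout are assumptions,
# not the code from the PR.
function Write-CollectionInfo {
    param(
        [Parameter(Mandatory)] [string]$LogFolder,
        [string]$LogProfile,
        [string]$WprpProfile,
        [string]$WprpFile,
        [switch]$Dump,
        [switch]$RestartWslReproMode
    )

    # One "Key: value" line per setting so the file is easy to read and grep.
    $lines = @(
        "LogProfile:          $LogProfile"
        "WprpProfile:         $WprpProfile"
        "WprpFile:            $WprpFile"
        "Dump:                $([bool]$Dump)"
        "RestartWslReproMode: $([bool]$RestartWslReproMode)"
        "CollectedAt:         $(Get-Date -Format 'o')"
    )

    $path = Join-Path $LogFolder 'collection-info.txt'
    # Request UTF-8 explicitly instead of relying on the host's default encoding.
    Set-Content -Path $path -Value $lines -Encoding UTF8
    return $path
}
```

Writing the file next to logs.etl means the profile travels with the archive, so an analyzer can check it before reading any trace data.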

2. Machine-readable summary.json
A single structured file giving an immediate overview without running any tools: profile, WSL and Windows versions, networking mode, installed distributions and their state, .wslconfig presence, and an inventory of (non-empty) dumps.
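
To make the "inventory of (non-empty) dumps" concrete, the following sketch shows one way such a list could be built and serialized. `Write-SummaryJson` and the `name`/`sizeBytes` fields are hypothetical names chosen for this example and only cover a small part of what the real summary contains.

```powershell
# Hypothetical sketch: function name and JSON field names are assumptions.
function Write-SummaryJson {
    param([Parameter(Mandatory)] [string]$LogFolder)

    # @(...) keeps the inventory an array even when zero or one dump is found,
    # so it always serializes as a JSON list.
    $dumps = @(
        Get-ChildItem -Path $LogFolder -Filter '*.dmp' -File -ErrorAction SilentlyContinue |
            Where-Object { $_.Length -gt 0 } |
            Sort-Object Name |
            ForEach-Object { [ordered]@{ name = $_.Name; sizeBytes = $_.Length } }
    )

    $summary = [ordered]@{
        collectedAt = (Get-Date -Format 'o')
        dumps       = $dumps
    }

    $path = Join-Path $LogFolder 'summary.json'
    # -Depth is raised so the nested dump entries are not flattened to strings.
    $summary | ConvertTo-Json -Depth 4 | Set-Content -Path $path -Encoding UTF8
    return $path
}
```

A consumer can then load the file with `Get-Content -Raw summary.json | ConvertFrom-Json` and inspect the capture without opening any other artifact.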

3. Archive index (README.md)
Describes each artifact in the archive, how to decode logs.etl, and how to recognize when a non-default log profile was used.

4. Collect WSL version and distro state (wsl-info.txt)
Capture wsl --version, wsl --status and wsl --list --verbose, instead of forcing analyzers to infer the version and distro layout from the appx package and registry.

5. Collect guest-side diagnostics (linux_diagnostics.log)
After the repro, collect dmesg, free -m, uptime, ulimit -a, pid_max, threads-max, process/thread counts, and top processes by RSS. This is the data needed to diagnose in-distro failures such as Resource temporarily unavailable (EAGAIN) caused by hitting process, thread, fd, or memory limits, which the default capture never collected. Collection is best-effort: it is skipped when WSL isn't installed or no distro is available.

6. Drop empty dump files
Previously, when MiniDumpWriteDump failed it left a 0-byte .dmp behind, so archives could contain many misleading empty dumps. The script now removes the file on failure.
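
The cleanup-on-failure pattern can be sketched as below. `Invoke-DumpWriter` and its `-Writer` script block are hypothetical stand-ins for whatever code in the script actually calls MiniDumpWriteDump; only the "delete the file if the write failed" step reflects the change described here.

```powershell
# Hypothetical sketch: -Writer stands in for the code that calls
# MiniDumpWriteDump. It receives the target path and returns $true on success.
function Invoke-DumpWriter {
    param(
        [Parameter(Mandatory)] [string]$DumpPath,
        [Parameter(Mandatory)] [scriptblock]$Writer
    )

    $succeeded = $false
    try {
        $succeeded = [bool](& $Writer $DumpPath)
    } catch {
        # Treat an exception from the writer the same as a failed write.
        $succeeded = $false
    }

    # A failed write can leave an empty or partial file behind; remove it so
    # the archive only contains usable dumps.
    if (-not $succeeded -and (Test-Path -LiteralPath $DumpPath)) {
        Remove-Item -LiteralPath $DumpPath -Force
    }
    return $succeeded
}
```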

Also sets WSL_UTF8 and the console output encoding so wsl.exe output is captured to the log files readably.
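
For context, the encoding setup mentioned above might look roughly like this. It is a sketch, not the exact lines from the script; the try/catch around the console call is an addition for hosts that have no attached console.

```powershell
# Ask wsl.exe to emit UTF-8 rather than its default UTF-16LE output.
$env:WSL_UTF8 = '1'

# Tell PowerShell to decode captured native-command output as UTF-8.
# Setting this can fail when no console is attached, so treat it as best-effort.
try {
    [Console]::OutputEncoding = [System.Text.Encoding]::UTF8
} catch {
    Write-Warning "Could not set console output encoding: $_"
}
```

Without both settings, text captured from wsl.exe tends to show up in log files with interleaved NUL characters or mis-decoded non-ASCII names.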

Testing

  • Script parses cleanly ([Parser]::ParseFile).
  • Verified collection-info.txt and summary.json render as expected with real data (and that the summary logic is safe under Set-StrictMode -Version Latest).
  • Verified wsl --version and the guest-diagnostics commands run and produce readable UTF-8 output.
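
The parse check in the first bullet can be reproduced with a small helper along these lines. `Test-ScriptParses` is a hypothetical name; `Parser.ParseFile` is the standard PowerShell language API, and it only reports syntax errors, not runtime problems.

```powershell
# Hypothetical helper: returns $true when the file has no syntax errors.
function Test-ScriptParses {
    param([Parameter(Mandatory)] [string]$Path)

    $tokens = $null
    $errors = $null
    # ParseFile expects a file system path, so resolve relative paths first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    [void][System.Management.Automation.Language.Parser]::ParseFile(
        $fullPath, [ref]$tokens, [ref]$errors)

    foreach ($e in $errors) {
        Write-Warning "Line $($e.Extent.StartLineNumber): $($e.Message)"
    }
    return ($errors.Count -eq 0)
}
```

Running it against diagnostics/collect-wsl-logs.ps1 from the repository root gives a quick syntax gate before exercising the script on a real machine.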

collect-wsl-logs.ps1 did not record which WPR profile was used for a
capture. When analyzing an archive (e.g. a networking-only capture that
lacks the WSL core trace providers), there was no way to tell which
profile produced it without inferring it from the provider mix.

Write a collection-info.txt into the log folder capturing the selected
LogProfile, the mapped WPRP profile and file, the Dump and
RestartWslReproMode switches, and the collection timestamp.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 11, 2026 17:07
@benhillis benhillis requested a review from a team as a code owner June 11, 2026 17:07

Copilot AI left a comment

Contributor

Pull request overview

This PR improves diagnostic trace archives by writing a collection-info.txt file into the collected log folder, capturing which capture/profile settings were used so downstream analysis doesn’t have to infer the profile from the provider mix.

Changes:

  • Add a collection-info.txt metadata block to the log folder describing the selected LogProfile, resolved WPRP profile/file, relevant switches, and timestamp.

Add three further debugging improvements to collect-wsl-logs.ps1:

- Collect wsl --version / --status / --list --verbose into wsl-info.txt
  instead of forcing analyzers to infer the version and distro layout
  from the appx package and registry.
- Collect guest-side state (dmesg, free, uptime, ulimit, pid_max,
  threads-max, process/thread counts, top RSS) into linux_diagnostics.log
  after the repro. This is the data needed to diagnose in-distro failures
  such as 'Resource temporarily unavailable' (EAGAIN) from resource limits.
- Remove 0-byte dump files left behind when MiniDumpWriteDump fails, so
  the archive only contains real dumps.

Also set WSL_UTF8 and the console output encoding so wsl.exe output is
captured to the log files readably.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@benhillis benhillis changed the title from "diagnostics: record capture profile in collected logs" to "diagnostics: improve collect-wsl-logs (capture profile, WSL/guest state, drop empty dumps)" Jun 11, 2026

Make collected log archives easier to analyze (by a human or an agent)
without having to run tools or infer state from individual artifacts:

- summary.json: machine-readable overview of the capture - profile,
  WSL/Windows versions, networking mode, installed distributions and
  their state, .wslconfig presence, and an inventory of non-empty dumps.
- README.md: an index of the archive contents describing each file, plus
  a note on how to decode logs.etl and how to tell when a non-default
  log profile was used.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 11, 2026 17:18
@benhillis benhillis changed the title from "diagnostics: improve collect-wsl-logs (capture profile, WSL/guest state, drop empty dumps)" to "diagnostics: improve collect-wsl-logs for analysis (summary.json, README, profile info, WSL/guest state)" Jun 11, 2026

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread diagnostics/collect-wsl-logs.ps1 Outdated
Comment thread diagnostics/collect-wsl-logs.ps1
Comment thread diagnostics/collect-wsl-logs.ps1 Outdated
Ben Hillis and others added 2 commits June 11, 2026 13:28
…immer summary.json)

- Write wsl-info.txt as UTF-8 instead of the PS5.1 default UTF-16LE.
- Guard every newly-added wsl.exe call (wsl-info and guest diagnostics) with a
  timeout via a background job so a deadlocked service or bad VM state cannot
  hang log collection.
- Drop the duplicated distro-registry enumeration and .wslconfig networkingMode
  parsing from summary.json; that state is readily derived from HKCU.txt and the
  archived .wslconfig.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
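
The background-job timeout guard described in the commit above can be sketched as follows. `Invoke-WithTimeout` is a hypothetical name for illustration; the helper that the commit added is not shown on this page, so this is only one plausible shape for it.

```powershell
# Hypothetical sketch of a timeout guard built on background jobs.
function Invoke-WithTimeout {
    param(
        [Parameter(Mandatory)] [scriptblock]$ScriptBlock,
        [int]$TimeoutSeconds = 30
    )

    $job = Start-Job -ScriptBlock $ScriptBlock
    try {
        # Wait-Job returns the job when it finishes in time and nothing on timeout.
        if (Wait-Job -Job $job -Timeout $TimeoutSeconds) {
            return Receive-Job -Job $job
        }
        Write-Warning "Command timed out after $TimeoutSeconds seconds."
        Stop-Job -Job $job
        return $null
    } finally {
        Remove-Job -Job $job -Force
    }
}
```

A call such as `Invoke-WithTimeout { wsl.exe --version } -TimeoutSeconds 15 | Out-File wsl-info.txt -Encoding utf8` would then bound the wait and also avoid the UTF-16LE default that Windows PowerShell 5.1 uses for Out-File. The trade-off is the startup cost of a job process for every guarded call.
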
Remove the unconditional wsl.exe calls this PR introduced (wsl-info.txt and
linux_diagnostics.log) along with the now-unused timeout helper, per review
feedback that the log collection script should not call wsl.exe (which can hang
if the service is deadlocked or the VM is in a bad state). Pre-existing
networking-profile wsl.exe calls are left untouched.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 11, 2026 21:13

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment on lines +427 to +445
# Keep this summary basic: profile, versions and the dump inventory. State that
# can be readily derived from the rest of the archive (the installed
# distributions in HKCU.txt, the networking mode in .wslconfig) is intentionally
# not duplicated here.
$summary = [ordered]@{
    collectedAt         = (Get-Date -Format "o")
    logProfile          = $logProfileDisplay
    wprpProfile         = $wprpProfileDisplay
    dump                = [bool]$Dump
    restartWslReproMode = [bool]$RestartWslReproMode
    wslVersion          = (Get-Prop $appx "Version")
    windows             = [ordered]@{
        build          = "$(Get-Prop $winCV 'CurrentBuild').$(Get-Prop $winCV 'UBR')"
        displayVersion = Get-Prop $winCV "DisplayVersion"
        edition        = Get-Prop $winCV "EditionID"
    }
    wslConfigPresent    = [bool](Test-Path $wslconfig)
    dumps               = $dumps
}
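
The excerpt above relies on a `Get-Prop` helper whose definition is not shown on this page. A minimal sketch of what such a helper could look like, assuming its purpose is to read a property without throwing under Set-StrictMode -Version Latest when the object or property is missing:

```powershell
Set-StrictMode -Version Latest

# Assumed implementation: the real Get-Prop in the script may differ.
function Get-Prop {
    param(
        $Object,
        [Parameter(Mandatory)] [string]$Name
    )

    # A missing source object (for example, a registry key that was not found)
    # yields $null instead of an error.
    if ($null -eq $Object) { return $null }

    # Indexing PSObject.Properties returns $null for unknown names, which
    # avoids the strict-mode error that $Object.$Name would raise.
    $prop = $Object.PSObject.Properties[$Name]
    if ($null -eq $prop) { return $null }
    return $prop.Value
}
```

With a helper of this shape, a field such as `wslVersion` simply serializes as null when the appx package lookup returned nothing, rather than aborting the whole summary.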