…Ubuntu 24.04

On Ubuntu 24.04, apt-daily-upgrade.timer fires between 06:00 and 07:00 UTC and triggers needrestart (installed by default on 24.04 but not on 22.04), which auto-restarts services whose libraries were updated. This has been observed to SIGKILL VirtualClient mid-run, desynchronizing packed SPEC CPU experiments and invalidating results.

Analysis of ~4,700 Ubuntu 24.04 SYSAUTO VMs showed a ~29% restart rate, with 80% of restarts concentrated at hour 6 UTC and a uniform minute-of-hour distribution across 0-59 (the signature of RandomizedDelaySec=60min on the timer). Ubuntu 22.04, Windows, and focal VMs showed 0% restarts in the same window.

Masking both apt-daily.timer and apt-daily-upgrade.timer (plus their services) as a dependency step at profile startup removes the trigger. The step is filtered to linux-x64,linux-arm64 via SupportedPlatforms.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
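The hour-of-day concentration described in this commit message can be checked with a small histogram over restart timestamps. The sketch below is illustrative only: `restart_hour_histogram` is a hypothetical helper, and it assumes the restart events have already been exported as one ISO 8601 UTC timestamp per line (the actual query and export format are not shown in this PR).

```shell
# Sketch only: the function name and the input format are assumptions.
# Reads ISO 8601 timestamps (YYYY-MM-DDTHH:MM:SSZ) on stdin, one per line,
# and prints "HH count" for every hour that appears, sorted by hour.
restart_hour_histogram() {
    awk -F'[T:]' 'NF >= 3 { count[$2]++ }
                  END { for (h in count) print h, count[h] }' | sort
}
```

A typical call would be `restart_hour_histogram < restarts.txt`, where `restarts.txt` is a hypothetical export file. Counting `$3` instead of `$2` gives the minute-of-hour distribution, which is the part expected to look uniform under RandomizedDelaySec=60min.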
41b7a3c to 3d0e7e0
ericavella approved these changes Apr 23, 2026
AlexWFMS pushed a commit that referenced this pull request Apr 24, 2026
Problem
-------
On Ubuntu 24.04+, the default installation of `needrestart` combined with the `apt-daily-upgrade.timer` (fires daily 06:00-07:00 UTC with RandomizedDelaySec=60min) automatically restarts any service whose shared libraries are updated by unattended upgrades. For long-running Virtual Client workloads this manifests as VC being SIGKILL'd mid-run - observed at ~29% of VMs in CRC SYSAUTO experiments, concentrated at hour 6 UTC (uniform 0-59 minute distribution, matching the apt timer signature).

Web / distro research confirmed this behavior is specific to Ubuntu 24.04+:
- Ubuntu 24.04+: needrestart installed AND auto-restart-on-unattended-upgrade is the default (this is the regression).
- Ubuntu 22.04/22.10/23.04: needrestart installed but list-only in non-interactive mode; not known to cause the issue in the field.
- Debian 11/12: needrestart not installed by default.

Fix
---
Add a best-effort startup hook in ExecuteProfileCommand that:
- runs exactly once per VC invocation, immediately after Platform.Initialize
- is a no-op on non-Unix platforms
- parses the Ubuntu major version out of PRETTY_NAME and only runs on >=24
- masks + stops the four apt-daily units via `bash -c "..."` (double-quoted because .NET Process argument tokenization follows Windows CommandLineToArgvW rules - single quotes do not group)
- swallows any exception so VC startup is never blocked by this mitigation
- logs telemetry (DisabledLinuxAutoUpdates / DisableLinuxAutoUpdatesFailed)

Because the mitigation now runs unconditionally for every profile on the affected OS, the per-profile MaskAptDailyTimers step added to the SPEC CPU and FIO profiles in PR #694 is redundant and has been removed.

Bumped VERSION to 3.1.3.

Tests
-----
Added parameterized coverage for TryGetUbuntuMajorVersion (9 cases, all passing). ExecuteProfileCommandTests: 24/24 pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
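The gating logic in the Fix section (parse the major version out of PRETTY_NAME, act only on >=24, never fail the caller) can be approximated in shell as follows. This is a rough sketch, not the C# in ExecuteProfileCommand: `ubuntu_major_version`, `disable_auto_updates`, and the `SYSTEMCTL` override are hypothetical names introduced here, and telemetry is left out.

```shell
# Sketch only: names below are hypothetical, not Virtual Client code.
APT_DAILY_UNITS="apt-daily.timer apt-daily-upgrade.timer apt-daily.service apt-daily-upgrade.service"

# Print the Ubuntu major version found in a PRETTY_NAME value,
# e.g. "Ubuntu 24.04.4 LTS" prints 24. Fails for any other input.
ubuntu_major_version() {
    case "$1" in
        "Ubuntu "[0-9]*) ;;
        *) return 1 ;;
    esac
    rest=${1#"Ubuntu "}        # drop the distro name
    major=${rest%%[!0-9]*}     # keep only the leading digits
    [ -n "$major" ] || return 1
    printf '%s\n' "$major"
}

# Mask and stop the four apt-daily units on Ubuntu 24 or later.
# Best effort: always returns 0 so the caller is never blocked.
# SYSTEMCTL can be pointed at another program (e.g. echo) for a dry run.
disable_auto_updates() {
    major=$(ubuntu_major_version "$1") || return 0
    [ "$major" -ge 24 ] || return 0
    systemctl_cmd=${SYSTEMCTL:-systemctl}
    # Word splitting of APT_DAILY_UNITS is intentional.
    $systemctl_cmd mask $APT_DAILY_UNITS || true
    $systemctl_cmd stop $APT_DAILY_UNITS || true
    return 0
}
```

On a real host the argument would come from /etc/os-release, for example `. /etc/os-release && disable_auto_updates "$PRETTY_NAME"`, run as root. Setting `SYSTEMCTL=echo` prints the two systemctl invocations instead of executing them.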
Summary
Prepends a MaskAptDailyTimers ExecuteCommand dependency to the four SPEC CPU profiles (FPRATE, FPSPEED, INTRATE, INTSPEED) that masks apt-daily.timer, apt-daily-upgrade.timer, and their services on Linux (x64 + arm64).

Problem
On Ubuntu 24.04, running packed SPEC CPU workloads shows Virtual Client being killed mid-run between 06:00–07:00 UTC, desynchronizing packed workload results. The same experiment configuration on Ubuntu 22.04 does not reproduce.
Root cause chain:
- Ubuntu 24.04 installs the needrestart package by default (22.04 does not). Its default config $nrconf{restart} = 'a' auto-restarts services whose libraries were updated.
- apt-daily-upgrade.timer fires at 06:00 UTC with RandomizedDelaySec=60min.
- needrestart kills VC via SIGTERM/SIGKILL, and systemd restarts it, producing a second Platform.Initialize event.

Evidence
Queried ~4,700 Ubuntu 24.04 VMs against the production WorkloadDiagnostics/JunoStaging clusters:
- …RandomizedDelaySec=60min.
- …Platform.Initialize (killed mid-run, not crash-at-start).

Change
{
  "Type": "ExecuteCommand",
  "Parameters": {
    "Scenario": "MaskAptDailyTimers",
    "SupportedPlatforms": "linux-x64,linux-arm64",
    "Command": "bash -c 'systemctl mask apt-daily.timer apt-daily-upgrade.timer apt-daily.service apt-daily-upgrade.service; systemctl stop apt-daily.timer apt-daily-upgrade.timer apt-daily.service apt-daily-upgrade.service; exit 0'"
  }
}

- SupportedPlatforms filter skips Windows.
- Wrapped in bash -c '... ; exit 0' so it is idempotent and non-fatal on non-Ubuntu Linux distros (CentOS/Suse/etc.) where the units do not exist.

Smoke test (Ubuntu 24.04.4 LTS Azure VM, Standard_D4s_v6)

Ran the exact Command string via az vm run-command:
- needrestart 3.6-7ubuntu4.5 installed; both timers enabled/active; apt-daily-upgrade.timer next-run scheduled at 06:22:56 UTC (matches the observed restart window).
- systemctl is-enabled → masked for all four units; list-timers shows no next-run.
- Created symlink /etc/systemd/system/... -> /dev/null lines.

Not changed
- PERF-SPECJBB.json / PERF-SPECJVM.json / PERF-GPU-SPECVIEW.json / POWER-SPEC*.json: left alone to keep this PR scoped to the four SPEC CPU 2017 profiles where the issue is reported. Can extend later if desired.
- needrestart config itself: masking the timer is a narrower fix than disabling needrestart globally.
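The smoke-test check that every unit reports masked could be scripted along these lines. This is a sketch rather than the procedure actually used: `all_units_masked` is a hypothetical helper that reads the one-state-per-line output of `systemctl is-enabled` from stdin.

```shell
# Sketch only: the helper name is hypothetical.
# Reads unit states on stdin, one per line, and succeeds only if at least
# one line was read and every line is exactly "masked".
all_units_masked() {
    count=0
    while IFS= read -r state; do
        [ "$state" = "masked" ] || return 1
        count=$((count + 1))
    done
    [ "$count" -gt 0 ]
}
```

On a VM this would be fed with `systemctl is-enabled apt-daily.timer apt-daily-upgrade.timer apt-daily.service apt-daily-upgrade.service | all_units_masked`. The helper judges only the printed states, because `systemctl is-enabled` itself exits non-zero for masked units.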