Patch kubevirt with hotplug detach deadlock and PCI passthrough fixes #17689

Open
maxwmsft wants to merge 2 commits into microsoft:3.0-dev from maxwmsft:kubevirt-hotplug-fixes
@maxwmsft

Adds three patches to kubevirt addressing hotplug-volume-related issues observed
in production on KubeVirt v1.7.1.

Patches

  • Patch17 — virt-handler PCI passthrough (Release 8): fix a VM with a PCI
    hostdev failing to restart after a block volume is hotplugged. The cgroup v2
    device rule rebuild dropped device-plugin nodes (/dev/vfio/*, /dev/bus/usb/*).
    Originally proposed in #17140 (Kubevirt GPU/PCI passthrough patch, Woojoong
    Kim); included here so that #17140 can be closed in favor of this PR.
  • Patch18 — virt-handler mountFromPod() (Release 9): skip mounting hotplug
    volumes no longer in VMI.Spec.Volumes. Previously the reconcile loop
    re-mounted a removed volume every cycle, recreating its block device so
    Unmount() could never clean it up and IsMounted() never returned false.
    The phase stayed at Ready/MountedToPod, so virt-controller never deleted
    the attachment pod → deadlock. With the skip, Unmount() cleans up, the phase
    advances to UnMountedFromPod, and the attachment pod is deleted.
  • Patch19 — virt-controller cleanupAttachmentPods() (Release 9): only keep
    an old Running attachment pod as a fallback if it still holds a volume worth
    preserving (in-spec, or in a deletion-blocking phase). Previously any old
    Running pod was kept, causing cross-VMI RWO PVC deadlocks during reshuffling.
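The skip that Patch18 describes can be illustrated with a minimal Go sketch. This is not the patch itself: `Volume`, `HotplugStatus`, `specVolumeNames`, and `volumesToMount` are hypothetical, simplified stand-ins, and the real mountFromPod() works on KubeVirt API types. The sketch only shows the filtering idea, assuming the reconcile loop iterates over hotplug volume statuses and can look up the current spec by volume name.

```go
package main

import "fmt"

// Volume is a hypothetical, simplified stand-in for an entry in VMI.Spec.Volumes.
type Volume struct {
	Name string
}

// HotplugStatus is a hypothetical, simplified stand-in for a hotplug volume
// status entry that the reconcile loop iterates over.
type HotplugStatus struct {
	Name string
}

// specVolumeNames collects the names of all volumes still present in the spec.
func specVolumeNames(spec []Volume) map[string]struct{} {
	names := make(map[string]struct{}, len(spec))
	for _, v := range spec {
		names[v.Name] = struct{}{}
	}
	return names
}

// volumesToMount returns, in status order, the hotplug volumes that should be
// (re)mounted. A volume that was removed from the spec is skipped, so a later
// unmount step can clean it up instead of racing with a fresh mount.
func volumesToMount(spec []Volume, statuses []HotplugStatus) []string {
	inSpec := specVolumeNames(spec)
	var mount []string
	for _, s := range statuses {
		if _, ok := inSpec[s.Name]; !ok {
			continue // removed from spec: do not recreate its block device
		}
		mount = append(mount, s.Name)
	}
	return mount
}

func main() {
	spec := []Volume{{Name: "data-a"}}
	statuses := []HotplugStatus{{Name: "data-a"}, {Name: "data-b"}}
	fmt.Println(volumesToMount(spec, statuses)) // prints [data-a]
}
```

Here `data-b` still has a status entry but is gone from the spec, so it is left out of the mount set and only the unmount path touches it.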

Patches 18 and 19 include unit tests adapted to v1.7.1. With all three patches
applied, the spec Release is 9.
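The retention rule that Patch19 describes for cleanupAttachmentPods() can be sketched in Go as follows. This is an illustration under stated assumptions, not the patched code: `AttachmentPod` and `keepAsFallback` are hypothetical names, and the set of deletion-blocking phases is not listed in this description, so the sketch takes it as an input map rather than guessing it.

```go
package main

import "fmt"

// AttachmentPod is a hypothetical, simplified view of a hotplug attachment pod.
type AttachmentPod struct {
	Name    string
	Running bool
	Volumes []string // names of the hotplug volumes the pod holds
}

// keepAsFallback reports whether an old attachment pod should be retained.
// It is kept only if it is Running and still holds at least one volume that
// is either in the VMI spec or in a deletion-blocking phase. Which phases
// block deletion is an assumption left to the caller via deletionBlocked.
func keepAsFallback(pod AttachmentPod, inSpec, deletionBlocked map[string]bool) bool {
	if !pod.Running {
		return false
	}
	for _, name := range pod.Volumes {
		if inSpec[name] || deletionBlocked[name] {
			return true
		}
	}
	return false // nothing worth preserving: safe to delete and release the PVC
}

func main() {
	inSpec := map[string]bool{"data-a": true}
	blocked := map[string]bool{"data-c": true}
	pods := []AttachmentPod{
		{Name: "hp-old-1", Running: true, Volumes: []string{"data-a"}},
		{Name: "hp-old-2", Running: true, Volumes: []string{"data-b"}},
		{Name: "hp-old-3", Running: true, Volumes: []string{"data-c"}},
	}
	for _, p := range pods {
		fmt.Printf("%s keep=%t\n", p.Name, keepAsFallback(p, inSpec, blocked))
	}
}
```

In this example `hp-old-2` holds only a volume that is neither in spec nor blocking deletion, so it is not kept. Under the previous behavior it would have been retained simply for being Running, which is how an RWO PVC could stay pinned while another VMI waited for it.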

Testing

  • %prep applies all patches cleanly via %autosetup -p1.
  • Full rpmbuild -ba succeeds in an AzureLinux 3.0 container (golang 1.26.4);
    all 12 sub-RPMs produced, built against the canonical Source0 (sha256 matches
    kubevirt.signatures.json).
  • The fixes are confirmed compiled into the virt-handler and virt-controller
    binaries.
  • Unit tests pass: pkg/virt-handler/hotplug-disk 42/42,
    pkg/virt-controller/watch/vmi 227/227.

woojoong88 and others added 2 commits June 10, 2026 21:43
Add two virt-handler/virt-controller patches addressing hotplug volume
detach deadlocks observed in production on KubeVirt v1.7.1 (ICM
21000001017910 and 21000001021380):

- 0002: virt-handler mountFromPod() skips mounting volumes no longer in
  VMI spec, so it stops resurrecting the block device of a removed
  volume each reconcile. Unmount() can then clean it up, IsMounted()
  returns false, and updateHotplugVolumeStatus() advances the phase to
  UnMountedFromPod, letting virt-controller delete the attachment pod.
- 0003: virt-controller cleanupAttachmentPods() only keeps an old
  Running attachment pod as fallback if it still holds volumes worth
  preserving (in-spec, or in a deletion-blocking phase), avoiding
  cross-VMI RWO deadlocks during PVC reshuffling.

Both patches include unit tests adapted to v1.7.1. Bumps Release to 9.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@maxwmsft maxwmsft requested a review from a team as a code owner June 11, 2026 07:30
@microsoft-github-policy-service microsoft-github-policy-service Bot added the Packaging and 3.0-dev (PRs Destined for AzureLinux 3.0) labels Jun 11, 2026