Patch kubevirt with hotplug detach deadlock and PCI passthrough fixes#17689
Open
maxwmsft wants to merge 2 commits into
Add two virt-handler/virt-controller patches addressing hotplug volume detach deadlocks observed in production on KubeVirt v1.7.1 (ICM 21000001017910 and 21000001021380):

- 0002: virt-handler mountFromPod() skips mounting volumes no longer in VMI spec, so it stops resurrecting the block device of a removed volume each reconcile. Unmount() can then clean it up, IsMounted() returns false, and updateHotplugVolumeStatus() advances the phase to UnMountedFromPod, letting virt-controller delete the attachment pod.
- 0003: virt-controller cleanupAttachmentPods() only keeps an old Running attachment pod as fallback if it still holds volumes worth preserving (in-spec, or in a deletion-blocking phase), avoiding cross-VMI RWO deadlocks during PVC reshuffling.

Both patches include unit tests adapted to v1.7.1. Bumps Release to 9.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds three patches to kubevirt addressing hotplug-volume-related issues observed
in production on KubeVirt v1.7.1.
Patches
- hostdev failing to restart after a hotplug block volume: the cgroup v2 device rule rebuild dropped device-plugin nodes (/dev/vfio/*, /dev/bus/usb/*). Originally proposed in Kubevirt GPU/PCI passthrough patch #17140 (Woojoong Kim); included here so that #17140 can be closed in favor of this PR.
- mountFromPod() (Release 9): skip mounting hotplug volumes no longer in VMI.Spec.Volumes. Previously the reconcile loop re-mounted a removed volume every cycle, recreating its block device so Unmount() could never clean it up and IsMounted() never returned false. The phase stayed at Ready/MountedToPod, so virt-controller never deleted the attachment pod, which produced the deadlock. With the skip, Unmount() cleans up, the phase advances to UnMountedFromPod, and the attachment pod is deleted.
- cleanupAttachmentPods() (Release 9): only keep an old Running attachment pod as a fallback if it still holds a volume worth preserving (in-spec, or in a deletion-blocking phase). Previously any old Running pod was kept, causing cross-VMI RWO PVC deadlocks during reshuffling.
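To illustrate the mountFromPod() change, here is a minimal, self-contained sketch of the skip logic. It is not the patch itself: the volumeStatus type and the volumesToMount helper are hypothetical stand-ins for the KubeVirt types, and the sketch only assumes that each hotplug volume in the VMI status is checked against the names still present in VMI.Spec.Volumes.

```go
package main

import "fmt"

// volumeStatus is a hypothetical, simplified stand-in for a VMI volume
// status entry; the real KubeVirt type carries far more fields.
type volumeStatus struct {
	Name    string
	Hotplug bool
}

// volumesToMount returns the names of hotplug volumes that should still be
// mounted: those that remain listed in the spec. Volumes removed from the
// spec are skipped so that a later unmount pass can clean them up instead
// of having their block device recreated on every reconcile.
func volumesToMount(specVolumes []string, statuses []volumeStatus) []string {
	inSpec := make(map[string]struct{}, len(specVolumes))
	for _, name := range specVolumes {
		inSpec[name] = struct{}{}
	}

	var out []string
	for _, st := range statuses {
		if !st.Hotplug {
			continue // not a hotplug volume, handled elsewhere
		}
		if _, ok := inSpec[st.Name]; !ok {
			continue // removed from spec: do not re-mount
		}
		out = append(out, st.Name)
	}
	return out
}

func main() {
	spec := []string{"rootdisk", "data-a"}
	statuses := []volumeStatus{
		{Name: "rootdisk", Hotplug: false},
		{Name: "data-a", Hotplug: true},
		{Name: "data-b", Hotplug: true}, // detached: no longer in spec
	}
	fmt.Println(volumesToMount(spec, statuses)) // prints [data-a]
}
```

In this sketch data-b is never re-mounted, which corresponds to the condition that lets Unmount() finish and the phase move on to UnMountedFromPod.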
Patches 18 and 19 include unit tests adapted to v1.7.1. The spec Release ends at 9.
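The keep-or-delete decision described for cleanupAttachmentPods() can be sketched in the same way. The podVolume type, the keepOldPod helper, and the choice of Ready and MountedToPod as the deletion-blocking phases are assumptions made for illustration; the actual patch may use a different set of phases and operates on real pod and VMI objects.

```go
package main

import "fmt"

// podVolume is a hypothetical, simplified view of a volume held by an
// attachment pod together with its current hotplug phase.
type podVolume struct {
	Name  string
	Phase string
}

// deletionBlocking lists phases assumed (for this sketch only) to mean the
// volume is still mounted and the pod must not be deleted yet.
var deletionBlocking = map[string]bool{
	"Ready":        true,
	"MountedToPod": true,
}

// keepOldPod reports whether an old attachment pod should be kept as a
// fallback: it must be Running and hold at least one volume that is either
// still in the spec or in a deletion-blocking phase.
func keepOldPod(running bool, vols []podVolume, inSpec map[string]bool) bool {
	if !running {
		return false
	}
	for _, v := range vols {
		if inSpec[v.Name] || deletionBlocking[v.Phase] {
			return true
		}
	}
	return false
}

func main() {
	inSpec := map[string]bool{"data-a": true}

	// Still in spec: keep.
	fmt.Println(keepOldPod(true, []podVolume{{Name: "data-a", Phase: "Ready"}}, inSpec))
	// Removed from spec and already unmounted: safe to delete.
	fmt.Println(keepOldPod(true, []podVolume{{Name: "data-b", Phase: "UnMountedFromPod"}}, inSpec))
	// Removed from spec but still mounted: keep until the unmount completes.
	fmt.Println(keepOldPod(true, []podVolume{{Name: "data-b", Phase: "MountedToPod"}}, inSpec))
}
```

The second case is the one the description targets: an old Running pod that holds nothing worth preserving is no longer kept, so the RWO PVC it was holding can be released.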
Testing
- %prep applies all patches cleanly via %autosetup -p1.
- rpmbuild -ba succeeds in an AzureLinux 3.0 container (golang 1.26.4); all 12 sub-RPMs produced, built against the canonical Source0 (sha256 matches kubevirt.signatures.json).
- virt-handler and virt-controller binaries.
- pkg/virt-handler/hotplug-disk 42/42, pkg/virt-controller/watch/vmi 227/227.