nvme_driver: Set correct drain state for new queues (#2864) by alandau · Pull Request #2867 · microsoft/openvmm

alandau · 2026-03-02T21:07:13Z

Clean cherry pick of PR #2864

In #2833, saved queues (eager and proto) were given the correct drain state, but new queues were given the AllDrained state meaning they could race before all existing queues have drained. This could happen if IO is received during draining on a CPU (thus queue) that has never seen IO before the save.

This PR fixes the race by giving new queues the correct drain state - either AllDrained if all existing queues have drained already, or SelfDrained, like proto queues (no IO pending on them) if not.

In microsoft#2833, saved queues (eager and proto) were given the correct drain state, but new queues were given the `AllDrained` state meaning they could race before all existing queues have drained. This could happen if IO is received during draining on a CPU (thus queue) that has never seen IO before the save. This PR fixes the race by giving new queues the correct drain state - either `AllDrained` if all existing queues have drained already, or `SelfDrained`, like proto queues (no IO pending on them) if not. (cherry picked from commit 831eb3d)

Copilot

Pull request overview

Fixes a restore-time race in the NVMe driver’s “drain-after-restore” barrier by ensuring newly created I/O queues (created after restore, not from saved state) receive an appropriate drain state instead of incorrectly starting in AllDrained.

Changes:

Add DrainAfterRestoreBuilder::{is_drain_complete, new_for_new_queue} to pick the correct drain state for newly created queues during an active drain.
Thread an explicit DrainAfterRestore into QueuePair::new so new queues don’t implicitly default to AllDrained.
Store a shared DrainAfterRestoreBuilder in the worker during restore so new queues participate in the drain barrier until completion.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs	Adds builder helpers to determine drain completion and compute the correct drain state for brand-new queues; updates `QueuePair::new` to accept an explicit drain state.
vm/devices/storage/disk_nvme/nvme_driver/src/driver.rs	Persists the drain builder across restore so newly created queues can block on draining; uses `new_for_new_queue()` when creating new I/O queues.

alandau requested a review from a team as a code owner March 2, 2026 21:07

Copilot AI review requested due to automatic review settings March 2, 2026 21:07

alandau requested a review from a team as a code owner March 2, 2026 21:07

github-actions bot added the release_1.7.2511 Targets the release/1.7.2511 branch. label Mar 2, 2026

Copilot started reviewing on behalf of alandau March 2, 2026 21:08 View session

alandau enabled auto-merge (squash) March 2, 2026 21:08

mattkur approved these changes Mar 2, 2026

View reviewed changes

Copilot AI reviewed Mar 2, 2026

View reviewed changes

alandau merged commit 5bc290d into microsoft:release/1.7.2511 Mar 2, 2026
56 checks passed

alandau deleted the cherrypick/release/1.7.2511/pr-2864 branch March 2, 2026 23:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nvme_driver: Set correct drain state for new queues (#2864)#2867

nvme_driver: Set correct drain state for new queues (#2864)#2867
alandau merged 1 commit intomicrosoft:release/1.7.2511from
alandau:cherrypick/release/1.7.2511/pr-2864

alandau commented Mar 2, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

alandau commented Mar 2, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants