nvme_driver: Drain all queues after restore #2833
Conversation
Pull request overview
This pull request implements a mechanism to drain all NVMe I/O queues after restore by coordinating completion of in-flight commands across multiple queues before allowing new I/O. The implementation uses a three-state machine (`Draining` -> `SelfDrained` -> `AllDrained`) with a shared atomic counter to determine when all queues have finished draining.
Changes:
- Introduced `DrainAfterRestore` enum to track queue draining state during restore
- Modified queue handler to block new I/O commands until all queues finish draining existing commands
- Added coordination mechanism using shared atomic counter and device interrupt signaling
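The three-state machine and shared counter described above might look roughly like the sketch below. The enum name and variant names come from the PR description; the struct, fields, and methods here are illustrative assumptions, not the driver's actual code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Illustrative mirror of the PR's three-state machine.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum DrainAfterRestore {
    /// This queue still has pre-save commands in flight.
    Draining,
    /// This queue's own commands finished, but other queues may still drain.
    SelfDrained,
    /// Every restored queue has drained; new I/O may be accepted.
    AllDrained,
}

struct QueueDrain {
    state: DrainAfterRestore,
    /// Shared count of queues still in `Draining`.
    remaining: Arc<AtomicUsize>,
}

impl QueueDrain {
    /// Called when this queue's last pre-save command completes.
    fn on_self_drained(&mut self) {
        if self.state != DrainAfterRestore::Draining {
            return;
        }
        self.state = DrainAfterRestore::SelfDrained;
        // The queue that takes the counter to zero knows the drain is over.
        // (The real driver signals a device interrupt here so the other
        // queues also observe `AllDrained`; this sketch only updates itself.)
        if self.remaining.fetch_sub(1, Ordering::AcqRel) == 1 {
            self.state = DrainAfterRestore::AllDrained;
        }
    }

    fn accepts_new_io(&self) -> bool {
        self.state == DrainAfterRestore::AllDrained
    }
}

fn main() {
    let remaining = Arc::new(AtomicUsize::new(2));
    let mut q1 = QueueDrain { state: DrainAfterRestore::Draining, remaining: remaining.clone() };
    let mut q2 = QueueDrain { state: DrainAfterRestore::Draining, remaining: remaining.clone() };
    q1.on_self_drained();
    assert!(!q1.accepts_new_io()); // q2 is still draining
    q2.on_self_drained();
    assert!(q2.accepts_new_io()); // the last drainer observes AllDrained
}
```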
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs | Adds DrainAfterRestore state machine, modifies queue handler event loop to support draining mode, updates restore logic to mark empty queues as drained |
| vm/devices/storage/disk_nvme/nvme_driver/src/driver.rs | Integrates DrainAfterRestore into queue restoration, creates shared drain coordinator for I/O queues, passes appropriate drain state to admin and prototype queues |
mattkur left a comment
Nifty. You asked for feedback on the approach. I think it looks good to me. I left a few nit comments (though I imagine the things I commented on are already on your radar). Thanks for this, Alex!
I think regression testing is "good enough" for this fix, but I hope you can also find a way to prove whether or not there is a real bug here.
Force-pushed from 4eb5369 to 7a5b3ea
```rust
Request(Req),
Command(Cmd),
Completion(spec::Completion),
NoOp,
```
nit: How about naming this DrainComplete (or something like that)?
Feel free to take/ignore the nit comments. Overall logic looks pretty sound to me! (Appreciate the detailed comments btw)
mattkur left a comment
I can only find nits. I think this looks good. Nice work, and a tricky area. Thanks Alex!
```rust
.iter()
.filter(|q| !q.queue_data.handler_data.pending_cmds.commands.is_empty())
.count();
tracing::info!(nonempty_queues, "drain-after-restore initialization");
```
nit, please include ?pci_id as a field.
```rust
let old_counter = counter.fetch_sub(1, Ordering::AcqRel);
if old_counter == 1 {
    signal.signal_uncached();
    tracing::info!(
```
minor, include pci_id here (unless this is really painful to plumb; can happen in a separate PR later)
Addressed nits, thanks Matt & Guramrit!
```
w1 | w1' c1' w2 c2 c1
```

Guest sends a write request w1. Then we do servicing. Then storvsp duplicates the request into w1', which completes at c1'. If we allow the queue to run freely (i.e. accept new guest requests), the guest can at this point issue another write w2, which completes at c2; only then does the original pre-save write complete at c1, overwriting what is, as far as the guest is concerned, the "latest" write w2. That's the reason we stop the queue until all pre-save requests complete.

But with multiple queues, if we drain one queue and start accepting new guest requests on it while other queues are still draining, we can hit the same race. Therefore, with this PR, all queues are drained before new guest requests are accepted.

(cherry picked from commit c1541ef)
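The timeline above can be replayed in a few lines: the disk ends up holding whatever data the last completion carried, so letting the stale c1 land after c2 silently loses the guest's newest write. The function and values below are illustrative only, not driver code.

```rust
/// Applies write completions in arrival order; the disk ends up holding
/// whatever data the last completion carried.
fn final_contents(completions: &[u32]) -> u32 {
    let mut disk = 0;
    for &data in completions {
        disk = data;
    }
    disk
}

fn main() {
    let (w1, w2) = (1u32, 2u32);
    // Racy order from the description: c1' (= w1), then c2 (= w2), then the
    // stale pre-save c1 (= w1). The guest's latest write w2 is lost.
    assert_eq!(final_contents(&[w1, w2, w1]), w1);
    // Drained order: all pre-save completions land before w2 is admitted.
    assert_eq!(final_contents(&[w1, w1, w2]), w2);
}
```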
Clean cherry pick of PR #2833.

Co-authored-by: Alex Landau <alexlandau@microsoft.com>
Backported to release/1.7.2511 in #2851
In #2833, saved queues (eager and proto) were given the correct drain state, but new queues were given the `AllDrained` state, meaning they could accept I/O before all existing queues had drained. This could happen if I/O arrives during draining on a CPU (and thus queue) that had never seen I/O before the save. This PR fixes the race by giving new queues the correct drain state: `AllDrained` if all existing queues have already drained, or `SelfDrained` (like proto queues, which have no I/O pending on them) otherwise.
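The state selection for new queues can be sketched as below. This is a hypothetical helper, not the PR's actual code; the enum variants come from the PR description, while the function name and signature are assumptions for illustration.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
#[allow(dead_code)]
enum DrainAfterRestore {
    Draining,
    SelfDrained,
    AllDrained,
}

/// A queue created after restore has no pre-save I/O of its own, so it is at
/// least `SelfDrained`; it may only start as `AllDrained` once no restored
/// queue is still draining.
fn drain_state_for_new_queue(queues_still_draining: usize) -> DrainAfterRestore {
    if queues_still_draining == 0 {
        DrainAfterRestore::AllDrained
    } else {
        DrainAfterRestore::SelfDrained
    }
}

fn main() {
    assert_eq!(drain_state_for_new_queue(0), DrainAfterRestore::AllDrained);
    // With queues still draining, a new queue must wait like everyone else.
    assert_eq!(drain_state_for_new_queue(3), DrainAfterRestore::SelfDrained);
}
```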
Clean cherry pick of PR #2864.