Conversation
|
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Code Review
This pull request introduces the capability to simulate NUMA nodes in QEMU for external tests. The changes involve adding a NumaNodes option and plumbing it through various layers of the test harness and platform configuration. The core logic for generating QEMU arguments for NUMA is in mantle/platform/qemu.go.
I've identified a couple of issues in the implementation within mantle/platform/qemu.go that could lead to incorrect resource allocation for the simulated NUMA nodes. My review includes suggestions to fix these bugs.
ravanelli
left a comment
There was a problem hiding this comment.
Great job, it is looking good, let's squash the commits and follow the suggestion to simply it with using on/off with 2 NUMA nodes
52601f4 to
774b30c
Compare
|
/test all |
5590322 to
aafd091
Compare
mantle/platform/qemu.go
Outdated
|
|
||
| ret = append(ret, "-object", fmt.Sprintf("memory-backend-memfd,id=%s,size=%dM,share=on", node0MemoryDevice, node0Memory)) | ||
| ret = append(ret, "-object", fmt.Sprintf("memory-backend-memfd,id=%s,size=%dM,share=on", node1MemoryDevice, memoryMiB-node0Memory)) | ||
| ret = append(ret, "-numa", fmt.Sprintf("node,memdev=%s,cpus=%d-%d,nodeid=0", node0MemoryDevice, 0, node0Cpus-1)) |
There was a problem hiding this comment.
What does 0, does here? That the second does not need it?
There was a problem hiding this comment.
I'm guessing you used this doc? https://www.qemu.org/docs/master/system/invocation.html#cmdoption-smp
Can you add the link for it, so we can understand what this code is doing?
Also, why you did not used the socket option? -numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
There was a problem hiding this comment.
That is the docs I used, yes. Should I add the link as a comment in the codebase, or just to this PR ?
There was no big reason why I didnt use the socket flag. Would you prefer it be used when defining the nodes ?
The 0 is just the starting CPU for node0. The reason it isn't there for the other node is that they are both being given different CPUs. Right now, if your machine has 4 cpus, node0 would be given cpus=0-1 and node1 would be given cpus=2-3.
Maybe it would simplify the code if we hard coded it to be 2 cpus, one for each node, or even simpler if we gave node 0 one cpu, and node 1 only has memory.
There was a problem hiding this comment.
Should I add the link as a comment in the codebase
Yes, add it as a comment so, everyone can understand what it is doing
The 0 is just the starting CPU for node0. The reason it isn't there for the other node is that they are both being given different CPUs. Right now, if your machine has 4 cpus, node0 would be given cpus=0-1 and node1 would be given cpus=2-3.
And what if we have 128 cpus? Where is set the limit?
For for the approach bellow, we do have the limit. It seems to be a more precise way.
-smp 8,sockets=2,cores=2,threads=2,maxcpus=8
Maybe we do need to set sockets and such, but at least a limit like -smp 4 or even 8, something like:
ret = append(ret, "-smp", "8")
ret = append(ret, "-numa", fmt.Sprintf("node,memdev=%s,cpus=%d-%d,nodeid=0", node0MemoryDevice, 0, node0Cpus-1))
There was a problem hiding this comment.
umm, to be honestly, that's maybe not worth since we arealdy pass the cpu as part of the qemu test/emulation, yeah, we can skip the -smp
There was a problem hiding this comment.
Currently, the -smp flag is currently being set to whatver integer builder.Processors is set to. If this has not been defined, then it will be set to 2 by default for NUMA tests, and to 1 if it is not a NUMA test. So by default, its just 2 CPUs.
If builder.Processors has been set to be more than 2 processors, then the nodes will try and split them evenly.
ravanelli
left a comment
There was a problem hiding this comment.
@angelcerveraroldan can you also create the actually Numa test to be added in the kola tests? So we can validate the test?
e236364 to
1758cc5
Compare
Add a new boolean flag "numaNodes" allowing QEMU simulation of two NUMA nodes.
1758cc5 to
25c7dab
Compare
ravanelli
left a comment
There was a problem hiding this comment.
Nice job! Thanks for working on it!
/lgtm
Add an option for external tests to be executed by a machine with multiple numa nodes.