@bmah888 I have changed the PR to use The UDP throughput enhancement achieved for high-throughput interfaces is quite substantial, especially when the UDP messages are not large (which is usually the case). |
|
@bmah888 Any plan to have this PR reviewed and merged? |
i = 0; /* count of messages sent */
r = 0; /* total bytes sent */
while (i < sp->sendmmsg_buffered_packets_count) {
    j = sendmmsg(sp->socket, &sp->msg[i], sp->sendmmsg_buffered_packets_count - i, MSG_DONTWAIT);
Before each sendmmsg(2) call, you should poll for socket write readiness. In my tests, this can significantly improve performance.
Can you add more details:
- How do you do the polling?
- From your experience, what should be done if the socket is not ready for writing? The current design of the code is that the function does not return before everything was sent (or there was an error). Are you suggesting that, when the socket is not ready for writing, the function should return successfully without sending anything, or before everything was sent?
- Do you understand why the method you suggest improves performance? I am asking because iperf3 will retry sending in any case.
Thanks
Do you understand why the method you suggest improves performance?
In my case, I was doing raw syscalls in my Go program. The UDP socket opened by the Go runtime is in non-blocking mode. With sendmsg(2), if the sending operation was going to block, sendmsg(2) would return -EAGAIN or -EWOULDBLOCK, which is handled by the Go runtime to poll for socket write readiness with epoll. The calling goroutine can then be parked by the runtime to free the OS thread. (My limited understanding of Go internals might be inaccurate.)
Now with sendmmsg(2), according to the manual, a nonblocking call sends as many messages as possible (up to the limit specified by vlen) and returns immediately. By treating a non-complete return value the same way as -EAGAIN and -EWOULDBLOCK, that is, instead of immediately calling sendmmsg(2) again, I instructed the Go runtime to poll for write readiness before the next sendmmsg(2) call. This change yielded a 10% increase in throughput.
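The same idea can be sketched in C. This is only an illustration of the approach described above, not code from the PR: the helper name `send_all_with_poll` is hypothetical, it assumes Linux with glibc (hence `_GNU_SOURCE`), and it assumes the caller wants to wait until every message has been accepted. A short return from `sendmmsg(2)` is treated like `EAGAIN`, and `poll(2)` waits for `POLLOUT` before the next attempt.

```c
#define _GNU_SOURCE            /* needed for sendmmsg() and struct mmsghdr */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of iperf3): send `count` messages with
 * non-blocking sendmmsg() calls, waiting for write readiness whenever the
 * kernel accepts fewer messages than requested.
 * Returns 0 when all messages were sent, -1 on error (errno is set). */
static int send_all_with_poll(int fd, struct mmsghdr *msgs, unsigned int count)
{
    unsigned int sent = 0;

    assert(msgs != NULL);
    while (sent < count) {
        int n = sendmmsg(fd, &msgs[sent], count - sent, MSG_DONTWAIT);
        if (n < 0) {
            if (errno == EINTR)
                continue;                      /* interrupted: just retry */
            if (errno != EAGAIN && errno != EWOULDBLOCK)
                return -1;                     /* real error */
            n = 0;                             /* nothing accepted this time */
        }
        sent += (unsigned int)n;

        if (sent < count) {
            /* Short return: wait until the socket is writable again
             * instead of calling sendmmsg() in a tight loop. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            int r;
            do {
                r = poll(&pfd, 1, -1);
            } while (r < 0 && errno == EINTR);
            if (r < 0)
                return -1;
        }
    }
    return 0;
}
```

Whether this helps in iperf3 would have to be measured; the 10% figure above comes from a Go program, not from iperf3.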
The current design of the code is that the function does not return before all was sent (or there was an error).
I'm not familiar with iperf3's code base. I just read some code, and it seems to me that iperf3 uses sockets in blocking mode for UDP tests. In this case, it may be better to simply drop the MSG_DONTWAIT flag; sendmmsg(2) would then return only when all messages have been sent. This saves even more syscall overhead.
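A minimal sketch of the blocking variant, assuming a blocking socket on Linux with glibc. The helper name `send_burst_blocking` is hypothetical and not taken from iperf3. With `flags` set to 0 the call blocks until the requested messages are sent, but it can still return a short count if an error occurs after the first message, so the return value is checked.

```c
#define _GNU_SOURCE            /* needed for sendmmsg() and struct mmsghdr */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of iperf3): send a burst on a blocking
 * socket without MSG_DONTWAIT.
 * Returns the number of messages sent, or -1 if none could be sent. */
static int send_burst_blocking(int fd, struct mmsghdr *msgs, unsigned int count)
{
    unsigned int sent = 0;

    assert(msgs != NULL);
    while (sent < count) {
        /* flags == 0: the call blocks instead of returning early. */
        int n = sendmmsg(fd, &msgs[sent], count - sent, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;                      /* interrupted: retry */
            return sent > 0 ? (int)sent : -1;  /* report partial progress */
        }
        sent += (unsigned int)n;
    }
    return (int)sent;
}
```

In the common case the loop body runs once, so a whole burst costs a single system call.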
@database64128, thanks a lot for the detailed explanation.
I will have to check how easy it is to implement this. To minimize iperf3 design changes, the approach I took for sendmmsg is to accumulate the packets iperf3 is sending and send them in bursts using sendmmsg. It may be that, instead of the loop, sendmmsg can be called once. In that case, all the packets that were not sent can either be moved to the beginning of the buffer or ignored. (The issue with ignoring them is that the packets are numbered, so the numbering of new packets should start from the last packet that was sent successfully.)
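The "move to the beginning of the buffer" option could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's code: `keep_unsent` and `reverse_range` are hypothetical names, and the sketch assumes each `struct mmsghdr` slot points at its own payload buffer. For that reason it rotates the array instead of overwriting entries, so the unsent headers move to the front in order and the already-sent headers (whose buffers can be reused) end up at the tail.

```c
#define _GNU_SOURCE            /* needed for struct mmsghdr */
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Reverse the entries in the half-open range [lo, hi). */
static void reverse_range(struct mmsghdr *m, unsigned int lo, unsigned int hi)
{
    while (lo + 1 < hi) {
        struct mmsghdr tmp = m[lo];
        hi--;
        m[lo] = m[hi];
        m[hi] = tmp;
        lo++;
    }
}

/* Hypothetical helper (not part of iperf3): after sendmmsg() accepted
 * `sent` of `count` buffered messages, rotate the array left by `sent`.
 * Returns the number of messages that are still waiting to be sent. */
static unsigned int keep_unsent(struct mmsghdr *msgs, unsigned int count,
                                unsigned int sent)
{
    assert(msgs != NULL);
    assert(sent <= count);

    /* Rotation by three reversals: [S | U] -> [U | S]. */
    reverse_range(msgs, 0, sent);
    reverse_range(msgs, sent, count);
    reverse_range(msgs, 0, count);

    return count - sent;
}
```

New packets would then be appended starting at index `count - sent`. The retained packets keep their original sequence numbers, which sidesteps the numbering issue mentioned above.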
Any update on this? Improved UDP send/recv performance on iperf would be very helpful!
|
I have rebased using version 3.20+. However, the previous version was pre-3.16 and did not use threads for sending/receiving the data. Now, since threads are used, the CPU is the limiting factor on my computer, and I get the same results for UDP with or without |
Version of iperf3 (or development branch, such as master or 3.1-STABLE) to which this pull request applies: 3.10.1 latest master
Issues fixed (if any):
UDP throughput issue #873
Brief description of code changes (suitable for use as a commit message):
Add sendmmsg support for sending UDP messages for enhanced throughput. sendmmsg is used by setting the -Z option (which is currently used only for TCP), as it is regarded as UDP's alternative to TCP's zero copy. The number of packets sent by each call to sendmmsg is the burst size set by the -b option.
Note:
- configure.ac was changed, so running bootstrap.sh; configure is required for the changes to take effect. (New defines are HAVE_SENDMMSG, HAVE_RECVMMSG and HAVE-SEND_RECVMMSG.)
- recvmmsg is not used because tests showed that it does not help throughput and may even hurt it. However, the changes for testing recvmmsg are commented out in iperf_udp_recv() rather than removed, in case further evaluation is desired. If it is not, all changes to iperf_udp_recv can be removed.