net: Revert unconditional tcp nodelay#30470
Open
StephanDollberg wants to merge 3 commits into
Open
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Reverts the previously-introduced unconditional TCP_NODELAY behavior in net::base_transport (which impacted all clients, including internal RPC) and limits TCP_NODELAY enablement to the HTTP client path to avoid the observed throughput regression.
Changes:
- Remove unconditional
fd.set_nodelay(true)fromnet::base_transport::do_connect. - Add
net::base_transport::set_nodelay(bool)to allow opt-in socket configuration after connection establishment. - Enable
TCP_NODELAYfor HTTP connections by callingset_nodelay(true)afterget_connected()succeeds.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/v/net/transport.h |
Adds set_nodelay(bool) to the base transport interface for opt-in use by specific clients. |
src/v/net/transport.cc |
Removes unconditional TCP_NODELAY during connect and implements base_transport::set_nodelay. |
src/v/http/client.cc |
Opts HTTP client connections into TCP_NODELAY after establishing a connected socket. |
travisdowns
reviewed
May 19, 2026
|
|
||
| void set_keepalive_parameters(const ss::net::keepalive_params& params); | ||
| void set_keepalive(bool); | ||
| void set_nodelay(bool); |
travisdowns
previously approved these changes
May 19, 2026
1aeea6d to
d7759a2
Compare
This reverts commit f3169e6. The commit set TCP_NODELAY on the base client and hence it affected all forms of clients. This includes the internal RPC client. In CPD we saw a throughput performance regression: ``` RedPandaOpenMessagingBenchmarkPerf pub50pct -1% RpkBenchmarkPerf requests_per_sec -7% ``` Looking at metrics it becomes clear why. Avg reactor util is up ~5%. recvfrom syscalls are up ~20%. The delayed sending causes implicit batching on the TCP level when multiple append entry requests are in flight at the same time. This reduces recv calls on the followers and probably also has higher efficiency at the application level (even if the amount of actual append entry requests is unchanged or even increased). While this sort of implicit reliance on NODELAY isn't great the performance impact can't be neglected. Hence revert. Longer team we'd probably want to replace this with more explicit batching.
After previously enabling it unconditionally and then having to revert it in 55d7346 this patch adds the API such that sub-classes can enable it selectively.
HTTP doesn't allow for batching in the protocol (no one uses pipelining) and hence it's effectively always bad to use it for HTTP especially where client code might have nagle-bad patterns like send(headers) -> send(body).
d7759a2 to
eba85eb
Compare
Collaborator
CI test resultstest results on build#84873
|
travisdowns
approved these changes
May 22, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This reverts commit f3169e6.
The commit set TCP_NODELAY on the base client and hence it affected all
forms of clients.
This includes the internal RPC client.
In CPD we saw a throughput performance regression:
Looking at metrics it becomes clear why. Avg reactor util is up ~5%.
recvfrom syscalls are up ~20%.
The delayed sending causes implicit batching on the TCP level when
multiple append entry requests are in flight at the same time. This
reduces recv calls on the followers and probably also has higher
efficiency at the application level (even if the amount of actual append
entry requests is unchanged or even increased).
While this sort of implicit reliance on NODELAY isn't great the
performance impact can't be neglected. Hence revert.
Longer team we'd probably want to replace this with more explicit
batching.
Set it only for HTTP.
Backports Required
Release Notes