Skip to content

Conversation

@davidBar-On
Copy link
Contributor

  • Version of iperf3 (or development branch, such as master or
    3.1-STABLE) to which this pull request applies:
    master

  • Issues fixed (if any): none

  • Brief description of code changes (suitable for use as a commit message):

While evaluation Issue #1986 and simulating disconnect between the client and server, the server crashed after the call to server_timer_proc(). Using gdb I found that the problem was that the sp->test address in the data thread was garbled. I believe that this was because of race condition between the main and data threads:

  1. server_timer_proc() closed the data stream and freed the sp buffers.
  2. The sp buffer was used for something else, garbling the test address` (if this step happens later, depending on the threads race conditions, the crash will not happen).
  3. Since the data stream was closed, the data thread read()/recv() returned 0 - end of file.
  4. iperf_tcp_recv() in the data thread performs the if (sp->test->state == TEST_RUNNING) test that cause the server crash because the sp->test is garbled.

The suggested fix is that server_timer_proc() will only close the control socket, but not the data streams. This causes the select() to fail and then the threads/sockets are closed in the usual way after failure.

Another approach was that server_timer_proc() will close the threads before closing the streams, but I think just closing the control channel is better.

Also, I noted that server_timer_proc() does not issue any error message. I didn't add such message as I am not sure if this is not intentional.

@bmah888 bmah888 changed the title renove data sockets closing from server_timer_proc as it caused threa… Remove data sockets closing from server_timer_proc to avoid crash Jan 8, 2026
@swlars swlars force-pushed the server_timer_proc__remove_clocing_data_sockets branch from a174f4f to 983d7c6 Compare January 9, 2026 18:59
@swlars
Copy link
Contributor

swlars commented Jan 12, 2026

Hi David! Thanks for the pull request. This looks like an important fix. How were you testing the original code to see the crash?

@davidBar-On
Copy link
Contributor Author

How were you testing the original code to see the crash?

Hi, while trying to recreate the test, I found a more straightforward issue that is caused by the same problem and does not depend on race conditions. Before the following lines add sleep(45):

/* Yes, done! Send TEST_END. */
test->done = 1;

When running TCP test, after 40 seconds at that point (the test end), the server timeout and starts a new test, but without terminating the data threads. After the client's sleep ends it terminates, which cause the data thread and the server to crash at:

iperf/src/iperf_tcp.c

Lines 75 to 78 in 0ae94b6

if (sp->test->state == TEST_RUNNING) {
sp->result->bytes_received += r;
sp->result->bytes_received_this_interval += r;
}

Again, because the test pointer is garbled.

This PR also fixes this problem (as it is caused by the same issue described in this PR).

@swlars
Copy link
Contributor

swlars commented Jan 23, 2026

Thanks! That worked for me! Confirmed, good catch!

@swlars swlars merged commit 6b76500 into esnet:master Jan 23, 2026
7 checks passed
WatcherOfTheSkies added a commit to WatcherOfTheSkies/iperf that referenced this pull request Jan 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants