Cannot test long UDP sessions #812

@teifip

Description

Context

Server side
iperf 3.6 (cJSON 1.5.2) on Ubuntu 16.04.1 (built from source code)

Two servers running concurrently on the same n1-highcpu-2 (2 vCPUs, 1.8 GB memory) VM on Google Cloud Platform:
iperf3 -s -p5201 (used to test server to client path)
iperf3 -s -p5202 (used to test client to server path)

Client side
iperf 3.6 (cJSON 1.5.2) on Windows 10 (binary obtained from here)

Two clients running concurrently on the same i7-7820HQ @ 2.90GHz machine, one in normal mode and one in reverse mode:
iperf3 -c X.X.X.X -u -b1M -p5201 -R -t7400 (used to test server to client path)
iperf3 -c X.X.X.X -u -b1M -p5202 -t7400 (used to test client to server path)

Bug Report

The tests conclude with the same error on both servers:

[  5] 7403.00-7404.00 sec   123 KBytes  1.00 Mbits/sec  86
iperf3: error - select failed: Bad file descriptor
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
[  5] 7403.00-7404.00 sec  0.00 Bytes  0.00 bits/sec  0.039 ms  0/0 (0%)
iperf3: error - select failed: Bad file descriptor
-----------------------------------------------------------
Server listening on 5202
-----------------------------------------------------------

And both clients report a connection reset:

[  5] 7398.00-7399.00 sec   123 KBytes  1.00 Mbits/sec  0.199 ms  0/86 (0%)
iperf3: error - unable to receive control message: Connection reset by peer
[  5] 7398.00-7399.00 sec   121 KBytes   992 Kbits/sec  85
iperf3: error - unable to receive control message: Connection reset by peer

This problem occurs consistently with test durations of 2 hours or more, while I have not seen it with shorter tests, such as one hour. However, I cannot say that these observations are conclusive.

This situation is particularly severe when option -J is used and the client in normal mode is launched with the --get-server-output option. When the problem occurs, the server:

  • Does not produce the JSON output;
  • Does not pass the results to the client.

Therefore, the statistics at the server side are completely lost.

Notes

I see that there are past issues related to cases where the test duration at the server side appears longer than at the client side, as above. Were they supposed to be resolved in v3.6?

Issue #735 definitely has some commonalities with this one. In particular, in my case too the server error occurs when the test duration at the server side becomes 5 s longer than what the client requested. However, in my case the connection under test is quite fast:

Ping statistics for X.X.X.X:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 12ms, Maximum = 13ms, Average = 12ms

There is simply no way that it can hold 4-5 seconds of packets in flight. Why do the durations at the server and client side appear different?

Is there any reason why I might see different behavior when the test duration gets longer, say close to 2 hours? I mean, are there timeouts that may affect the TCP control connection while the tests run on UDP? Or is it some sort of timing skew between client and server that accumulates?

Thanks!!
