Http Stress Status Report
What we've run so far:
| OS | HTTP 1.1 | HTTP 2.0 | Notes |
| --- | --- | --- | --- |
| Windows | 4+ h (Mana) -> 2 errors | 12 h (JJ) -> 11 errors | |
| | 6 h (Miha) -> 8 errors | 6 h (Miha) -> 7 errors | |
| | 10 h (Miha) -> 53 errors | 7 h (Miha) -> 139 errors | 7 h = 1 + 6 h |
| Linux | 17 h (JJ) -> 0 errors | 12 h (JJ) -> 608 errors | we should rerun as this may be an environmental problem |
| | 12 h (Furtik) -> 0 errors | 12 h (Furtik) -> 104 errors | ran on released runtime (not on master) |
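To compare runs of different lengths, the counts above can be normalized to errors per hour. The snippet below is only a sketch: the tuple layout and the function name are made up for illustration, and the "4+ h" run is entered as 4 hours, so the Windows HTTP 1.1 rate is an upper bound.

```python
# Runs copied from the table above: (os, protocol, hours, errors).
# Assumption: "4+ h" is entered as 4.0, which slightly overstates that rate.
RUNS = [
    ("Windows", "HTTP 1.1", 4.0, 2),
    ("Windows", "HTTP 1.1", 6.0, 8),
    ("Windows", "HTTP 1.1", 10.0, 53),
    ("Windows", "HTTP 2.0", 12.0, 11),
    ("Windows", "HTTP 2.0", 6.0, 7),
    ("Windows", "HTTP 2.0", 7.0, 139),
    ("Linux", "HTTP 1.1", 17.0, 0),
    ("Linux", "HTTP 1.1", 12.0, 0),
    ("Linux", "HTTP 2.0", 12.0, 608),
    ("Linux", "HTTP 2.0", 12.0, 104),
]


def errors_per_hour(runs):
    """Sum hours and errors per (os, protocol) and return errors per hour."""
    totals = {}
    for os_name, protocol, hours, errors in runs:
        total_hours, total_errors = totals.get((os_name, protocol), (0.0, 0))
        totals[(os_name, protocol)] = (total_hours + hours, total_errors + errors)
    return {key: errs / hrs for key, (hrs, errs) in totals.items()}


if __name__ == "__main__":
    for (os_name, protocol), rate in sorted(errors_per_hour(RUNS).items()):
        print(f"{os_name:8} {protocol}: {rate:.2f} errors/hour")
```

Aggregating per OS and protocol hides differences between individual runs (for example 608 vs. 104 errors for the two 12 h Linux HTTP 2.0 runs), so the per-run numbers are still worth reading alongside the averages.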
HTTP 2.0 Error Statistics
| Error Type | Linux | Windows |
| --- | --- | --- |
| Success | 135,727,755 | 105,273,552 |
| Errors | 712 | 157 |
| System.Threading.Tasks.TaskCanceledException: The operation was canceled. | 37 | 8 |
| System.Threading.Tasks.TaskCanceledException: A task was canceled. | 13 | 4 |
| System.IO.IOException: The response ended prematurely while waiting for the next frame from the server. | 464 | 18 |
| System.Net.Sockets.SocketException (32): Broken pipe | 198 | 0 |
| System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host. | 0 | 118 |
| System.Net.Sockets.SocketException (10053): An established connection was aborted by the software in your host machine. | 0 | 9 |
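As a quick sanity check on the HTTP 2.0 numbers, the per-exception counts can be summed against the "Errors" row and the totals expressed as errors per million requests. This is a sketch under one assumption not stated in the report: that "Success" counts successful requests only, so the total number of requests is successes plus errors. The dictionary layout and function names are illustrative.

```python
# HTTP 2.0 figures copied from the table above.
# Breakdown order: operation canceled, task canceled, response ended
# prematurely, broken pipe (32), forcibly closed (10054), aborted (10053).
HTTP2 = {
    "Linux": {"success": 135_727_755, "errors": 712,
              "breakdown": [37, 13, 464, 198, 0, 0]},
    "Windows": {"success": 105_273_552, "errors": 157,
                "breakdown": [8, 4, 18, 0, 118, 9]},
}


def breakdown_is_consistent(stats):
    """True when the per-exception counts add up to the reported total."""
    return sum(stats["breakdown"]) == stats["errors"]


def errors_per_million(success, errors):
    """Errors per million requests, assuming requests = success + errors."""
    return errors / (success + errors) * 1_000_000


if __name__ == "__main__":
    for os_name, stats in HTTP2.items():
        rate = errors_per_million(stats["success"], stats["errors"])
        ok = breakdown_is_consistent(stats)
        print(f"{os_name}: {rate:.2f} errors per million, breakdown consistent: {ok}")
```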
HTTP 1.1 Error Statistics
| Error Type | Linux | Windows |
| --- | --- | --- |
| Success | 171,673,491 | 128,658,822 |
| Errors | 0 | 61 |
| System.Net.Sockets.SocketException (10061): No connection could be made because the target machine actively refused it. | 0 | 61 |
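The per-run error counts from "What we've run so far" can also be cross-checked against the "Errors" rows of the two statistics tables. The sketch below uses the numbers exactly as they appear in this report; the names are illustrative. With these inputs, three of the four OS/protocol combinations agree, while the Windows HTTP 1.1 runs sum to 63 against a reported total of 61. The report does not say where the difference comes from.

```python
# Per-run error counts from the "What we've run so far" table.
RUN_ERRORS = {
    ("Windows", "HTTP 1.1"): [2, 8, 53],
    ("Windows", "HTTP 2.0"): [11, 7, 139],
    ("Linux", "HTTP 1.1"): [0, 0],
    ("Linux", "HTTP 2.0"): [608, 104],
}

# "Errors" rows from the HTTP 1.1 and HTTP 2.0 statistics tables.
REPORTED = {
    ("Windows", "HTTP 1.1"): 61,
    ("Windows", "HTTP 2.0"): 157,
    ("Linux", "HTTP 1.1"): 0,
    ("Linux", "HTTP 2.0"): 712,
}


def mismatches(run_errors, reported):
    """Return {key: (sum_of_runs, reported_total)} for entries that differ."""
    return {
        key: (sum(counts), reported[key])
        for key, counts in run_errors.items()
        if sum(counts) != reported[key]
    }


if __name__ == "__main__":
    for key, (summed, total) in mismatches(RUN_ERRORS, REPORTED).items():
        print(f"{key}: runs sum to {summed}, statistics table reports {total}")
```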
What we need to run:
- More HTTP 1.1 Linux runs to confirm that we're clear. (easy, hi pri)
- More HTTP 2.0 Linux runs to confirm that we have all error types captured. (easy, hi pri)
- HTTP 2.0 tests against HTTPSys to eliminate/confirm server as the culprit. (mid, mid pri)
- Run the matrix against 3.1 and compare. (hard, mid pri)
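Existing issues, root caused:
- TaskCanceledException as a reaction to GO_AWAY: HTTP/2 stress test TaskCanceledException when client hasn't cancelled #42472
Discovered exceptions, not-investigated:
- HTTP 2.0 System.Threading.Tasks.TaskCanceledException: The operation was canceled. (HTTP/2 stress test TaskCanceledException when client hasn't cancelled #42472)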
The discovered exceptions confirm what we've collected so far from the pipelines: #40388.
Distributable tasks by priority:
- More HTTP 1.1 Linux runs: http11run
- More HTTP 2.0 Linux runs: http20run
- Investigate Windows HTTP 1.1 errors: winErr3
- Investigate Windows HTTP 2.0 errors: winErr1, winErr2
- Provide fix for #42200
- Provide fix for #42198
- Help with HTTPSys client connection errors: httpSys: put on back-burner
- Run the tests against .NET Core 3.1: net31: put on back-burner
Tips and Tricks for investigations:
- Clean up docker containers and images after a product code change (`docker container prune && docker image prune -a`)
- Once the container is built, the `-b` switch can be omitted for subsequent re-runs (skips the runtime build)
- Don't use containers for investigations, they're slow and the rebuild takes too long
- To run the app against a locally built runtime, swap `artifacts/bin/testhost/net5.0-Linux-Debug-x64/shared/Microsoft.NETCore.App/6.0.0/` with the globally installed runtime (`/usr/share/dotnet/shared/Microsoft.NETCore.App/your-latest-5.0-version`)
- Make a backup of the global runtime!
- Using testhost's `corerun` didn't work for me since the app depends on the ASP.NET Core SDK
- If you change the product code, rebuild just `System.Net.Http` and copy `System.Net.Http.dll` to the global runtime again
- Build `System.Net.Http/tests/StressTests/HttpStress`
- Open 2 terminals and run:
  - server: `dotnet run -runMode server -aspnetlog`
    - `-aspnetlog`: console logging of server errors
    - `-serverUri https://localhost:5002`: bind to a different port (when running multiple tests in parallel)
  - client: `dotnet run -runMode client`
    - `-serverUri https://localhost:5002`: connect to a different port (when running multiple tests in parallel)
    - `-ops 1 2 3`: run only operations 1, 2 and 3 (GET, PUT Slow, etc.)
    - `-trace`: saves internal client/server traces in a log file; very verbose, usable only for very short runs
  - more options in source code: https://github.com/dotnet/runtime/blob/master/src/libraries/System.Net.Http/tests/StressTests/HttpStress/Program.cs#L37-L62
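The backup-and-swap step and the server/client invocations above could be scripted roughly as follows. This is a sketch, not part of the stress app: the function names and the explicit `backup_dir` parameter are made up for illustration, and "swap" is assumed to mean replacing the contents of the global runtime directory with the locally built one. Only the flags listed above (`-runMode`, `-aspnetlog`, `-serverUri`, `-ops`, `-trace`) are used.

```python
import shlex
import shutil
from pathlib import Path


def swap_runtime(testhost_dir: Path, global_dir: Path, backup_dir: Path) -> None:
    """Back up the global runtime, then replace it with the local build."""
    if backup_dir.exists():
        # Refuse to overwrite an earlier backup of the original runtime.
        raise FileExistsError(f"backup already exists: {backup_dir}")
    shutil.copytree(global_dir, backup_dir)
    shutil.rmtree(global_dir)
    shutil.copytree(testhost_dir, global_dir)


def refresh_assembly(built_dll: Path, global_dir: Path) -> None:
    """Copy a rebuilt assembly (e.g. System.Net.Http.dll) into the global runtime."""
    shutil.copy2(built_dll, global_dir / built_dll.name)


def server_args(server_uri=None, aspnet_log=True):
    """Argument list for the server terminal."""
    args = ["dotnet", "run", "-runMode", "server"]
    if aspnet_log:
        args.append("-aspnetlog")
    if server_uri:
        args += ["-serverUri", server_uri]
    return args


def client_args(server_uri=None, ops=None, trace=False):
    """Argument list for the client terminal."""
    args = ["dotnet", "run", "-runMode", "client"]
    if server_uri:
        args += ["-serverUri", server_uri]
    if ops:
        args += ["-ops", *map(str, ops)]
    if trace:
        args.append("-trace")
    return args


if __name__ == "__main__":
    # Print the commands for a second, parallel test on port 5002.
    uri = "https://localhost:5002"
    print(shlex.join(server_args(uri)))
    print(shlex.join(client_args(uri, ops=[1, 2, 3])))
```

Keeping the backup location an explicit argument avoids guessing where it is safe to store a copy of the runtime; the argument lists can be passed to `subprocess.run` with `cwd` set to the HttpStress directory.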
If you have any improvements to the stress app or the containers, please create a PR and don't keep it just for yourself.
If you have more tips and tricks for running the tests, please share them.