Skip to content

Fix socket race condition in port tracking#40187

Merged
benhillis merged 11 commits intomasterfrom
user/chemwolf6922/fix-potential-socket-race-condition-in-port-tracking
Apr 22, 2026
Merged

Fix socket race condition in port tracking#40187
benhillis merged 11 commits intomasterfrom
user/chemwolf6922/fix-potential-socket-race-condition-in-port-tracking

Conversation

@chemwolf6922
Copy link
Copy Markdown
Contributor

@chemwolf6922 chemwolf6922 commented Apr 15, 2026

Summary of the Pull Request

This PR makes the bind 0 port resolution in port tracking inline. To avoid duplicating the socket and causing race conditions. With the trade off of even slower bind calls.

PR Checklist

  • Closes: VSCode Remote-SSH SSH tunnel stuck on WSL2 2.7.1 #40109
  • Communication: I've discussed this with core contributors already. If work hasn't been agreed, this work might be rejected
  • Tests: Added/updated if needed and all pass
  • Localization: All end user facing strings can be localized
  • Dev docs: Added/updated if needed
  • Documentation updated: If checked, please file a pull request on our docs repo and link it here: #xxx

Detailed Description of the Pull Request / Additional comments

#40178 would be a better but much more complex solution. Which may not be worth it if the seccomp part is still required.

Validation Steps Performed

Existing tests
NetworkTests::MirroredTests::PortZeroBindIsTracked [Passed]
NetworkTests::VirtioProxyTests::PortZeroBindIsTracked [Passed]

New tests
NetworkTests::MirroredTests::PortZeroRebindSucceeds [Passed]
NetworkTests::VirtioProxyTests::PortZeroRebindSucceeds [Passed]

The problem described in #40109 is reproduceable. I have validated that this PR will fix the issue.
I'm able to trace the vs code server issue's bind calls. And the pattern matches the assumption when it fails without the fix:

[bind] pid=4356   comm=code             family=2    port=0
[ret ] pid=4356   comm=code             ret=0    resolved_port=39748
[bind] pid=4356   comm=code             family=2    port=39748
[ret ] pid=4356   comm=code             ret=-98  resolved_port=39748

@chemwolf6922 chemwolf6922 requested a review from a team as a code owner April 15, 2026 08:35
Copilot AI review requested due to automatic review settings April 15, 2026 08:35
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR removes the background thread used to resolve port-0 (ephemeral) binds and instead resolves them inline in the main GnsPortTracker::Run() loop to avoid duplicating sockets and triggering race conditions.

Changes:

  • Removed deferred port-0 resolution thread/queues and related synchronization primitives.
  • Made ResolvePortZeroBind return std::optional<PortAllocation> and resolve/register port-0 binds synchronously.
  • Simplified port-0 bind processing by directly calling HandleRequest/TrackPort after resolution.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/linux/init/GnsPortTracker.h Removes deferred-resolution thread machinery; updates ResolvePortZeroBind signature to return an optional allocation.
src/linux/init/GnsPortTracker.cpp Deletes deferred resolver thread/queues; resolves port-0 binds inline and registers allocations immediately.

Comment thread src/linux/init/GnsPortTracker.cpp
Comment thread src/linux/init/GnsPortTracker.cpp
Comment thread src/linux/init/GnsPortTracker.cpp
@benhillis
Copy link
Copy Markdown
Member

Ideally there would be a test that fails before this change and passes afterwards. Is that possible to add?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings April 20, 2026 07:41
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Comment thread test/linux/helpers/port_rebind_test.c Outdated
Comment thread test/windows/NetworkTests.cpp Outdated
Comment thread test/windows/NetworkTests.cpp Outdated
Copilot AI review requested due to automatic review settings April 20, 2026 08:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comment thread src/linux/init/GnsPortTracker.cpp
Comment thread test/linux/helpers/port_rebind_test.c Outdated
@benhillis
Copy link
Copy Markdown
Member

@chemwolf6922 - this change looks ok to me, but the PortZeroRebindSucceeds tests are failing. Can you investigate?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings April 21, 2026 03:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.

…n-port-tracking' of https://github.com/microsoft/WSL into user/chemwolf6922/fix-potential-socket-race-condition-in-port-tracking
Copilot AI review requested due to automatic review settings April 22, 2026 15:09
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

Copy link
Copy Markdown
Collaborator

@OneBlue OneBlue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change LGTM. Unfortunately I think a 250ms timeout will lead to inconsistencies if the machine is under pressure, but increasing the timeout too much will negatively affect performance.

Like you said I think the right long term fix is #40178, in the meantime this will be good to fix the immediate double-bind issue

@benhillis
Copy link
Copy Markdown
Member

Change LGTM. Unfortunately I think a 250ms timeout will lead to inconsistencies if the machine is under pressure, but increasing the timeout too much will negatively affect performance.

Like you said I think the right long term fix is #40178, in the meantime this will be good to fix the immediate double-bind issue

I agree, I think this change is worth taking now and we can spend some time switching over to EBPF hopefully soon.

@benhillis benhillis merged commit 3f00f98 into master Apr 22, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants