Process.start leaks orphaned child processes when parent fails after fork() on macOS+Linux #62781

@zafnz

Summary

On macOS (and probably Linux), Process.start can leak orphaned child processes that sleep forever.
When ProcessStarter::Start() encounters an error after fork() but before signaling the child to
proceed, the cleanup path (CleanupAndReturnError) closes pipes but never kills the forked child. The
child blocks forever in read(), waiting for a "go" signal that will never arrive.

This is exacerbated by a secondary issue: the child inherits the write end of its own signal pipe,
so even when the parent closes its copy, the pipe doesn't break and the child's read() never returns EOF.

This issue was observed in production conditions in a Flutter app.

Environment

  • macOS 15.3 (arm64)
  • Dart SDK version: 3.12.0-edge.b87daa7351cc09d136ff4e4ed6027f4fd2501494 (main) (Thu Feb 26 16:40:23 2026 -0800) on "macos_arm64"

Root Cause

In runtime/bin/process_macos.cc, ProcessStarter::Start() follows this sequence:

CreatePipes()          // creates ~8 FDs (4 pipes)
fork()                 // child blocks in ReadFromBlocking(read_in_[0])
RegisterProcess(pid)   // creates 2 more FDs (event pipe) - CAN FAIL
WriteToBlocking(...)   // signals child to proceed - CAN FAIL

If RegisterProcess() fails (e.g., pipe() returns EMFILE under FD pressure), or if the write fails for any reason, the parent calls CleanupAndReturnError():

int CleanupAndReturnError() {
  int actual_errno = errno;
  if (actual_errno == 0) actual_errno = EPERM;
  SetChildOsErrorMessage();
  CloseAllPipes();        // closes parent's pipe FDs
  return actual_errno;    // child is never killed
}

The child is left alive. It's stuck in:

void NewProcess() {
  char msg;
  int bytes_read = FDUtils::ReadFromBlocking(read_in_[0], &msg, sizeof(msg));
  // ... never reached if parent doesn't write
}

Why the child can't self-rescue

After fork(), the child inherits all of the parent's file descriptors, including read_in_[1] (the write end of the signal pipe). The child calls read(read_in_[0]) without first closing read_in_[1]. Even after the parent closes its copy of read_in_[1] via CloseAllPipes(), the child's own copy keeps the write end open. The pipe never breaks, so read() blocks indefinitely rather than returning EOF.

Impact

Each failed Process.start call leaks one child process. Under FD pressure (many open files, filesystem watchers, subprocess pipes), failures can cascade: the leaked children themselves hold FDs (inherited from the parent), further reducing the number of available FDs and causing more Process.start failures in a runaway loop. (weeee)

In my production case, a Flutter desktop app managing many git commands hit the FD limit. A single burst
of git operations produced 166 orphaned child processes (lol), all sleeping in read():

$ ps -o pid,ppid,stat,lstart | grep <parent_pid> | head -5
42317 14126 S+   Fri 27 Feb 12:08:10 2026
42318 14126 S+   Fri 27 Feb 12:08:10 2026
42319 14126 S+   Fri 27 Feb 12:08:10 2026
42320 14126 S+   Fri 27 Feb 12:08:10 2026
42391 14126 S+   Fri 27 Feb 12:08:10 2026

Confirmed via debugger

Attaching lldb to an orphaned child confirms it is stuck in the Dart VM's post-fork signal wait:

* thread #1, stop reason = signal SIGSTOP
  frame #0: libsystem_kernel.dylib`read + 8
  frame #1: FlutterMacOS`dart::bin::FDUtils::ReadFromBlocking + 100
  frame #2: FlutterMacOS`dart::bin::ProcessStarter::Start() + 168

Suggested Fix

Creating a PR shortly, but essentially the fix is to swap the order: RegisterProcess() first, then ProcessStarted(), which is safe because the child can't exit while it's blocked on read(). Now if RegisterProcess() fails, we simply kill+waitpid the child with no exit handler involvement.

And just to be safe, the child also closes the write end of the signal pipe before read(), so if the parent closes its end for any reason, the child gets EOF and exits instead of blocking forever.

Relationship to SIGPIPE issue

This is related to but distinct from an earlier Dart SIGPIPE issue. (See also flutter#182436)

The SIGPIPE fix (initializing VM signal handlers, or signal(SIGPIPE, SIG_IGN) as a workaround) prevents
the parent from crashing. But the orphaned child leak is a separate problem in the error cleanup
path that persists regardless of SIGPIPE handling. I don't believe a fix for that issue fixes this one.

Exit crash

After reproducing the bug, the Dart VM crashes on exit with an assertion failure attempting to wait for the orphaned children:

process_macos.cc: 230: error: Wait for process exit failed: 10

This is ECHILD: the orphaned children are still running but their relationship to the parent is broken, so
waitpid() fails. The process hangs on exit because the VM's process tracking expects to clean up children
that it can no longer wait on.

Reproduction

See the attached Dart script (process_start_orphan_repro.dart) which exhausts file descriptors and
demonstrates the leaked child processes.

$ ulimit -n 256 && dart run process_start_orphan_repro.dart
Parent PID: 12345
Exhausting file descriptors...
Opened 204 files until hitting ulimit.

Freed 8 FDs. CreatePipes should succeed, RegisterProcess should fail.

Attempting Process.start calls...
  #0: ProcessException: Too many open files
  #1: ProcessException: Too many open files
  #2: ProcessException: Too many open files
  #3: ProcessException: Too many open files
  #4: ProcessException: Too many open files

Checking for orphaned children of PID 12345...
LEAKED CHILD PROCESSES:
  PID 12400  state: S+  (sleeping - stuck in read())
  PID 12401  state: S+  (sleeping - stuck in read())
  PID 12402  state: S+  (sleeping - stuck in read())
  PID 12403  state: S+  (sleeping - stuck in read())
  PID 12404  state: S+  (sleeping - stuck in read())
  PID 12405  state: S+  (sleeping - stuck in read())

BUG CONFIRMED: 6 orphaned child processes leaked from 5 failed Process.start calls
Also we hang on exit
process_macos.cc: 230: error: Wait for process exit failed: 10

Note: The script produces 6 orphans from 5 attempts because CleanupAndReturnError() closes 8 pipe FDs after
each failure, freeing enough FDs for the next CreatePipes+fork to succeed before RegisterProcess fails
again, creating a new orphan each cycle.

process_start_orphan_repro.dart.txt

Labels: P3, area-vm, library-io, triaged
