
Android: Attempt emulator recovery before returning DEVICE_NOT_FOUND #1550

Merged
steveisok merged 3 commits into main from
copilot/attempt-emulator-restart-recovery
Mar 10, 2026

Contributor

Copilot AI commented Mar 6, 2026

Previously, when XHarness failed to find a suitable Android emulator, it returned DEVICE_NOT_FOUND immediately without attempting any recovery. On Helix CI machines with systemd-managed emulators, a crashed x86_64 emulator therefore caused all 3 retry attempts to fail on the same machine.

Changes

AdbRunner — new TryRecoverEmulator() method

Multi-step recovery sequence invoked before giving up:

  1. Diagnostics: logs adb devices -l output and (on Linux) systemctl status android-emulator
  2. ADB reset: kill-server + start-server; if the device reappears, returns immediately
  3. Emulator restart: systemctl restart android-emulator on Linux/systemd, falling back to adb emu restart
  4. Wait: polls for device appearance (up to 5 min), then waits for sys.boot_completed == 1
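The four steps above can be sketched roughly as follows. This is a minimal Python illustration, not the C# code from the PR: the function name, the injected `run` and `list_devices` callables, and the 5-second poll interval are all hypothetical. It also assumes the `adb emu restart` fallback is taken on non-Linux hosts, and it leaves out the `sys.boot_completed` wait to keep the example short.

```python
import time
from typing import Callable, List


def try_recover_emulator(
    run: Callable[[List[str]], str],        # hypothetical: executes a command, returns its output
    list_devices: Callable[[], List[str]],  # hypothetical: serials of currently usable devices
    is_linux: bool,
    device_timeout_s: float = 300.0,        # "up to 5 min" from the description
    poll_interval_s: float = 5.0,           # assumed value, not stated in the PR
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True if a device is visible again after recovery."""
    # Step 1: diagnostics, logged before anything is changed.
    run(["adb", "devices", "-l"])
    if is_linux:
        run(["systemctl", "status", "android-emulator"])

    # Step 2: reset the ADB daemon and return early if that was enough.
    run(["adb", "kill-server"])
    run(["adb", "start-server"])
    if list_devices():
        return True

    # Step 3: restart the emulator itself.
    if is_linux:
        run(["systemctl", "restart", "android-emulator"])
    else:
        run(["adb", "emu", "restart"])

    # Step 4: poll until a device appears or the timeout expires.
    deadline = clock() + device_timeout_s
    while clock() < deadline:
        if list_devices():
            return True
        sleep(poll_interval_s)
    return False
```

Injecting the command runner, clock, and sleep function keeps the sketch testable without a real emulator, which is similar in spirit to the two recovery-path tests the PR mentions.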

AdbRunner — improved diagnostics on architecture mismatch

GetAllDevices() now logs which devices were found and their architectures when the required architecture isn't satisfied:

No attached device supports one of required architectures: x86_64.
Found 1 device(s) with: emulator-5554=[x86]
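As an illustration of how such a message might be assembled, here is a small Python sketch. The function name and the dictionary shape are hypothetical, and the comma separator for multiple architectures or devices is an assumption; only the single-device wording is taken from the sample above.

```python
from typing import Dict, List


def describe_arch_mismatch(required: List[str], found: Dict[str, List[str]]) -> str:
    """Build a two-line diagnostic like the sample log above.

    `found` maps a device serial to the architectures it reports.
    """
    devices = ", ".join(
        f"{serial}=[{', '.join(archs)}]" for serial, archs in found.items()
    )
    return (
        "No attached device supports one of required architectures: "
        f"{', '.join(required)}.\n"
        f"Found {len(found)} device(s) with: {devices}"
    )


print(describe_arch_mismatch(["x86_64"], {"emulator-5554": ["x86"]}))
```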

Command updates — recovery before failing

  • AndroidInstallCommand.InvokeHelper — calls TryRecoverEmulator() and retries GetDevice() before throwing NoDeviceFoundException
  • AndroidHeadlessInstallCommand.InvokeHelper — same pattern
  • AndroidRunCommand.InvokeCommand — same pattern via GetSingleDevice()
  • TimeToWaitForBootCompletion is now set before device discovery so the configured timeout applies during recovery as well
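The shared pattern in these commands is "look up, recover once, look up again, then fail". A minimal Python sketch of that control flow follows; the function name and the callables are hypothetical stand-ins for `GetDevice()` and `TryRecoverEmulator()`, and the exception class only mirrors the role of `NoDeviceFoundException`.

```python
from typing import Callable, Optional


class NoDeviceFoundError(Exception):
    """Stand-in for XHarness's NoDeviceFoundException in this sketch."""


def get_device_with_recovery(
    get_device: Callable[[], Optional[str]],  # hypothetical: returns a serial or None
    try_recover: Callable[[], bool],          # hypothetical: True if recovery succeeded
) -> str:
    device = get_device()
    if device is None and try_recover():
        # Recovery reported success, so the lookup is repeated exactly once.
        device = get_device()
    if device is None:
        raise NoDeviceFoundError("no suitable device, even after recovery")
    return device
```

Limiting recovery to a single attempt keeps the worst-case delay bounded by the configured timeouts rather than growing with each retry.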

Tests

Added two tests covering recovery paths: device reappears after ADB daemon reset, and device reappears only after emulator restart.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • netcorenativeassets.blob.core.windows.net
    • Triggering command: /home/REDACTED/work/xharness/xharness/.dotnet/dotnet /home/REDACTED/work/xharness/xharness/.dotnet/dotnet /home/REDACTED/work/xharness/xharness/.dotnet/sdk/10.0.100/MSBuild.dll /noautoresponse /nologo /nodemode:1 /nodeReuse:true /low:false (dns block)
  • securitytools.pkgs.visualstudio.com
    • Triggering command: /opt/hostedtoolcache/CodeQL/2.24.2/x64/codeql/csharp/tools/linux64/Semmle.Autobuild.CSharp /opt/hostedtoolcache/CodeQL/2.24.2/x64/codeql/csharp/tools/linux64/Semmle.Autobuild.CSharp (dns block)


Original prompt

This section details the original issue you should resolve

<issue_title>Android: Attempt emulator restart/recovery before returning DEVICE_NOT_FOUND</issue_title>
<issue_description>## Summary

When XHarness cannot find a suitable Android emulator (exit code 81 / DEVICE_NOT_FOUND), it currently does a single-shot check and immediately gives up. It should attempt to restart the emulator and retry before failing.

Current Behavior

  1. AdbRunner.GetDevice() queries adb devices -l
  2. If no device matches the required architecture (e.g. x86_64), returns null
  3. WaitForDevice() may wait for boot completion, but only if a device was found
  4. Returns ExitCode.DEVICE_NOT_FOUND (81) with no recovery attempted

Helix retries the work item up to 3 times, but retries run on the same machine with the same dead emulator. The machine reboot only fires after all attempts are exhausted.

Proposed Behavior

Before returning DEVICE_NOT_FOUND, XHarness should attempt recovery:

  1. Reset the ADB daemon (adb kill-server then adb start-server)
  2. Re-check adb devices -- if the emulator reappears, continue
  3. If still missing, attempt to restart the emulator process via systemctl restart android-emulator (for systemd-managed emulators on Helix machines) or adb emu restart as a fallback
  4. Wait for boot completion (sys.boot_completed == 1) with a reasonable timeout
  5. Re-check for the required device
  6. Only return DEVICE_NOT_FOUND if all recovery attempts fail
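Step 4 of the proposal, waiting for `sys.boot_completed == 1`, might look like the following Python sketch. The helper names, the 300-second default, and the 5-second poll interval are assumptions (the issue only asks for "a reasonable timeout"); `adb -s <serial> shell getprop` is the standard way to read a device property.

```python
import subprocess
import time
from typing import Callable


def adb_getprop(serial: str, prop: str) -> str:
    """Read one system property from a device; returns '' if adb fails."""
    result = subprocess.run(
        ["adb", "-s", serial, "shell", "getprop", prop],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def wait_for_boot_completed(
    serial: str,
    timeout_s: float = 300.0,      # assumed default
    poll_interval_s: float = 5.0,  # assumed default
    getprop: Callable[[str, str], str] = adb_getprop,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll sys.boot_completed until it reads '1' or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if getprop(serial, "sys.boot_completed") == "1":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

The property is checked at least once even with a zero timeout, so an emulator that has already booted is reported as ready immediately.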

Additionally, better diagnostics before failing would help:

  • Log output of adb devices -l showing what IS available
  • Log emulator process status (systemctl status)
  • Log the specific architecture mismatch (needed x86_64, found only x86)
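To make the first diagnostic concrete, the sketch below parses `adb devices -l` output into records that can be logged. It is a hypothetical Python helper, not XHarness code, and the sample text is illustrative rather than taken from the failing job. Note that `adb devices -l` does not report CPU architecture directly, so an architecture check would need an additional per-device query.

```python
from typing import Dict, List


def parse_adb_devices(output: str) -> List[Dict[str, str]]:
    """Parse `adb devices -l` output into one dict per listed device."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the header line, and "* daemon ..." status messages.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        record = {"serial": parts[0], "state": parts[1]}
        # Remaining tokens look like key:value (product:..., model:..., ...).
        for token in parts[2:]:
            if ":" in token:
                key, value = token.split(":", 1)
                record.setdefault(key, value)
        devices.append(record)
    return devices


# Illustrative sample, not real CI output.
SAMPLE = """List of devices attached
emulator-5554          device product:sdk_gphone_x86 model:Android_SDK_built_for_x86 device:generic_x86 transport_id:1
emulator-5556          offline transport_id:2
"""

online = [d["serial"] for d in parse_adb_devices(SAMPLE) if d["state"] == "device"]
print(online)
```

States that contain spaces, such as a permissions error, are not handled by this simple whitespace split.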

Evidence

Over the last 2 days in the dotnet/runtime CI pipeline (definition 129), there were 29 DEVICE_NOT_FOUND failures across the android-x64 Release AllSubsets_CoreCLR_Smoke leg, spread across 17 different machines. The common pattern:

  • Machine has two emulators: emulator-5554 (x86, API 29) and emulator-5556 (x86_64, API 29)
  • The x86_64 emulator (emulator-5556) crashes or fails to start
  • XHarness finds only the x86 emulator, which does not match the x86_64 requirement
  • Exits immediately with code 81
  • All 3 Helix retry attempts fail because the emulator stays dead

Example failing work item: System.Security.Cryptography.Tests in Helix job ced47868-669f-4428-b0e8-ea795af7b0c3 on machine a003BI4. Only emulator-5554 (x86) was found, emulator-5556 (x86_64) was missing.

Console log: https://helix.dot.net/api/2019-06-17/jobs/ced47868-669f-4428-b0e8-ea795af7b0c3/workitems/System.Security.Cryptography.Tests/console

Impact

Affects multiple test suites on the android-x64 CoreCLR Smoke leg: System.Security.Cryptography.Tests (~10x), Android.Device_Emulator.JIT tests (~4x), System.Diagnostics.Tracing.Tests (~6x), and others. These all share the same root cause of a crashed x86_64 emulator with no recovery path.

Related: #1548 (tvOS device log stream blocking issue)
</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 2 commits March 6, 2026 01:49
…empt recovery before DEVICE_NOT_FOUND

Co-authored-by: steveisok <471438+steveisok@users.noreply.github.com>
Co-authored-by: steveisok <471438+steveisok@users.noreply.github.com>
@steveisok steveisok enabled auto-merge (squash) March 10, 2026 10:40
@steveisok
Member

/ba-g Known issues

@steveisok steveisok merged commit e7a795d into main Mar 10, 2026
15 of 17 checks passed
@steveisok steveisok deleted the copilot/attempt-emulator-restart-recovery branch March 10, 2026 10:41
matouskozak pushed a commit to matouskozak/xharness that referenced this pull request Mar 11, 2026
…otnet#1550)

* Initial plan

* Add emulator recovery to AdbRunner and update Android commands to attempt recovery before DEVICE_NOT_FOUND

Co-authored-by: steveisok <471438+steveisok@users.noreply.github.com>

* Android: Attempt emulator recovery before returning DEVICE_NOT_FOUND

Co-authored-by: steveisok <471438+steveisok@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: steveisok <471438+steveisok@users.noreply.github.com>
kotlarmilos pushed a commit to kotlarmilos/xharness that referenced this pull request Mar 16, 2026
…otnet#1550)

* Initial plan

* Add emulator recovery to AdbRunner and update Android commands to attempt recovery before DEVICE_NOT_FOUND

Co-authored-by: steveisok <471438+steveisok@users.noreply.github.com>

* Android: Attempt emulator recovery before returning DEVICE_NOT_FOUND

Co-authored-by: steveisok <471438+steveisok@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: steveisok <471438+steveisok@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Android: Attempt emulator restart/recovery before returning DEVICE_NOT_FOUND

3 participants