fix: Reduce default AprilTag threads to 2 (Issue #2125) #2268
Shubh3005 wants to merge 1 commit into PhotonVision:main
Conversation
Latency is also impacted by processing time, camera FPS, and the exact settings of the AprilTag detector. On an Orange Pi 5, I'm seeing that 3 threads is actually the sweet spot for my set of settings. 2 threads has a mean latency of 33 ms, while 3 has a mean latency of 29 ms, which I would say is significant (and it looks like you do too). I'd like to see benchmarks on platforms that are commonly used (notably, this doesn't include Macs of any sort).
Thanks for running those benchmarks on the Orange Pi 5! That 4 ms latency improvement (33 ms -> 29 ms) with 3 threads is definitely significant on that hardware.
One concern regarding the default: the Orange Pi 5 has 8 cores, so running 3 worker threads leaves plenty of headroom. However, the Raspberry Pi 4 (the standard FRC coprocessor) only has 4 cores. If we default to 3 threads, we might saturate a Pi 4 (3 workers + 1 OS/NetworkTables/driver thread), re-introducing the starvation/jitter issue.
Should the default be optimized for:
- Performance (3 threads): best for OPi5 / mini PCs.
- Safety (2 threads): best for RPi 3/4, to ensure system stability.
I am happy to update the PR to 3 threads if you think the Pi 4 can handle the load, but I wanted to flag the core-count difference first.
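The core-count argument above can be written out as simple arithmetic. The sketch below is illustrative only: `CoreHeadroom` and `idleCores` are hypothetical names, not part of PhotonVision, and the calculation assumes every detector thread and the one reserved system thread each fully occupy a core, which is the worst case and may not hold in practice.

```java
// Hypothetical helper, not PhotonVision code.
class CoreHeadroom {
    /**
     * Cores left idle under the worst-case assumption that each detector
     * thread and each reserved system thread fully occupies one core.
     */
    static int idleCores(int totalCores, int detectorThreads, int reservedSystemThreads) {
        return totalCores - detectorThreads - reservedSystemThreads;
    }

    public static void main(String[] args) {
        // Core counts are the ones quoted in the comment above.
        System.out.println("Orange Pi 5, 3 threads: " + idleCores(8, 3, 1));    // prints 4
        System.out.println("Raspberry Pi 4, 3 threads: " + idleCores(4, 3, 1)); // prints 0
        System.out.println("Raspberry Pi 4, 2 threads: " + idleCores(4, 2, 1)); // prints 1
    }
}
```

A result of 0 only means there is no spare core in the worst case; whether that produces measurable starvation depends on how busy each thread actually is.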
How do you know that 3 threads saturates a Raspberry Pi 4? It might; it has a very weak processor compared to the Orange Pi 5, but these threads aren’t each saturating an entire core, like you’re assuming they will. More benchmarks on a variety of hardware with a variety of settings need to be done before a conclusion can be drawn.
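One way to collect comparable numbers across hardware is a small timing harness that reports mean latency per thread count. The following is a minimal sketch under stated assumptions: `ThreadCountBenchmark`, `meanLatencyMillis`, and `placeholderWorkload` are hypothetical names, and the workload is a stand-in CPU loop, not the AprilTag detector. A real measurement would replace it with a call into the actual pipeline on the target coprocessor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness, not PhotonVision code.
class ThreadCountBenchmark {
    /** Mean wall-clock time per call in milliseconds, measured after the warm-up runs. */
    static double meanLatencyMillis(Runnable frameTask, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            frameTask.run(); // let the JIT and caches settle before timing
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            frameTask.run();
        }
        return (System.nanoTime() - start) / 1e6 / measuredRuns;
    }

    /** Placeholder workload: splits a fixed amount of CPU work across `threads` tasks. */
    static Runnable placeholderWorkload(ExecutorService pool, int threads) {
        return () -> {
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Long> part = () -> {
                    long acc = 0;
                    for (int i = 0; i < 2_000_000 / threads; i++) {
                        acc += i % 7;
                    }
                    return acc;
                };
                parts.add(pool.submit(part));
            }
            try {
                for (Future<Long> f : parts) {
                    f.get(); // wait for every part, as one "frame" of work
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        };
    }

    public static void main(String[] args) {
        for (int threads = 1; threads <= 4; threads++) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            try {
                double ms = meanLatencyMillis(placeholderWorkload(pool, threads), 20, 200);
                System.out.printf("threads=%d mean=%.3f ms%n", threads, ms);
            } finally {
                pool.shutdown();
            }
        }
    }
}
```

Recording CPU load alongside the mean latency for each run would help show whether a given thread count leaves room for other processes on that board.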
I decided to test single-camera performance on the Raspberry Pi 4B too. I ran this test at a resolution of 1280x800 to try to get the highest CPU load. Here are the results for decimate of 1 and decimate of 2: For a single camera, FPS is maximized with 4 threads for either decimate setting. The highest CPU load was less than 75% for the decimate 1 case and less than 56% for the decimate 2 case. So for 1 camera on an RPi 4, 4 threads seems like a reasonable default to get the highest FPS.




Description
This PR changes the default thread count for the AprilTag detector from 4 to 2 in AprilTagPipelineSettings.java.
Why:
As discussed in issue #2125, running 4 threads by default can cause thread contention on resource-constrained coprocessors (like Raspberry Pi 4/5), potentially starving other critical processes like NetworkTables or the web server.
Benchmarks:
I tested this on a local build using standard AprilTag test images. While the absolute FPS was high due to host hardware (MacBook Pro), the latency scaling confirms the diminishing returns of higher thread counts:
The data suggests that 2 threads hits the optimal balance between performance and resource efficiency.
Closes #2125