Description
The problem
The subprocess I/O loop in borg_job.py is less efficient than it could be. The code currently uses async I/O, but it spins continually on a 0.01 interval timer and then reads from each fd unconditionally. It is fairly common for neither file descriptor to have data waiting, in which case the 2 system calls result in 2 exceptions being raised and caught. On top of that, the subprocess status is polled on every iteration. The result is more load on the system than necessary: each pass through the Python loop makes 2-3 syscalls that do no useful work.
Requested Solution
I already have code that does this in a branch; I'm raising this issue first because of the policy stated on PRs. If you would like to consider it, I am happy to open a PR.
select returns the fds that are ready, and that result can drive which fds are actually read from. Additionally, when a subprocess pipe closes, its fd is marked as ready and the next read returns no data to signal EOF. This makes it possible to block indefinitely in select and read until EOF. Once both fds are exhausted, the only thing left to do is wait for the subprocess to exit, which removes the need for status polling.
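To make the idea concrete, here is a minimal sketch of a select-driven read loop. It is not the code from my branch or from borg_job.py: the function name `run_and_collect`, the 64 KiB read size, and the buffering of output into byte arrays are all assumptions for illustration. It also assumes a POSIX system, since `select` on pipe fds is not supported on Windows.

```python
import os
import select
import subprocess
import sys


def run_and_collect(cmd):
    """Hypothetical helper: run cmd and return (returncode, stdout, stderr)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    # One buffer per pipe; an fd stays in open_fds until it reports EOF.
    buffers = {out_fd: bytearray(), err_fd: bytearray()}
    open_fds = {out_fd, err_fd}

    while open_fds:
        # No timeout: block until at least one fd has data or has reached EOF.
        ready, _, _ = select.select(list(open_fds), [], [])
        for fd in ready:
            chunk = os.read(fd, 65536)
            if chunk:
                buffers[fd].extend(chunk)
            else:
                # An empty read means the write end was closed (EOF).
                open_fds.discard(fd)

    # Both pipes are exhausted, so a blocking wait replaces status polling.
    returncode = proc.wait()
    proc.stdout.close()
    proc.stderr.close()
    return returncode, bytes(buffers[out_fd]), bytes(buffers[err_fd])


if __name__ == "__main__":
    child = "import sys; print('hello'); print('oops', file=sys.stderr)"
    print(run_and_collect([sys.executable, "-c", child]))
```

With this shape, every read is preceded by select reporting the fd as ready, so no read raises for lack of data, and the loop only wakes up when there is something to do.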
Alternatives
Additional context
I noticed this behavior while I was working in borg_job.py to deal with 2410.