feat: io_uring based file reader#5777
Conversation
PR Review: feat: io_uring based file readerSummaryThis PR adds a new P0 Issues (Must Fix)1. Short Read Handling Missing ( When io_uring completes a read, the if result < 0 {
state.err = Some(io::Error::from_raw_os_error(-result));
} else if request.length > 0 {
// Should check: result as usize != request.lengthThe same issue exists in 2. Panic in push_request(request.clone()).unwrap();This will panic if the submission queue is full ( P1 Issues (Should Fix)3. File Handle Cache May Serve Stale Data ( The file handle cache uses a 60-second TTL with 10K max capacity. If a file is modified or replaced after being cached, readers will continue using the old file descriptor. Consider:
4. Busy-Spin in Current-Thread Future ( cx.waker().wake_by_ref();
Poll::PendingWhen the request hasn't completed, this immediately wakes itself, causing a busy spin that will consume 100% CPU until completion. Consider yielding back to the executor properly. 5. Reader Trait Lifetime Change Breaking Change ( The Minor Observations
|
31c5186 to
61040ad
Compare
61040ad to
f8e8b73
Compare
2b939c4 to
a8f8ea2
Compare
wjones127
left a comment
There was a problem hiding this comment.
This looks great. I have a couple minor non-blocking questions, but feel free to merge as is.
| /// Global cache of open file handles. | ||
| /// Entries expire after 60 seconds to ensure files are eventually closed. | ||
| pub(super) static HANDLE_CACHE: LazyLock<moka::future::Cache<CacheKey, CachedReaderData>> = |
There was a problem hiding this comment.
praise: this is going to be very useful!
In another PR, I'm separating the data from file handles in the index cache (so that entries are serializable). Having a separate cache to make re-opening file handles might be useful.
15235c4 to
618f5d8
Compare
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
The completion handlers previously ignored the positive return value from cqe.result(), which is the number of bytes actually read. On a short read (e.g. signal interruption), the full pre-allocated buffer was returned with uninitialized tail bytes. Reading past EOF returned garbage data instead of an error. Now partial reads accumulate bytes and resubmit for the remainder, while EOF (result == 0) before the full read completes returns an UnexpectedEof error. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When push_to_sq failed (submission queue full), the request was silently dropped — no error was set, completed was never marked true, and no waker was called. The future would hang forever waiting for a completion that could never arrive. Add IoRequest::fail() to centralize the "complete with error" pattern, and call it from push_to_sq on both failure paths (SQ full and push error). Also fix the current_thread retry resubmission to fail the request and fix pending_count bookkeeping on retry failure. Add unit tests that create a tiny IoUring (queue_depth=2), fill the SQ, and verify the 3rd request's future returns an error within a 2-second timeout instead of hanging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The field was never read, only written. Its bookkeeping was also incorrect on the short-read retry path. Remove it entirely rather than fixing dead code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
submit_read previously called push_request().unwrap(), which panics if the submission queue is full. Return an error future instead, matching the pattern used by the thread-based UringReader. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace disallowed snafu::location!() macro with Error::not_found() in uring reader modules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use LazyCell for thread-local io_uring initialization (panics on failure instead of propagating errors, per reviewer suggestion) - Cache LANCE_URING_BLOCK_SIZE env var in a LazyLock static - Use str_is_truthy for LANCE_URING_CURRENT_THREAD parsing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Box deep async sub-futures in build() to break the type-level recursion chain that exceeded the default recursion_limit of 128. The io_uring wrappers added in this PR made I/O futures slightly larger, pushing the compiler's query depth from ~128 to 130 when computing the async state machine layout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
0046d6d to
55b745a
Compare
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This is still a draft while waiting on #5755 and #5773This PR adds a new URI scheme
file+uring. The scheme uses the same local file reader asfilebut has two customReaderimplementations that are based on the io_uring API. One of these creates a configurable number of process-wide ring threads and the reader communicates with this thread using a queue. The second assumes that the scheduler and decoder run on the same thread and uses a thread local uring instance.Both are able to saturate up to 1.5M IOPS/s when combined with the scheduler rework. I've tested the thread local variant up to 2M IOPS/s. These numbers are assuming the data is not in the kernel page cache. I've seen results as high as 4M IOPS/s when the data is cached.