Skip to content

feat(bindings/java): Speed up the performance of OperatorInputStream#7088

Merged
Xuanwo merged 5 commits intomainfrom
xuanwo/java-get-perf
Dec 22, 2025
Merged

feat(bindings/java): Speed up the performance of OperatorInputStream#7088
Xuanwo merged 5 commits intomainfrom
xuanwo/java-get-perf

Conversation

@Xuanwo
Copy link
Copy Markdown
Member

@Xuanwo Xuanwo commented Dec 22, 2025

Which issue does this PR close?

Closes #.

Rationale for this change

This PR is used to speed up the performance of OperatorInputStream:

Before:

❯ java -jar target/benchmarks.jar org.apache.opendal.bench.jmh.OperatorReadJmhBenchmark

Benchmark                                       MiB/s
org.apache.opendal.bench.jmh.OperatorReadJmhBenchmark.createInputStream   1468.264
org.apache.opendal.bench.jmh.OperatorReadJmhBenchmark.readRange  44586.656

After:

❯ java -jar target/benchmarks.jar org.apache.opendal.bench.jmh.OperatorReadJmhBenchmark
Benchmark                                       MiB/s
org.apache.opendal.bench.jmh.OperatorReadJmhBenchmark.createInputStream  22430.267
org.apache.opendal.bench.jmh.OperatorReadJmhBenchmark.readRange  48399.451

Still slower than readRange but much faster than before.

What changes are included in this PR?

  • Add a basic benchmark framework for opendal java
  • Implement int read(byte[] b, int off, int len) for OperatorInputStream

Are there any user-facing changes?

AI Usage Statement

Parts of this PR were drafted with assistance from Codex (with gpt-5.2) and fully reviewed and edited by me. I take full responsibility for all changes.

@Xuanwo Xuanwo requested a review from tisonkun as a code owner December 22, 2025 14:55
@dosubot dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. releases-note/feat The PR implements a new feature or has a title that begins with "feat" labels Dec 22, 2025
ENV.with(|cell| {
let mut env = vm
.attach_current_thread_permanently()
.attach_current_thread_as_daemon()
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that if we don't change this, the program will hang indefinitely in JMH.

cc @tisonkun what do you think about this change?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may tag @rib here.

Later we can try to migrate to jni-rs 0.22 but there should be some similar issues.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A hang from not using _as_daemon implies you have some code that is using JavaVM::destroy() / DestroyJavaVM which is probably unsafe / unsound in any project with RAII types (including global references from the jni crate) that may result in jni usage after the vm is destroyed.

The most likely reason it appears to hang is because DestroyJavaVM will block and wait for (non-daemon) jni threads to exit.

Attaching as daemon threads will avoid that blocking but will almost certainly be completely unsound.

The hang itself indicates that the non-blocking case is unsound because it implies you have non-daemon threads that are associated with the jni crate which has RAII types that may trigger JNI usage after the VM has been destroyed.

The safe fix to avoid the blocking would be to have a mechanism to terminate any jni threads you have (e.g. destroying your tokio runtime + thread pool). Or don't ever try and destroy the VM (this is a very poorly defined operation).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the code itself there is no JavaVM::destroy() explicitly. But perhaps there are some implict calls.

Thanks for your information. I'll try to figure it out.

@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Dec 22, 2025
Comment thread bindings/java/src/jmh/java/org/apache/opendal/bench/jmh/Main.java Outdated
Comment on lines +73 to +85
while (bytes != null && offset >= bytes.length) {
bytes = readNextBytes(reader.nativeHandle);
offset = 0;
}

if (bytes == null) {
return -1;
}

final int n = Math.min(len, bytes.length - offset);
System.arraycopy(bytes, offset, b, off, n);
offset += n;
return n;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps read as much as len as possible?

Currently, it reads what remaining in the bytes buffer or read at most once.

Comment on lines +73 to +76
while (bytes != null && offset >= bytes.length) {
bytes = readNextBytes(reader.nativeHandle);
offset = 0;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird while here.

Perhaps loop over len > 0 and try to copy from bytes until len fulfilled or end.

@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Dec 22, 2025
Copy link
Copy Markdown
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

The read impl can be improved later.

The attach_current_thread_as_daemon should be resolved as it's not recommended. But we need some help from @rib and perhaps include it when testing out jni-rs 0.22.

@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label Dec 22, 2025
@Xuanwo Xuanwo merged commit d78b5e5 into main Dec 22, 2025
76 checks passed
@Xuanwo Xuanwo deleted the xuanwo/java-get-perf branch December 22, 2025 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm This PR has been approved by a maintainer releases-note/feat The PR implements a new feature or has a title that begins with "feat" size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants