HADOOP-19795: ABFS. GetPathStatus Optimization on OpenFileForRead #8212
manika137 wants to merge 13 commits into apache:trunk from
Conversation
Test Results

💔 -1 overall

This message was automatically generated.
    getClient().getEncryptionType() != EncryptionType.ENCRYPTION_CONTEXT
        || ((VersionedFileStatus) fileStatus).getEncryptionContext() != null)) {

Review comment: additional space changes can be reverted.
    contextEncryptionAdapter = new ContextProviderEncryptionAdapter(
        getClient().getEncryptionContextProvider(), getRelativePath(path),
        encryptionContext.getBytes(StandardCharsets.UTF_8));
      }
    } else {
      if (parseIsDirectory(resourceType)) {

Review comment: can be moved to the common part, as it is checked in both cases.
     * - restrictGpsOnOpenFile config is enabled with null FileStatus and encryptionType not as ENCRYPTION_CONTEXT
     *   In this case, we don't need to call GetPathStatus API.
     */
    } else {

Review comment: won't this lead to going ahead and opening the stream without checks? Do we fail later for this case?
    INVALID_APPEND_OPERATION("InvalidAppendOperation", HttpURLConnection.HTTP_CONFLICT, null),
    UNAUTHORIZED_BLOB_OVERWRITE("UnauthorizedBlobOverwrite", HttpURLConnection.HTTP_FORBIDDEN,
        "This request is not authorized to perform blob overwrites."),
    INVALID_RANGE("InvalidRange", 416,

Review comment: 416 should come from a constant defined in the HttpURLConnection class.
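One wrinkle with this suggestion: `java.net.HttpURLConnection` stops at `HTTP_UNSUPPORTED_TYPE` (415) and defines no field for 416, so a project-level constant may be needed instead. A minimal sketch (the constant name here is illustrative, not an actual ABFS identifier):

```java
import java.net.HttpURLConnection;

public class HttpRangeConstants {
    // java.net.HttpURLConnection has no field for 416 (Range Not Satisfiable),
    // so a local named constant keeps the magic number in one place.
    public static final int HTTP_RANGE_NOT_SATISFIABLE = 416;

    public static void main(String[] args) {
        // Adjacent codes that DO exist as HttpURLConnection fields, for contrast.
        assert HttpURLConnection.HTTP_UNSUPPORTED_TYPE == 415;
        assert HTTP_RANGE_NOT_SATISFIABLE == 416;
        System.out.println("ok");
    }
}
```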
    // Reset Read Type back to normal and set again based on code flow.
    getTracingContext().setReadType(ReadType.NORMAL_READ);
    if (shouldAlwaysReadBufferSize()) {
    if(shouldRestrictGpsOnOpenFile() && isFirstRead()) {

Review comment: nit: space after if.
    // Reset Read Type back to normal and set again based on code flow.
    getTracingContext().setReadType(ReadType.NORMAL_READ);
    if (shouldAlwaysReadBufferSize()) {
    if(shouldRestrictGpsOnOpenFile() && isFirstRead()) {

Review comment: add a comment for this condition as well.
    private final int footerReadSize; // default buffer size to read when reading footer
    private final int readAheadQueueDepth; // initialized in constructor
    private final String eTag; // eTag of the path when InputStream are created
    private String eTag; // eTag of the path when InputStream are created

Review comment: nit: "InputStream is created".
    String getRelativePath(final Path path) {

        tracingContext,
        contextEncryptionAdapter).getResult();
    String resourceType =

Review comment: use the client.checkIsDir method, which handles the null case as well.
    }
    contentLength = Long.parseLong(op.getResult().getResponseHeader(HttpHeaderConfigurations.CONTENT_RANGE).
        split(AbfsHttpConstants.FORWARD_SLASH)[1]);
    eTag = op.getResult().getResponseHeader("ETag");

Review comment: use the extractEtagHeader method.
    if (Objects.equals(resourceType, DIRECTORY)) {
      throw directoryReadException();
    }
    contentLength = Long.parseLong(op.getResult().getResponseHeader(HttpHeaderConfigurations.CONTENT_RANGE).
        split(AbfsHttpConstants.FORWARD_SLASH)[1]);

Review comment: same here, use the extractContentLen method.
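For context on what the suggested helper would replace: the total object length sits after the `/` in a `Content-Range` header such as `bytes 0-8/4096`. A hedged, standalone sketch of that parsing (the helper name is illustrative, not the actual `extractContentLen` implementation, which also needs the null/malformed cases the inline parsing above misses):

```java
public class ContentRangeParser {
    /**
     * Parse the total object length from a Content-Range header like
     * "bytes 0-8/4096". Returns -1 when the header is absent or malformed,
     * instead of throwing NullPointerException/NumberFormatException the way
     * a bare split-and-parse would.
     */
    static long totalLengthFrom(String contentRange) {
        if (contentRange == null) {
            return -1L;
        }
        int slash = contentRange.lastIndexOf('/');
        if (slash < 0 || slash == contentRange.length() - 1) {
            return -1L;
        }
        try {
            return Long.parseLong(contentRange.substring(slash + 1).trim());
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    public static void main(String[] args) {
        System.out.println(totalLengthFrom("bytes 0-0/4096")); // 4096
        System.out.println(totalLengthFrom(null));             // -1
    }
}
```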
    if (ere.getStatusCode() == HttpURLConnection.HTTP_NOT_FOUND) {
      throw new FileNotFoundException(ere.getMessage());
    int status = ere.getStatusCode();
    if(ere.getErrorMessage().contains(readOnDirectoryErrorMsg)){

Review comment: nit: space after if.
    } catch (AzureBlobFileSystemException gpsEx) {
      AbfsRestOperationException gpsEre = (AbfsRestOperationException) gpsEx;
      if(gpsEre.getErrorMessage().contains(readOnDirectoryErrorMsg)){

Review comment: nit: space after if.
    }

    // Default: propagate original error
    throw new IOException(ex);

Review comment: should be done once only, on line 715.
    This is required since contentLength is not available yet to determine prefetch block size.
    */
    bytesRead = readInternal(getFCursor(), getBuffer(), 0, getBufferSize(), false);
    if(shouldRestrictGpsOnOpenFile() && isFirstRead()) {

Review comment: nit: space after if.
|
Description of PR
JIRA: https://issues.apache.org/jira/browse/HADOOP-19795
We currently make a getPathStatus call when a file is opened for read. This call is primarily used to fetch the file's metadata properties before the actual read begins.
This PR introduces an optional, config-driven read flow that skips the getPathStatus call during open and instead derives the required metadata from the read response itself.
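The core idea above can be sketched in a few lines: a ranged GET already returns `Content-Range` (which carries the total file length after the `/`) and `ETag`, so the first read can supply the metadata that GetPathStatus would otherwise have fetched. This is a standalone illustration of that mechanism; the class and method names here are hypothetical, not the actual ABFS types in the patch:

```java
import java.util.Map;

/**
 * Hedged sketch of the optimization: on the first read, skip GetPathStatus
 * and derive contentLength and eTag from the read response's own headers.
 */
public class OpenWithoutGetPathStatus {

    /** Minimal stand-in for the metadata GetPathStatus would have returned. */
    record FileMetadata(long contentLength, String eTag) {}

    static FileMetadata fromReadResponseHeaders(Map<String, String> headers) {
        // A ranged GET responds with "Content-Range: bytes <start>-<end>/<total>";
        // the value after '/' is the full file length.
        String contentRange = headers.get("Content-Range");
        long length = Long.parseLong(
            contentRange.substring(contentRange.lastIndexOf('/') + 1));
        return new FileMetadata(length, headers.get("ETag"));
    }

    public static void main(String[] args) {
        FileMetadata m = fromReadResponseHeaders(Map.of(
            "Content-Range", "bytes 0-4193/1048576",
            "ETag", "0x8D9ABC"));
        System.out.println(m.contentLength() + " " + m.eTag()); // 1048576 0x8D9ABC
    }
}
```

The saving is one round trip per open: the metadata arrives piggybacked on the data the client was going to request anyway.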
How was this patch tested?
New tests were added and the test suite was run; results are added in the comments below.