HADOOP-19736: ABFS. Support for new auth type: User-bound SAS#8051
HADOOP-19736: ABFS. Support for new auth type: User-bound SAS#8051anujmodi2021 merged 19 commits intoapache:trunkfrom
Conversation
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
| AUG_03_2023("2023-08-03"), | ||
| NOV_04_2024("2024-11-04"); | ||
| NOV_04_2024("2024-11-04"), | ||
| JULY_05_2025("2025-07-05"); |
There was a problem hiding this comment.
We should follow the same format: JUL_05_2025, what do you think?
| abfsClientContext); | ||
| } | ||
|
|
||
| public AbfsClientHandler(final URL baseUrl, |
There was a problem hiding this comment.
Java doc missing for the constructor
| this.sasTokenProvider = sasTokenProvider; | ||
| } | ||
|
|
||
| public AbfsClient(final URL baseUrl, |
| encryptionContextProvider, abfsClientContext, AbfsServiceType.DFS); | ||
| } | ||
|
|
||
| public AbfsDfsClient(final URL baseUrl, |
| case UserboundSASWithOAuth: | ||
| httpOperation.setRequestProperty(HttpHeaderConfigurations.AUTHORIZATION, | ||
| client.getAccessToken()); | ||
| httpOperation.setMaskForSAS(); //mask sig/oid from url for logs |
There was a problem hiding this comment.
sig stands for signature
| tokenProvider, sasTokenProvider, encryptionContextProvider, | ||
| populateAbfsClientContext()); | ||
| } | ||
| else if (tokenProvider != null) { |
|
|
||
| LOG.trace("Initializing AbfsClient for {}", baseUrl); | ||
| if (tokenProvider != null) { | ||
| if(tokenProvider != null && sasTokenProvider != null){ |
There was a problem hiding this comment.
Space between if and (
| 3. Deployed in-Azure with the Azure VMs providing OAuth 2.0 tokens to the application, "Managed Instance". | ||
| 4. Using Shared Access Signature (SAS) tokens provided by a custom implementation of the SASTokenProvider interface. | ||
| 5. By directly configuring a fixed Shared Access Signature (SAS) token in the account configuration settings files. | ||
| 6. Using user-bound SAS auth type, which is requires OAuth 2.0 setup (point 2 above) and SAS setup (point 4 above) |
There was a problem hiding this comment.
Grammatical mistake: which requires or which is required?
|
|
||
| public static final String FS_AZURE_TEST_APP_SERVICE_PRINCIPAL_OBJECT_ID = "fs.azure.test.app.service.principal.object.id"; | ||
|
|
||
| public static final String FS_AZURE_END_USER_TENANT_ID = "fs.azure.test.end.user.tenant.id"; |
There was a problem hiding this comment.
Rename the variable to FS_AZURE_TEST_END_USER_TENANT_ID
There was a problem hiding this comment.
I think TEST should come first. Check for how other test related configs are defined. TestConfigurationKeys.java have them
| public static final String FS_AZURE_TEST_APP_SERVICE_PRINCIPAL_OBJECT_ID = "fs.azure.test.app.service.principal.object.id"; | ||
|
|
||
| public static final String FS_AZURE_END_USER_TENANT_ID = "fs.azure.test.end.user.tenant.id"; | ||
| public static final String FS_AZURE_END_USER_OBJECT_ID = "fs.azure.test.end.user.object.id"; |
| Dec19("2019-12-12"), | ||
| Feb20("2020-02-10"); | ||
| Feb20("2020-02-10"), | ||
| July5("2025-07-05"); |
There was a problem hiding this comment.
Same here should be JUL
| private final String sduoid; | ||
|
|
||
| public DelegationSASGenerator(byte[] userDelegationKey, String skoid, String sktid, String skt, String ske, String skv) { | ||
| public DelegationSASGenerator(byte[] userDelegationKey, String skoid, String sktid, String skt, String ske, String skv, String skdutid, String sduoid) { |
There was a problem hiding this comment.
add javadoc for what do all these params signify
| qb.addQuery("skv", skv); | ||
|
|
||
| //skdutid and sduoid are required for user bound SAS only | ||
| if(!Objects.equals(skdutid, EMPTY_STRING)){ |
|
|
||
| String stringToSign = sb.toString(); | ||
| LOG.debug("Delegation SAS stringToSign: " + stringToSign.replace("\n", ".")); | ||
| System.out.println("Delegation SAS stringToSign: " + stringToSign.replace("\n", ".")); |
|
|
||
| // Invokes the AAD v2.0 authentication endpoint with a client credentials grant to get an | ||
| // access token. See https://docs.microsoft.com/en-us/azure/active-directory/develop/v2-oauth2-client-creds-grant-flow. | ||
| private String getAuthorizationHeader(String accountName, String appID, String appSecret, String sktid) throws IOException { |
There was a problem hiding this comment.
include params in javadoc as well
| // Invokes the AAD v2.0 authentication endpoint with a client credentials grant to get an | ||
| // access token. See https://docs.microsoft.com/en-us/azure/active-directory/develop/v2-oauth2-client-creds-grant-flow. | ||
| private String getAuthorizationHeader(String accountName, String appID, String appSecret, String sktid) throws IOException { | ||
| String authEndPoint = String.format("https://login.microsoftonline.com/%s/oauth2/v2.0/token", sktid); |
There was a problem hiding this comment.
Add a constant for the string
| return "Bearer " + provider.getToken().getAccessToken(); | ||
| } | ||
|
|
||
| private byte[] getUserDelegationKey(String accountName, String appID, String appSecret, |
| private byte[] getUserDelegationKey(String accountName, String appID, String appSecret, | ||
| String sktid, String skt, String ske, String skv, String skdutid) throws IOException { | ||
|
|
||
| String method = "POST"; |
There was a problem hiding this comment.
we have constants for HTTP methods, can be used here
| final StringBuilder sb = new StringBuilder(128); | ||
| sb.append("https://"); | ||
| sb.append(account); | ||
| sb.append(".blob.core.windows.net/?restype=service&comp=userdelegationkey"); |
There was a problem hiding this comment.
Try to use constants as much as possible
| requestBody.append(skdutid); | ||
| requestBody.append("</DelegatedUserTid></KeyInfo>"); | ||
|
|
||
| // requestBody.append("<?xml version=\"1.0\" encoding=\"utf-8\"?><KeyInfo><Start>"); |
|
|
||
| public static ApiVersion getCurrentVersion() { | ||
| return NOV_04_2024; | ||
| return JULY_05_2025; |
There was a problem hiding this comment.
Are we bumping up version for all the requests?
Wasn't this version bump only needed for Auth related APIs?
There was a problem hiding this comment.
Right- we're already overriding the API version header according to skv param for UBS requests. We dont need it here- removed
| final AbfsClientContext abfsClientContext) throws IOException { | ||
| URL dfsUrl = changeUrlFromBlobToDfs(baseUrl); | ||
| if (tokenProvider != null) { | ||
| if (tokenProvider != null && sasTokenProvider != null) { |
There was a problem hiding this comment.
Apart from UDS, can there be a case where caller can send both as not null? Makes ure no flow leads to this and also add a test around it.
There was a problem hiding this comment.
We are initialising both the token providers only for UBS auth type. Added a test for token provider expectations (null or non-null) according to the auth type
| * @return sasTokenProvider object based on configurations provided | ||
| * @throws AzureBlobFileSystemException | ||
| */ | ||
| public SASTokenProvider getSASTokenProviderForUserBoundSAS() throws AzureBlobFileSystemException { |
There was a problem hiding this comment.
nit: Can be renamed as getUserBoundSASTokenProvider
| LOG.trace("Fetching SAS Token Provider"); | ||
| sasTokenProvider = abfsConfiguration.getSASTokenProvider(); | ||
| } else { | ||
| } else if(authType == AuthType.UserboundSASWithOAuth){ |
There was a problem hiding this comment.
Here also, it can be combined into a single call with both passed.
There was a problem hiding this comment.
Added a method to get both together
|
|
||
| public static final String FS_AZURE_TEST_APP_SERVICE_PRINCIPAL_OBJECT_ID = "fs.azure.test.app.service.principal.object.id"; | ||
|
|
||
| public static final String FS_AZURE_END_USER_TENANT_ID = "fs.azure.test.end.user.tenant.id"; |
There was a problem hiding this comment.
I think TEST should come first. Check for how other test related configs are defined. TestConfigurationKeys.java have them
| final AbfsClientContext abfsClientContext) throws IOException { | ||
| URL dfsUrl = changeUrlFromBlobToDfs(baseUrl); | ||
| if (tokenProvider != null) { | ||
| if (tokenProvider != null && sasTokenProvider != null) { |
There was a problem hiding this comment.
All these conditions can again be removed by creating a single constructor for AbfsDfsClient and AbfsBlobClient that sends both access token provider and sas token provider (like we did for client handler and parent constructors)
But would it be better to have it this way to separate out the logging here?
There was a problem hiding this comment.
Logging as well can be combined. in log we will see null value for the one which is not supposed to be used and we will know.
Better to combine everywhere IMO.
|
💔 -1 overall
This message was automatically generated. |
| final AbfsClientContext abfsClientContext) throws IOException { | ||
| URL dfsUrl = changeUrlFromBlobToDfs(baseUrl); | ||
| if (tokenProvider != null) { | ||
| if (tokenProvider != null && sasTokenProvider != null) { |
There was a problem hiding this comment.
Logging as well can be combined. in log we will see null value for the one which is not supposed to be used and we will know.
Better to combine everywhere IMO.
85f1a30 to
ec2c76c
Compare
|
💔 -1 overall
This message was automatically generated. |
700cce3 to
0bbbd36
Compare
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
| throw e; | ||
| } catch (Exception e) { | ||
| throw new SASTokenProviderException( | ||
| "Unable to load user-bound SAS token provider class: " + e, e); |
There was a problem hiding this comment.
e is included in the message string and also passed as the cause in the exception constructor
| * Tests listing and deleting files under an implicit directory | ||
| */ | ||
| @Test | ||
| public void testOperationsForImplicitPaths() throws Exception { |
There was a problem hiding this comment.
In the setup we don't assume non hns for all cases, shouldn't we add that assumption here ?
There was a problem hiding this comment.
In setup, if non-hns, we're assuming it to be Blob endpoint. It won't run for FNS-DFS
In this test, we're only running for Blob endpoint as well
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
| // we do not support creating the filesystem when AuthType is SAS or UserboundSASWithOAuth | ||
| return this.createRemoteFileSystemDuringInitialization | ||
| && this.getAuthType(this.accountName) != AuthType.SAS; | ||
| && this.getAuthType(this.accountName) != AuthType.SAS |
There was a problem hiding this comment.
We have used this statement at multiple places to check if authtype is SAS or UserBoundSAS, wouldn't it be better if we can have a method which returns true if AuthType is SAS or UserBoundSAS else false.
There was a problem hiding this comment.
taken. added a separate method for it
| public AccessTokenProvider getTokenProvider() throws TokenAccessProviderException { | ||
| AuthType authType = getEnum(FS_AZURE_ACCOUNT_AUTH_TYPE_PROPERTY_NAME, AuthType.SharedKey); | ||
| if (authType == AuthType.OAuth) { | ||
| if (authType == AuthType.OAuth || authType == AuthType.UserboundSASWithOAuth) { |
There was a problem hiding this comment.
this is with oauth check
| ``` | ||
|
|
||
| ### <a name="oauth-user-and-passwd"></a> OAuth 2.0: Username and Password | ||
| #### <a name="oauth-user-and-passwd"></a> Username and Password |
There was a problem hiding this comment.
What is the need for this change?
There was a problem hiding this comment.
For better readability for users. We have added OAuth as a separate section and all supported provider types as subsections. This would ensure for user-bound SAS there's no confusion for what all oauth token provider options are and what auth type they need to select for user-delegation SAS, Oauth, user bound SAS
| ``` | ||
|
|
||
| ### <a name="managed-identity"></a> Azure Managed Identity | ||
| #### <a name="managed-identity"></a> Azure Managed Identity |
There was a problem hiding this comment.
This change is not needed.
There was a problem hiding this comment.
for better readability
| ``` | ||
|
|
||
| ### <a name="workload-identity"></a> Azure Workload Identity | ||
| #### <a name="workload-identity"></a> Azure Workload Identity |
There was a problem hiding this comment.
for better readability
anujmodi2021
left a comment
There was a problem hiding this comment.
Thanks for taking all the comments.
LGTM
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
Test Results============================================================
|
|
Thank you everyone who reviewed. Other Yetus checks looks good. Going ahead with the merge. |
Description of PR
JIRA: https://issues.apache.org/jira/browse/HADOOP-19736
Adding support for new authentication type: user bound SAS
User-bound SAS (Shared Access Signature) binds a SAS token to a specific user identity rather than just granting access based on possession of the token. This approach addresses key vulnerabilities in previous SAS mechanisms.
The SAS token for it includes identity-binding parameters (e.g., skdutid, sduoid) that correspond to the user’s Entra tenant and object ID.
When accessing storage, the user must present a valid Entra access token matching these parameters.
How was this patch tested?
Test suite was run. Results added in the comments below