Apache Airflow version
3.0.2
If "Other Airflow 2 version" selected, which one?
No response
What happened?
When creating an Asset with an ObjectStoragePath as the uri parameter, if the path uses a connection, the _sanitize_uri method that validates the URI flags it with:
"An Asset URI should not contain auth info (e.g. username or password). It has been automatically dropped."
This seems to be caused by the conn_id: ObjectStoragePath places it in the userinfo position of the URI (s3://test@example/), and _sanitize_uri flags any Asset uri that has a userinfo element.
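To illustrate why a conn_id trips the check, here is a minimal sketch using only the Python standard library. It is not Airflow's actual _sanitize_uri; the helper name has_userinfo is hypothetical. It only shows that a generic URI parser treats the conn_id in s3://test@example/ as userinfo.

```python
from urllib.parse import urlsplit


def has_userinfo(uri: str) -> bool:
    """Hypothetical helper: True when the authority part contains userinfo."""
    # Everything before "@" in the network location is userinfo (RFC 3986).
    return "@" in urlsplit(uri).netloc


with_conn = has_userinfo("s3://test@example/")   # conn_id "test" sits in userinfo
without_conn = has_userinfo("s3://example/")     # no userinfo at all
parsed_user = urlsplit("s3://test@example/").username
```

Under this reading, a connection id is indistinguishable from a username, which would explain why the auth-info warning fires.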
What you think should happen instead?
At minimum the message should be clear that it has only removed the userinfo from the Asset uri and it has still been created. As it stands the warning is ambiguous as to what has been "dropped" with the Asset itself being the most likely.
Mangling the user's input, in some cases silently, is just a bad idea. If it's accepted with a warning the implication is it was stored as-is. If userinfo just won't be handled it should be rejected as an error.
Assets should really just accept any valid URI, and definitely those generated by other parts of Airflow. As it stands I can't store a URI with an Asset and recover it, because s3://conn_1@bucket/data and s3://conn_2@bucket/data aren't necessarily the same.
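The collision can be sketched with the standard library alone. This is an assumption about the effect of dropping userinfo, not Airflow's real code; strip_userinfo is a hypothetical stand-in for whatever the sanitizer does.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_userinfo(uri: str) -> str:
    """Hypothetical stand-in: drop the userinfo part of a URI, keep the rest."""
    parts = urlsplit(uri)
    # Keep only what follows the last "@" in the network location.
    host = parts.netloc.rpartition("@")[2]
    return urlunsplit(parts._replace(netloc=host))


a = strip_userinfo("s3://conn_1@bucket/data")
b = strip_userinfo("s3://conn_2@bucket/data")
collided = a == b  # two different connections now map to the same stored URI
```

Once the userinfo is gone there is no way to tell from the stored value which connection the original path used.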
How to reproduce
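# Import path assumed for Airflow 3.x
from airflow.sdk import Asset, ObjectStoragePath

def test_objectstoragepath_asset():
    path = ObjectStoragePath("s3://example/", conn_id="test")
    asset = Asset(uri=path)
    # Fails: the Asset stores the uri without the "test@" userinfo
    assert asset.uri == path.as_uri()

's3://example/' != 's3://test@example/'
Expected :'s3://test@example/'
Actual :'s3://example/'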
Operating System
N/A
Versions of Apache Airflow Providers
No response
Deployment
Docker-Compose
Deployment details
On an OpenStack internal cloud using third-party S3 store.
Anything else?
No response
Are you willing to submit PR?
Code of Conduct