
Python3.12 #36755

Merged
potiuk merged 1 commit into main from python3.12
Mar 11, 2024

@potiuk
Member

@potiuk potiuk commented Jan 12, 2024



@potiuk
Member Author

potiuk commented Jan 12, 2024

cc: @dirrao @Taragolis -> seems like apache-beam having numpy as dependency is the next problem to solve after pendulum is solved

 Downloading numpy-1.24.4.tar.gz (10.9 MB)
  #56 70.61            ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.9/10.9 MB 36.6 MB/s eta 0:00:00
  #56 70.61         Installing build dependencies: started
  #56 70.61         Installing build dependencies: finished with status 'done'
  #56 70.61         Getting requirements to build wheel: started
  #56 70.61         Getting requirements to build wheel: finished with status 'error'
  #56 70.61         error: subprocess-exited-with-error
  #56 70.61       
  #56 70.61         × Getting requirements to build wheel did not run successfully.
  #56 70.61         │ exit code: 1
  #56 70.61         ╰─> [33 lines of output]
  #56 70.61             Traceback (most recent call last):
  #56 70.61               File "/usr/local/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
  #56 70.61                 main()
  #56 70.61               File "/usr/local/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
  #56 70.61                 json_out['return_val'] = hook(**hook_input['kwargs'])
  #56 70.61                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  #56 70.61               File "/usr/local/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 112, in get_requires_for_build_wheel
  #56 70.61                 backend = _build_backend()
  #56 70.61                           ^^^^^^^^^^^^^^^^
  #56 70.61               File "/usr/local/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 77, in _build_backend
  #56 70.61                 obj = import_module(mod_path)
  #56 70.61                       ^^^^^^^^^^^^^^^^^^^^^^^

@potiuk
Member Author

potiuk commented Jan 12, 2024

Looks like we need NumPy 1.26+ - from that long discussion here: numpy/numpy#23808 and
Apache Beam has <1.25 even in main: https://github.com/apache/beam/blob/master/sdks/python/setup.py#L304

So likely the next best thing to do is to exclude apache-beam provider for python 3.12

This is kinda expected, Beam is always dragging us behind

@potiuk
Member Author

potiuk commented Jan 12, 2024

Pushed a fixup marking it for exclusion - let's see.

@Taragolis
Contributor

Seems like 1.26.0 is the first NumPy release that officially supports 3.12.

@Taragolis
Contributor

Oh... I posted my comment without refreshing the page, and you had already found the same thing

@Taragolis
Contributor

Taragolis commented Jan 12, 2024

Seems like the Google provider is also not compatible with 3.12 yet

46.9 ERROR: Could not find a version that satisfies the requirement google-ads>=22.1.0; extra == "google" (from apache-airflow[google]) (from versions: 0.1.0, 0.2.0, 0.3.0, 0.4.0, 0.5.0, 0.5.1, 0.5.2, 0.6.0, 0.7.0, 1.0.0, 1.0.1, 1.1.0, 1.1.1, 1.2.0, 1.3.0, 1.3.1, 2.0.0, 2.1.0, 2.2.0, 2.3.0, 2.4.0, 2.4.1, 3.0.0, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.3.0, 4.0.0, 4.1.0, 4.1.1, 5.0.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.1.0, 6.0.0, 7.0.0, 8.0.0, 8.1.0, 8.2.0, 9.0.0, 10.0.0, 11.0.0, 11.0.1, 11.0.2, 12.0.0, 13.0.0, 14.0.0, 14.0.1, 14.1.0, 15.0.0, 15.1.0, 15.1.1, 16.0.0, 17.0.0, 18.0.0, 18.1.0, 18.2.0, 19.0.0, 20.0.0, 21.0.0, 21.1.0, 21.2.0, 21.3.0, 22.0.0)
  246.9 ERROR: No matching distribution found for google-ads>=22.1.0; extra == "google"

Latest google-ads package pinned to >=3.7, <3.12

An issue to add Python 3.12 support already exists: googleads/google-ads-python#813
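
To illustrate why pip reports "No matching distribution found" here, below is a minimal sketch of a Requires-Python check. It only handles simple comparison operators and plain numeric versions, which is a simplification made for brevity; real installers follow the full PEP 440 rules. The helper names `parse_version` and `python_satisfies` are hypothetical and not part of pip or Airflow.

```python
import operator

# Two-character operators are listed first so that ">=" is tried before ">".
_OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '3.12' into (3, 12). Only plain numeric versions are supported."""
    return tuple(int(part) for part in text.strip().split("."))


def python_satisfies(python_version, requires_python):
    """Check a version against a comma-separated list of simple clauses."""
    current = parse_version(python_version)
    for clause in requires_python.split(","):
        clause = clause.strip()
        for symbol, compare in _OPERATORS.items():
            if clause.startswith(symbol):
                if not compare(current, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            # No known operator matched (for example "~=" is not handled).
            raise ValueError(f"Unsupported clause: {clause!r}")
    return True


# The pin reported in this thread for the latest google-ads release.
GOOGLE_ADS_REQUIRES_PYTHON = ">=3.7, <3.12"

print(python_satisfies("3.11", GOOGLE_ADS_REQUIRES_PYTHON))  # True
print(python_satisfies("3.12", GOOGLE_ADS_REQUIRES_PYTHON))  # False
```

With the `<3.12` upper bound, every release carrying that pin is ignored on a 3.12 interpreter, so only the older versions remain as candidates and none of them satisfies `google-ads>=22.1.0`.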

@potiuk
Member Author

potiuk commented Jan 12, 2024

All right... let me exclude the Google provider too then. At this stage I have a feeling that excluding a few providers, even huge and important ones, and having an open issue to bring 3.12 support back in would be a good thing.

And I know for a fact that google team wants to split the google provider and splitting of ads was the first thing to try anyway, so that might accelerate things a bit.

@potiuk
Member Author

potiuk commented Jan 12, 2024

Pushed.

@potiuk
Member Author

potiuk commented Jan 12, 2024

BTW. I really like how nicely and transparently the new pyproject.toml exclusion works now: pre-commit simply updates pyproject.toml and it's immediately visible what is excluded :D
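
As a rough illustration of the idea of per-Python-version provider exclusion, here is a minimal sketch. The mapping and the function name `installable_providers` are hypothetical; they only mirror the two exclusions discussed so far in this thread (Beam and Google) and do not reproduce how Airflow actually stores or generates the exclusions in pyproject.toml.

```python
# Hypothetical mapping: provider id -> Python versions it is excluded from.
EXCLUDED_PYTHON_VERSIONS = {
    "apache.beam": ["3.12"],  # apache-beam pins numpy below the first 3.12-ready release
    "google": ["3.12"],       # google-ads declares Requires-Python <3.12
}


def installable_providers(all_providers, python_version):
    """Return the providers that are not excluded for the given Python version."""
    return [
        provider
        for provider in all_providers
        if python_version not in EXCLUDED_PYTHON_VERSIONS.get(provider, [])
    ]


providers = ["amazon", "apache.beam", "google"]
print(installable_providers(providers, "3.11"))  # ['amazon', 'apache.beam', 'google']
print(installable_providers(providers, "3.12"))  # ['amazon']
```

Keeping the exclusions as data next to each provider makes it easy to drop an entry once the upstream dependency gains 3.12 support.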

@Taragolis
Contributor

Taragolis commented Jan 12, 2024

I think one day we need to finally resolve "Consider splitting Google Provider", because the Google provider is really huge (28k+ lines tracked by our tests) and contains quite a few different components:

  • Cloud (GCP)
  • Google ADS
  • Google Suite
  • LevelDB
  • Google Firebase (is it part of GCP?)

So if we find a way to do it, it might prevent the situation where one of these components becomes a showstopper for the others.

image

@potiuk
Member Author

potiuk commented Jan 12, 2024

> I think one day we need to finally resolve "Consider splitting Google Provider", because the Google provider is really huge (28k+ lines tracked by our tests) and contains quite a few different components:

This is precisely the plan I am discussing with Google team :)

@Taragolis
Contributor

So we have a duckdb dependency issue now. It already has Python 3.12 support, which was added by duckdb/duckdb#10144 (Linux and macOS), but it is not released yet.

@Taragolis
Contributor

Seems the 0.9.3.dev2258 pre-release is the first one that supports 3.12

pip install duckdb==0.9.3.dev2258
Collecting duckdb==0.9.3.dev2258
  Obtaining dependency information for duckdb==0.9.3.dev2258 from https://files.pythonhosted.org/packages/3c/43/094637a1939e8ba6ae53a788bd46adfd0b71fe0a6e182c8e6179b2966e09/duckdb-0.9.3.dev2258-cp312-cp312-macosx_11_0_arm64.whl.metadata
  Downloading duckdb-0.9.3.dev2258-cp312-cp312-macosx_11_0_arm64.whl.metadata (768 bytes)
Downloading duckdb-0.9.3.dev2258-cp312-cp312-macosx_11_0_arm64.whl (13.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.7/13.7 MB 3.6 MB/s eta 0:00:00
Installing collected packages: duckdb
Successfully installed duckdb-0.9.3.dev2258

@potiuk
Member Author

potiuk commented Jan 12, 2024

Pushed a change for it :)

@Taragolis
Contributor

One more step and a new error. This time it is related to LevelDB, which is also part of the Google provider, but I guess it has a separate extra.

I think this is the same issue: wbolster/plyvel#158

@potiuk
Member Author

potiuk commented Jan 13, 2024

Running

@dirrao
Contributor

dirrao commented Jan 14, 2024

Build Prod Images is still picking up the google ads provider.

#56 25.60 ERROR: Ignored the following versions that require a different python version: 2.7.3 Requires-Python <3.12,~=3.8; 2.7.3rc1 Requires-Python <3.12,~=3.8; 2.8.0 Requires-Python <3.12,~=3.8; 2.8.0b1 Requires-Python <3.12,~=3.8; 2.8.0rc1 Requires-Python <3.12,~=3.8; 2.8.0rc2 Requires-Python <3.12,~=3.8; 2.8.0rc3 Requires-Python <3.12,~=3.8; 2.8.0rc4 Requires-Python <3.12,~=3.8; 22.1.0 Requires-Python >=3.7, <3.12
  #56 25.60 ERROR: Could not find a version that satisfies the requirement google-ads>=22.1.0 (from apache-airflow-providers-google) (from versions: 0.1.0, 0.2.0, 0.3.0, 0.4.0, 0.5.0, 0.5.1, 0.5.2, 0.6.0, 0.7.0, 1.0.0, 1.0.1, 1.1.0, 1.1.1, 1.2.0, 1.3.0, 1.3.1, 2.0.0, 2.1.0, 2.2.0, 2.3.0, 2.4.0, 2.4.1, 3.0.0, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.3.0, 4.0.0, 4.1.0, 4.1.1, 5.0.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.1.0, 6.0.0, 7.0.0, 8.0.0, 8.1.0, 8.2.0, 9.0.0, 10.0.0, 11.0.0, 11.0.1, 11.0.2, 12.0.0, 13.0.0, 14.0.0, 14.0.1, 14.1.0, 15.0.0, 15.1.0, 15.1.1, 16.0.0, 17.0.0, 18.0.0, 18.1.0, 18.2.0, 19.0.0, 20.0.0, 21.0.0, 21.1.0, 21.2.0, 21.3.0, 22.0.0)
  #56 25.60 ERROR: No matching distribution found for google-ads>=22.1.0

@potiuk
Member Author

potiuk commented Jan 14, 2024

Yes. Because the google provider is installed, and expected to be installed, when the PROD image is built. So if we do not build it locally for Python 3.12, it will install the one from PyPI. Luckily it looks like the google-ads maintainers are going to release a 3.12-compatible version this week googleads/google-ads-python#813 (comment) so we should, I think, wait for it. Releasing a 3.12 version without the google provider, when we know we will likely be able to install it in two days, is likely just not worth the effort (we would have to add code to exclude certain providers from PROD image installation).

In the meantime, we could take a close look at the failing tests for Python 3.12: https://github.com/apache/airflow/actions/runs/7509965786/job/20448000561?pr=36755

I think they mostly fail because the google and beam providers are missing, but if there are any other failing tests we should look at them. I have not looked in detail yet, but there are at least a few with "real" 3.12 incompatibilities (in test code at least) that could be fixed in the meantime:

           AttributeError: 'called_once' is not a valid assertion. Use a spec for the mock if 'called_once' is meant to be an attribute.
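
For context on this error, here is a small sketch of the pattern that typically triggers it: a test accesses `called_once` on a mock where `assert_called_once()` was meant. The function `notify` and the helper `attribute_is_rejected` are hypothetical stand-ins, not code from the Airflow test suite.

```python
import sys
from unittest import mock


def notify(callback):
    """Code under test: calls the callback exactly once."""
    callback("done")


callback = mock.Mock()
notify(callback)

# Intended checks: these are real assertion methods on Mock.
callback.assert_called_once()
callback.assert_called_once_with("done")


def attribute_is_rejected(mock_obj, name):
    """Return True if accessing `name` on the mock raises AttributeError."""
    try:
        getattr(mock_obj, name)
    except AttributeError:
        return True
    return False


# 'called_once' is not an assertion. Before Python 3.12 a plain Mock creates
# a child mock for it, so the broken "check" passes silently. From 3.12 on,
# the access raises AttributeError with the message quoted above.
rejected = attribute_is_rejected(mock.Mock(), "called_once")
print(sys.version_info[:2], "rejects 'called_once':", rejected)
```

The fix in test code is usually mechanical: replace the bare attribute access with the matching `assert_*` method.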

@potiuk potiuk force-pushed the python3.12 branch 2 times, most recently from 42c1339 to 2dee02f Compare January 23, 2024 20:11
@potiuk
Member Author

potiuk commented Jan 23, 2024

cc: @dirrao @Taragolis -> Ads released with 3.12 support https://pypi.org/project/google-ads/ - removed the limit from Google provider, let's see.

@potiuk
Member Author

potiuk commented Jan 23, 2024

Duckdb removed devel version we had pinned for 3.12 -> replaced it with >= for the new devel

@potiuk
Member Author

potiuk commented Feb 20, 2024

🤞 🤞 🤞 🤞 🤞 🤞

@potiuk
Member Author

potiuk commented Feb 20, 2024

> As I said, if packages are always installed as wheels it would work out of the box; libev should be baked into the wheel.
>
> If not, and you can enforce the libev install before installing from pip, it would work as well, see
> https://docs.datastax.com/en/developer/python-driver/3.14/installation/#libev-support for more details on what is needed to compile from source.

@fruch. I just double-checked it, and for Python 3.12 the wheels for Linux do not seem to be compiled with libev support. That basically makes the Python 3.12 cassandra-driver whl useless, as it falls back to asyncore, which is missing in Python 3.12.

When I install it from PyPI with wheels, our tests fail:

root@d71d7cdeb0d9:/opt/airflow# apt list libev4
Listing... Done
libev4/stable,now 1:4.33-1 amd64 [installed]
root@d71d7cdeb0d9:/opt/airflow# apt list libev-dev
Listing... Done
libev-dev/stable,now 1:4.33-1 amd64 [installed]

....

Downloading cassandra_driver-3.29.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.6/19.6 MB 12.3 MB/s eta 0:00:00
Installing collected packages: cassandra-driver
Successfully installed cassandra-driver-3.29.0

....
pytest tests/providers/google/cloud/transfers/test_cassandra_to_gcs.py

_________________________________________________________________________________________________________ ERROR collecting tests/providers/google/cloud/transfers/test_cassandra_to_gcs.py _________________________________________________________________________________________________________
tests/providers/google/cloud/transfers/test_cassandra_to_gcs.py:27: in <module>
    from airflow.providers.google.cloud.transfers.cassandra_to_gcs import CassandraToGCSOperator  # noqa: E402
airflow/providers/google/cloud/transfers/cassandra_to_gcs.py:34: in <module>
    from airflow.providers.apache.cassandra.hooks.cassandra import CassandraHook
airflow/providers/apache/cassandra/hooks/cassandra.py:25: in <module>
    from cassandra.cluster import Cluster, Session
cassandra/cluster.py:173: in init cassandra.cluster
    ???
E   cassandra.DependencyException: Unable to load a default connection class
E   The following exceptions were observed:
E    - The C extension needed to use libev was not found.  This probably means that you didn't have the required build dependencies when installing the driver.  See http://datastax.github.io/python-driver/installation.html#c-extensions for instructions on installing build dependencies and building the C extension.
E    - Unable to import asyncore module.  Note that this module has been removed in Python 3.12 so when using the driver with this version (or anything newer) you will need to use one of the other event loop implementations.

I tested that when I build the cassandra driver from sdist, the tests start to work (so libev is used):

root@d71d7cdeb0d9:/opt/airflow# pip uninstall cassandra-driver
Found existing installation: cassandra-driver 3.29.0
Uninstalling cassandra-driver-3.29.0:
  Would remove:
    /usr/local/lib/python3.12/site-packages/cassandra/*
    /usr/local/lib/python3.12/site-packages/cassandra_driver-3.29.0.dist-info/*
Proceed (Y/n)? y
  Successfully uninstalled cassandra-driver-3.29.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
root@d71d7cdeb0d9:/opt/airflow# CASS_DRIVER_BUILD_CONCURRENCY=8 pip install cassandra-driver --no-binary ":all:"
Collecting cassandra-driver
  Downloading cassandra-driver-3.29.0.tar.gz (292 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 292.7/292.7 kB 4.5 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Requirement already satisfied: geomet<0.3,>=0.1 in /usr/local/lib/python3.12/site-packages (from cassandra-driver) (0.2.1.post1)
Requirement already satisfied: click in /usr/local/lib/python3.12/site-packages (from geomet<0.3,>=0.1->cassandra-driver) (8.1.7)
Requirement already satisfied: six in /usr/local/lib/python3.12/site-packages (from geomet<0.3,>=0.1->cassandra-driver) (1.16.0)
Building wheels for collected packages: cassandra-driver
  Building wheel for cassandra-driver (setup.py) ... done
  Created wheel for cassandra-driver: filename=cassandra_driver-3.29.0-cp312-cp312-linux_x86_64.whl size=19788637 sha256=303432506fb53a17c48f6fee738a235848f32d1b2bfe929d1cb72254695c1edb
  Stored in directory: /tmp/pip-ephem-wheel-cache-8gq19r54/wheels/f1/6b/0e/cd552adf492c2ef8392e2567f8f829be39ca14f473359e4279
Successfully built cassandra-driver
Installing collected packages: cassandra-driver
Successfully installed cassandra-driver-3.29.0


tests/providers/google/cloud/transfers/test_cassandra_to_gcs.py::TestCassandraToGCS::test_execute PASSED                                                                                                                                                                                     [ 50%]
tests/providers/google/cloud/transfers/test_cassandra_to_gcs.py::TestCassandraToGCS::test_convert_value PASSED

Since it takes quite some time to build the cassandra driver and it would complicate our CI builds (I'd have to specifically uninstall the driver and reinstall it from sdist, and our users would have to jump through the same hoops), I will disable cassandra for Python 3.12 and open an issue in cassandra for the whl not supporting libev. I will re-enable it when the whl is compiled with libev support.
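
A quick way to see which of the two event loop options from the error message is even present in an environment is sketched below. This is a diagnostic sketch only: the driver's real selection logic lives inside `cassandra.cluster`, the module path `cassandra.io.libevwrapper` for the C extension is an assumption, and the helper names are hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. 'cassandra') is not installed.
        return False


def usable_event_loops():
    """List the event loop backends whose modules can be found."""
    candidates = {
        "libev": "cassandra.io.libevwrapper",  # C extension; path is an assumption
        "asyncore": "asyncore",  # removed from the standard library in 3.12
    }
    return [label for label, module in candidates.items() if module_available(module)]


print(usable_event_loops())
```

An empty list on Python 3.12 would correspond to the "Unable to load a default connection class" failure shown in the log above.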

@fruch

fruch commented Feb 20, 2024

Can you try out the scylla-driver, it's the one I maintain, and should be working out of the box with libev by default (as far as I know), it's a drop-in replacement and works with Cassandra and with scylladb

@potiuk
Member Author

potiuk commented Feb 20, 2024

> Can you try out the scylla-driver, it's the one I maintain, and should be working out of the box with libev by default (as far as I know), it's a drop-in replacement and works with Cassandra and with scylladb

Yes, it seems to work out of the box with libev. Is it 100% compatible with cassandra-driver? And what are the chances cassandra-driver will get it? We'd rather have a dependency on the `cassandra` one, unless there are good reasons why that's not a good idea?

@fruch

fruch commented Feb 20, 2024

I would go and open an issue for them, they can bundle it better. they might take a while to address it.

We do guarantee compatibility, and I think we can do a quicker turnaround on issues, if you or your users run into any.

It's your call, but we are here if you need help with it.

@potiuk
Member Author

potiuk commented Feb 20, 2024

> I would go and open an issue for them, they can bundle it better. they might take a while to address it.

I did: https://datastax-oss.atlassian.net/jira/software/c/projects/PYTHON/issues/PYTHON-1378

> We do guarantee compatibility, and I think we can do a quicker turnaround on issues, if you or your users run into any.

Out of curiosity - also from the ASF point of view - who are WE vs. THEM?

Seems like both cassandra and scylladb-client have a Datastax copyright (?), see below (and it's a fork of the datastax python-driver), so I understand they are not part of the ASF PMC work? But Scylla is a separate company, somewhat competing with cassandra (?), so I am a bit cautious here about using the client from scylladb.

Screenshot 2024-02-20 at 23 13 57

Any comments here on the governance?

@fruch

fruch commented Feb 21, 2024

Yes, it's a fork, hence you still see the original licensing, alongside newer parts which are scylla-specific.

Yes, scylla drivers are not part of the ASF.

Yes, scylladb and Datastax are competitors.

We still collaborate on the drivers, and try to help each other when possible.

In this specific case of supporting Python 3.12, we had several conversations with the Datastax people to make sure we are mostly aligned on how to handle the asyncore deprecation.

As for the wheels, we are building them with cibuildwheel (relatively new project), while datastax have a private repo for those, so we can't actively help with that.

At the end I'm trying to help, sorry if it comes out like I'm bashing other projects, cassandra community and datastax are doing great work with those drivers.

I totally understand if it won't fit your project or your users, and the preference to stick to code which is part of the ASF (since this is an ASF project).

@potiuk
Member Author

potiuk commented Feb 21, 2024

> At the end I'm trying to help, sorry if it comes out like I'm bashing other projects, cassandra community and datastax are doing great work with those drivers.

Absolutely not. I just wanted to understand the situation :). Actually that explanation is a good one for me, as I can comment on the issue that we are considering switching to scylladb-driver for that.

> I totally understand if it won't fit your project or your users, and the preference to stick to code which is part of the ASF (since this is an ASF project).

Actually, I have no preference here to be honest. Technically speaking cassandra-driver is a Datastax project, not an ASF one, which is on its own pretty strange: the cassandra Python client is not part of / owned by the ASF. I am going to ask on the devlist so that they consider changing it. I think the name there is pretty problematic, because it suggests it is an ASF project. Both projects have good licences and technically there is no problem with including either. I will see what the devlist discussion brings.

ephraimbuddy pushed a commit that referenced this pull request Feb 22, 2024
The Universal Pathlib provides a Pathlib-like interface for FSSPEC.
In 0.1.* it was not very well defined for extension, so the way we used it with 0.1.*
relied on a lot of private methods and attributes that were not defined in the interface,
and they are broken in version 0.2.0, which is much better suited for extension and supports
Python 3.12. We should limit it until we migrate to 0.2.0.
See: fsspec/universal_pathlib#173 (comment)
This is a prerequisite to make Airflow compatible with Python 3.12.
Tracked in #36755

(cherry picked from commit 1301274)
ephraimbuddy pushed a commit that referenced this pull request Feb 22, 2024
Some of the providers need to be currently excluded from Python 3.12
because they have conflicting dependencies. While we are working on
Python 3.12 support in #36755, in order to install airflow (for
caching purposes) from GitHub URL, we need to separately merge the
exclusions to main - this will help to build Python 3.12 CI
image with all necessary dependencies cached.

(cherry picked from commit b53fe08)
@potiuk
Member Author

potiuk commented Mar 10, 2024

OK. this one is likely going to be green. Waiting for a round of reviews!

Member

@hussein-awala hussein-awala left a comment


It looks good, I just added some nits, otherwise LGTM

@potiuk
Member Author

potiuk commented Mar 10, 2024

Updated and addressed all NITS. Thanks @hussein-awala ! I also saw that we likely run out of disk space when preparing ARM CI images - so I adapted it a bit to run sequentially rather than in parallel, as it seems to build a bit more stable and it's not on critical path of the CI build.

Contributor

@jscheffl jscheffl left a comment


Some small findings, one comment. Open for full tests, but I assume they will not reveal more than what the CI is covering today anyway.

@jscheffl
Contributor

jscheffl commented Mar 10, 2024

I don't know if it is/was a local issue, but when starting on the PR for the first time (after updating breeze, of course) with the command breeze start-airflow --dev-mode --load-example-dags --python 3.12 --backend postgres --executor CeleryExecutor --answer y, the logs showed the following (it recovered automatically in a retry and started successfully afterwards):

 => [main 23/34] COPY --from=scripts install_airflow.sh /scripts/docker/                                                                                                                 0.2s
 => ERROR [main 24/34] RUN bash /scripts/docker/install_airflow.sh                                                                                                                      32.6s
------
 > importing cache manifest from ghcr.io/apache/airflow/main/ci/python3.12:cache-linux-amd64:
------
------
 > [main 24/34] RUN bash /scripts/docker/install_airflow.sh:
0.426 
0.426 Using 'uv' to install Airflow
0.426 
0.426 PATH=/root/.local/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
0.883 Installed pip: pip 24.0 from /usr/local/lib/python3.12/site-packages/pip (python 3.12): /usr/local/bin/pip
0.883 Using 'uv' to install Airflow
0.898 Installed uv: uv 0.1.15: /usr/local/bin/uv
0.899 
0.899 Installing all packages with constraints. Installation method: .
0.899 
0.899 + uv pip install --python /usr/local/bin/python --editable '.[devel-ci]' --constraint /root/constraints.txt
2.405 Built 1 editable in 1.41s
2.414   × No solution found when resolving dependencies:
2.414   ╰─▶ Because apache-airflow[devel-ci]==2.9.0.dev0 depends on
2.414       jsonschema>=4.18.0 and apache-airflow[devel-ci]==2.9.0.dev0
2.414       depends on jsonschema==4.17.3, we can conclude that
2.414       apache-airflow[devel-ci]==2.9.0.dev0 cannot be used.
2.414       And because you require apache-airflow[devel-ci]==2.9.0.dev0, we can
2.414       conclude that the requirements are unsatisfiable.
2.445 + set +x
2.445 
2.445 Likely pyproject.toml has new dependencies conflicting with constraints.
2.445 
2.445 Falling back to no-constraints, lowest-direct resolution installation.
2.445 
2.446 + uv pip install --python /usr/local/bin/python --upgrade --resolution lowest-direct --editable '.[devel-ci]'
3.941 Built 1 editable in 1.42s
32.47 error: Failed to download and build: zenpy==2.0.32
32.47   Caused by: Failed to build: zenpy==2.0.32
32.47   Caused by: Build backend failed to determine extra requires with `build_wheel()`:
32.47 --- stdout:
32.47 
32.47 --- stderr:
32.47 /tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/dist.py:472: SetuptoolsDeprecationWarning: Invalid dash-separated options
32.47 !!
32.47 
32.47         ********************************************************************************
32.47         Usage of dash-separated 'description-file' will not be supported in future
32.47         versions. Please use the underscore name 'description_file' instead.
32.47 
32.47         By 2024-Sep-26, you need to update your project and remove deprecated calls
32.47         or your builds will no longer be supported.
32.47 
32.47         See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
32.47         ********************************************************************************
32.47 
32.47 !!
32.47   opt = self.warn_dash_deprecation(opt, section)
32.47 /tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/config/_apply_pyprojecttoml.py:76: _MissingDynamic: `license` defined outside of `pyproject.toml` is ignored.
32.47 !!
32.47 
32.47         ********************************************************************************
32.47         The following seems to be defined outside of `pyproject.toml`:
32.47 
32.47         `license = 'GPLv3'`
32.47 
32.47         According to the spec (see the link below), however, setuptools CANNOT
32.47         consider this value unless `license` is listed as `dynamic`.
32.47 
32.47         https://packaging.python.org/en/latest/specifications/pyproject-toml/#declaring-project-metadata-the-project-table
32.47 
32.47         To prevent this problem, you can list `license` under `dynamic` or alternatively
32.47         remove the `[project]` table from your file and rely entirely on other means of
32.47         configuration.
32.47         ********************************************************************************
32.47 
32.47 !!
32.47   _handle_missing_dynamic(dist, project_table)
32.47 /tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/config/_apply_pyprojecttoml.py:76: _MissingDynamic: `keywords` defined outside of `pyproject.toml` is ignored.
32.47 !!
32.47 
32.47         ********************************************************************************
32.47         The following seems to be defined outside of `pyproject.toml`:
32.47 
32.47         `keywords = ['zendesk', 'api', 'wrapper']`
32.47 
32.47         According to the spec (see the link below), however, setuptools CANNOT
32.47         consider this value unless `keywords` is listed as `dynamic`.
32.47 
32.47         https://packaging.python.org/en/latest/specifications/pyproject-toml/#declaring-project-metadata-the-project-table
32.47 
32.47         To prevent this problem, you can list `keywords` under `dynamic` or alternatively
32.47         remove the `[project]` table from your file and rely entirely on other means of
32.47         configuration.
32.47         ********************************************************************************
32.47 
32.47 !!
32.47   _handle_missing_dynamic(dist, project_table)
32.47 /tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/config/_apply_pyprojecttoml.py:76: _MissingDynamic: `dependencies` defined outside of `pyproject.toml` is ignored.
32.47 !!
32.47 
32.47         ********************************************************************************
32.47         The following seems to be defined outside of `pyproject.toml`:
32.47 
32.47         `dependencies = ['requests>=2.14.2', 'python-dateutil>=2.7.5', 'cachetools>=3.1.0', 'pytz>=2018.9', 'six>=1.14.0']`
32.47 
32.47         According to the spec (see the link below), however, setuptools CANNOT
32.47         consider this value unless `dependencies` is listed as `dynamic`.
32.47 
32.47         https://packaging.python.org/en/latest/specifications/pyproject-toml/#declaring-project-metadata-the-project-table
32.47 
32.47         To prevent this problem, you can list `dependencies` under `dynamic` or alternatively
32.47         remove the `[project]` table from your file and rely entirely on other means of
32.47         configuration.
32.47         ********************************************************************************
32.47 
32.47 !!
32.47   _handle_missing_dynamic(dist, project_table)
32.47 Traceback (most recent call last):
32.47   File "<string>", line 14, in <module>
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
32.47     return self._get_build_requires(config_settings, requirements=['wheel'])
32.47            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
32.47     self.run_setup()
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 311, in run_setup
32.47     exec(code, locals())
32.47   File "<string>", line 4, in <module>
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/__init__.py", line 103, in setup
32.47     return distutils.core.setup(**attrs)
32.47            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 159, in setup
32.47     dist.parse_config_files()
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/_virtualenv.py", line 22, in parse_config_files
32.47     result = old_parse_config_files(self, *args, **kwargs)
32.47              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/dist.py", line 627, in parse_config_files
32.47     pyprojecttoml.apply_configuration(self, filename, ignore_option_errors)
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/config/pyprojecttoml.py", line 68, in apply_configuration
32.47     return _apply(dist, config, filepath)
32.47            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/config/_apply_pyprojecttoml.py", line 57, in apply
32.47     _apply_project_table(dist, config, root_dir)
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/config/_apply_pyprojecttoml.py", line 83, in _apply_project_table
32.47     corresp(dist, value, root_dir)
32.47   File "/tmp/.tmpRiPBV6/.tmp4BS1M7/.venv/lib/python3.12/site-packages/setuptools/config/_apply_pyprojecttoml.py", line 184, in _license
32.47     _set_config(dist, "license", val["text"])
32.47                                  ~~~^^^^^^^^
32.47 KeyError: 'text'
32.47 ---
------
Dockerfile.ci:1330
--------------------
 1328 |     # But in cron job we will install latest versions matching pyproject.toml to see if there is no breaking change
 1329 |     # and push the constraints if everything is successful
 1330 | >>> RUN bash /scripts/docker/install_airflow.sh
 1331 |     
 1332 |     COPY --from=scripts entrypoint_ci.sh /entrypoint
--------------------
ERROR: failed to solve: process "/bin/bash -o pipefail -o errexit -o nounset -o nolog -c bash /scripts/docker/install_airflow.sh" did not complete successfully: exit code: 2
Attempting to build with --upgrade-to-newer-dependencies on failure
Using default as context.
default
Current context is now "default"
[+] Building 76.1s (66/66) FINISHED                                                                                                                                            docker:default
 => [internal] load build definition from Dockerfile.ci                                        

@potiuk
Member Author

potiuk commented Mar 10, 2024

then the logs were showing (and automatically recovered in a retry, starting with success afterwards):

That's expected. First the image is built with constraints, but since the constraints are not yet updated for 3.12, the build fails because of conflicts. In that case there is an automated attempt to upgrade to newer dependencies (which succeeds this time). All as expected. This should recover by itself after the PR gets merged and the constraints get updated.
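
As a rough illustration only (this is not the actual CI code), the fallback described above can be sketched as follows. The names `build_with_fallback` and `fake_build_for_new_python` are hypothetical, and the `build` callable is an assumed stand-in for the real image build, which takes a flag corresponding to `--upgrade-to-newer-dependencies`:

```python
from typing import Callable


def build_with_fallback(build: Callable[[bool], bool]) -> str:
    """Try a constrained build first, then retry with upgraded dependencies.

    `build` is a hypothetical stand-in for the real image build. It receives
    the `upgrade_to_newer_dependencies` flag and returns True on success.
    """
    # First attempt: install against the existing constraints file.
    if build(False):
        return "built-with-constraints"
    # The constraints are stale (for example, not yet generated for a new
    # Python version), so retry and let the resolver pick newer dependencies.
    if build(True):
        return "built-with-upgraded-dependencies"
    raise RuntimeError("image build failed even with upgraded dependencies")


def fake_build_for_new_python(upgrade_to_newer_dependencies: bool) -> bool:
    # Simulates the situation in this PR: the 3.12 constraints are not
    # updated yet, so only the build with upgraded dependencies succeeds.
    return upgrade_to_newer_dependencies


print(build_with_fallback(fake_build_for_new_python))
```

Once the constraints are regenerated after the merge, the first attempt would be expected to succeed and the retry would no longer be needed.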

Contributor

@jscheffl jscheffl left a comment

After my short tests I can say... works.
There might be glitches when upgrading but complexity and bugs will come... anyway.

Finally after a number of dependency upgrades we seem to be able to
upgrade to Python 3.12 (pending universal_pathlib 0.2.0 conversion)

Several providers are excluded from being installed and wait for
Python 3.12, but it should not block Airflow's general 3.12 support.
Labels

area:dev-tools changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..)

8 participants