Skip to content

Adds support for SQLAlchemy 2.0 and psycopg >=3#79

Merged
icemac merged 24 commits into
zopefoundation:masterfrom
seantis:sqlalchemy-2.0
Jun 1, 2023
Merged

Adds support for SQLAlchemy 2.0 and psycopg >=3#79
icemac merged 24 commits into
zopefoundation:masterfrom
seantis:sqlalchemy-2.0

Conversation

@Daverball
Copy link
Copy Markdown
Contributor

Functionally there doesn't appear to be anything that will make zope.sqlalchemy stop working on 2.0. In fact we've already had it running on 1.6 for a couple of months now with no problems.

I fixed the few tests that would've failed with SQLAlchemy 2.0 so that you can support 2.0 with confidence.

@bouillon
Copy link
Copy Markdown

This works also for me

@icemac
Copy link
Copy Markdown
Member

icemac commented Apr 13, 2023

@Daverball Did the tests run locally? In GitHub Actions they run for 6 hours before they get killed by the timeout.

@icemac
Copy link
Copy Markdown
Member

icemac commented Apr 13, 2023

The 1.4 tests run fine but the 2.0 ones seem to run into a deadlock.

@icemac
Copy link
Copy Markdown
Member

icemac commented Apr 13, 2023

BTW: According to the contributing policies of the zopefoundation organization you need to sign a contributor agreement before any non-trivial change can be merged. For details please consult the Contributing guidelines for zopefoundation projects.

@Daverball
Copy link
Copy Markdown
Contributor Author

The tests did complete successfully for 3.10 and 3.9. I did not bother testing all of the permutations locally.

Considering coverage did complete which is essentially doing the same as the 3.9 tests from what I can tell I can't really say what is causing the deadlock on the CI. It doesn't help that it happens before any log output occurs.

@Daverball
Copy link
Copy Markdown
Contributor Author

I consider this a trivial case, since only the test suite was adjusted slightly, with no changes to production code. signing such a comprehensive agreement for such a small matter is honestly not worth my time or yours.

…s well"

This reverts commit a368db6.

I fumbled the definition of the -f vs -e flag on tox and did not realize
that it will test all permutations for that python version.
@Daverball
Copy link
Copy Markdown
Contributor Author

3.8 tests did complete successfully locally as well. Considering how the SQLAlchemy 2.0 environment is always the last one tox runs anyways I actually can't imagine that the deadlock on the CI has anything to with what I changed. Since before anything it would run the install for earliest SQLAlchemy version and it's not even getting to that point. tox locks up before it has the chance to print even just setting up the first environment.

@icemac
Copy link
Copy Markdown
Member

icemac commented Apr 14, 2023

I consider this a trivial case, since only the test suite was adjusted slightly, with no changes to production code. signing such a comprehensive agreement for such a small matter is honestly not worth my time or yours.

Alternatively for a one-time contribution you might consider to submit your changes in this PR under the ZPL license like it was done in #47 (comment)

@Daverball
Copy link
Copy Markdown
Contributor Author

Daverball commented Apr 14, 2023

@icemac My best guess for as to why tox is hanging only on the CI right now is that it's hitting some kind of resource limit on the CI runner, although that doesn't really explain why the py310 and py311 runners hang, since they both contain fewer environments.

I could try to double the number of version specific runners and split off a couple of environments each using --skip-env on one of the runners while specifying the skipped environments on the other, but it might make the github actions config a bit messy.

Could you trigger a run on master to verify that the tests will still complete successfully, to rule out some sort of regression in a recent release of tox (and/or change in environment on the GH Actions CI runners)? (looks like that's still working fine)

@Daverball
Copy link
Copy Markdown
Contributor Author

Daverball commented Apr 17, 2023

@icemac Looks like the 2.0 environment always hangs on the CI, even if it is the only one, but I'm at a complete loss as to why, since the only difference is the SQLAlchemy version pin. It's possible 2.0 is trying to build some C extensions that hang on GitHub Actions, but I would expect to see more intermediary command output if that were actually the case...

I'm also getting some doctest failures on the CI for older SQLAlchemy versions, which I cannot reproduce locally. Do you have any ideas as to why the CI environment would behave so drastically differently?

This is the workflow run with split off environments for reference: https://github.com/seantis/zope.sqlalchemy/actions/runs/4722729734/jobs/8377715445

@icemac
Copy link
Copy Markdown
Member

icemac commented Apr 18, 2023

@Daverball I would like to have added the following change to the PR to see where the tests hang – but I am not allowed to write to your branch (seems you did not allow contributor changes), so could you please do this change?:

diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
index 8f22689..ba688b0 100644
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@@ -61,7 +61,7 @@ jobs:
         pip install tox
     - name: Test
       run: |
-        tox -f ${{ matrix.config[1] }}
+        tox -f ${{ matrix.config[1] }} -- -vv
     - name: Coverage
       if: matrix.config[1] == 'coverage'
       run: |

@Daverball
Copy link
Copy Markdown
Contributor Author

There wasn't really an option for me to allow contributor changes, otherwise I would have enabled it. I have it enabled on all my other pull requests. I was planning to make a new branch and play around with the verbosity options anyways. If this isn't verbose enough to identify the problem I will play around some more.

@Daverball
Copy link
Copy Markdown
Contributor Author

Daverball commented Apr 18, 2023

Ok I think it might actually be GitHub Actions that's causing the output to disappear once it hangs. I followed one of the runners until it got stuck. When I switched to another runner the other run was not showing any output and when I switched back to the one I was following its output had disappeared as well. So looks like the only way to debug is to trigger the action and then follow the output. It looks like testAbortBeforeCommit hangs on the CI with SQLAlchemy 2.0 and testpg, since the first environment in the pyX factor will always be without a version pin.

After cancelling the run, the output appears to have shown up again: https://github.com/seantis/zope.sqlalchemy/actions/runs/4729467500/jobs/8392015954

@Daverball
Copy link
Copy Markdown
Contributor Author

Alright, I see, I didn't look close enough at the tox.ini. I thought testpg was running locally as well. So that's the difference, if I flip the check and force testpg and testpg2 to run with a Postgres database configured locally, then I can reproduce the hang locally. It looks like the hang happens in Postgres itself, because once the test hangs and I abort and then run the tests on a different version, it will hang on the very first test until I restart Postgres.

However, in contrast to the CI, all SQLAlchemy versions will hang on the same test for me, even if I force restart Postgres. So I cannot reproduce the CI behavior for those versions. Even after removing the zope_sqlalchemy_tests database and recreating it, it will always hang on testAbortBeforeCommit (or possibly the next test after, if the print happens after the test has been run successfully)

We run zope.sqlalchemy 1.6 (which didn't have the <2 SQLAlchemy version pin yet) on a production database with Postgres and psycopg2 without issue. It's possible we're just never hitting the corner case that would trigger this hang, but that doesn't really explain the inconsistent behavior of older versions between running it locally and on the CI.

@Daverball
Copy link
Copy Markdown
Contributor Author

Daverball commented Apr 18, 2023

For science I tried the new psycopg3 backend on SQLAlchemy 2.0 and it also hangs on the same test, but I am getting a little bit more of a helpful traceback when I Ctrl+C to abort the hanging tests. Looks like drop_all in tearDown is what hangs, so something about this test creates a lock on one of the tables without ever releasing it, which prevents it from being dropped, so the test just waits indefinitely for the table to be dropped in the tearDown.

I'm guessing the server side transaction never gets aborted/committed somehow.

@Daverball
Copy link
Copy Markdown
Contributor Author

@icemac It looks like any test that uses more than one commit will hang, I assume the second commit is bound to the wrong server side Postgres transaction, so it never actually gets commited, which will prevent the test tearDown from doing its things. I will dive into the code to try to see if I can make this more robust somehow.

@Daverball
Copy link
Copy Markdown
Contributor Author

@icemac Alright I think I figured out the problem, it wasn't the two commits, it's that in SQLAlchemy 2.0 bare engine/connects have transactions as well now without autocommit behavior, so when tests utilize the engine directly to get a new connection, that connection starts a transaction and never gets closed/commited, so we need to wrap it inside a with conn.begin().

The cleaner way to do this is probably to get a new connection in testSetUp and close it in tearDown, so the tests that need a connection outside the transaction/session machinery can just do a with self.conn.begin(): this should be backwards compatible with 1.1 and actually prevents the tests from hanging in 2.0.

Now there is one failure left in 2.0 and it is testNestedSessionCommitAllowed. Looks like for some reason session.in_nested_transaction() is False in 2.0 by the time the before_commit hook gets invoked, it does still resolve to True before session.commit is called though. So it looks like session.commit is unwinding the nested transactions before the hook gets invoked.

@Daverball
Copy link
Copy Markdown
Contributor Author

Daverball commented Apr 20, 2023

@icemac Can you explain to me why session.commit is fine for a nested transaction but not for a non-nested transaction? session.commit will still commit the outermost transaction as well, so if that's what you wanted to avoid, that's not what you are doing. Generally nested transactions should make use of the context manager or call commit/ rollback etc on the savepoint itself that was returned by begin_nested.

Edit: Looking at the git blame it looks like this exception was added as a result of #1 which incorrectly assumed how the subtransaction mechanism in SQLAlchemy works. session.begin_nested(); session.commit() is not really a pattern, because it does not only commit the nested transaction, but the parent transaction as well... So I think it is fine to make that illegal again and force people to use savepoint = session.begin_nested(); savepoint.commit() in third party code that is not aware of the savepoint mechanism in transaction.

Cleans up version detection to match the one in `datamanager.py`
Removes increased verbosity
Make running postgres tests locally a bit easier using tox. They can be
invoked executing tox in the following way: `TEST_PG=true tox`

Fixes `ResourceWarning` in psycopg v3 backend.

Uses `self.skipTest` whenever appropriate instead of `return`.
@Daverball
Copy link
Copy Markdown
Contributor Author

@icemac I managed to fix the ResourceWarning in psycopg v3 and i updated the changelog. I am happy with the current state of the PR now, so feel free to start your review, as soon as you find yourself with enough time to do so. Once the changes have been approved I will comment with the declaration and a create a snapshot of the PR with the internet archive, so you can merge it.

Copy link
Copy Markdown
Member

@icemac icemac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran into a personal timeout reviewing the PR, so here comes the first part of my review, next part(s) will follow later.

Comment thread CHANGES.rst
Comment thread src/zope/sqlalchemy/datamanager.py
Comment thread src/zope/sqlalchemy/tests.py Outdated
Marks breaking change in change log
Adds comment to clarify psycopg v3 retryable errors
@Daverball
Copy link
Copy Markdown
Contributor Author

@icemac Just pinging you again, in case this slipped through the cracks.

Copy link
Copy Markdown
Member

@icemac icemac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the reminder, this PR got buried under many newer ones.

There are some minor change requests open but I think it generally looks well.

Comment thread src/zope/sqlalchemy/datamanager.py Outdated
Comment thread src/zope/sqlalchemy/datamanager.py Outdated
Comment thread src/zope/sqlalchemy/tests.py
Comment thread tox.ini Outdated
Co-authored-by: Michael Howitz <icemac@gmx.net>
@icemac
Copy link
Copy Markdown
Member

icemac commented May 22, 2023

It seems we need packaging as new dependency.
To get rid of the problems in the lint job, you can run tox -e isort-apply

@Daverball
Copy link
Copy Markdown
Contributor Author

It seems we need packaging as new dependency. To get rid of the problems in the lint job, you can run tox -e isort-apply

Both of these issues have been fixed.

@Daverball Daverball requested a review from icemac May 22, 2023 11:20
Copy link
Copy Markdown
Member

@icemac icemac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still two little open points, I'd request to change.

Comment thread setup.py
Comment thread tox.ini Outdated
Re-includes tests in coverage.
Drops required coverage percentage.
@Daverball Daverball requested a review from icemac May 23, 2023 06:34
Copy link
Copy Markdown
Member

@icemac icemac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the PR is polished enough to be merged.

Thank you for your patience.

The only thing left before the merge is licencing your work, until then I'll add the "do not merge" label.

@icemac icemac added the do not merge not (yet) ready to be merged label May 24, 2023
@Daverball
Copy link
Copy Markdown
Contributor Author

@icemac I explicitly submit the contributions in this PR under the existing ZPL License.

@Daverball
Copy link
Copy Markdown
Contributor Author

And here is a snapshot with the declaration on the internet archive: http://web.archive.org/web/20230524070616/https://github.com/zopefoundation/zope.sqlalchemy/pull/79

@Daverball Daverball requested a review from icemac May 31, 2023 20:23
Copy link
Copy Markdown
Member

@icemac icemac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your hard work to get the issues fixed, the features implemented and the wishes of the release manager fulfilled.

@icemac icemac removed the do not merge not (yet) ready to be merged label Jun 1, 2023
@icemac icemac merged commit 0556935 into zopefoundation:master Jun 1, 2023
@icemac
Copy link
Copy Markdown
Member

icemac commented Jun 1, 2023

I just released https://pypi.org/project/zope.sqlalchemy/3.0/ including these changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants