@lgray
Contributor

@lgray lgray commented Mar 12, 2025

  • ci
  • wheels
  • tests

Will add wheels when they're fixed up by #283

@lgray
Contributor Author

lgray commented Mar 12, 2025

Ah - right we are blocked because pydantic is not updated to be compatible with 3.13t at all.

Nope it's in beta now!

@lgray lgray changed the title from "add freethreaded python ci and wheels" to "build: add freethreaded python ci and wheels" Mar 12, 2025
@lgray lgray closed this Mar 17, 2025
@lgray lgray reopened this Mar 17, 2025
@lgray
Contributor Author

lgray commented Mar 17, 2025

refreshing this PR since there's now beta2 of the freethreaded pydantic

@lgray
Contributor Author

lgray commented Mar 17, 2025

@mgorny @nsmith- @henryiii

I noticed when it's doing the freethreaded windows build that the MS linker is looking for python313.lib, rather than python313t.lib that's specified as PythonLib in CMake.

Results in the error (I think):

LINK : fatal error LNK1104: cannot open file 'python313.lib' [C:\Users\runneradmin\AppData\Local\Temp\tmpu4rnegbz\build\_core.vcxproj]

I think that's the only blocker on windows.
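For anyone checking which import library a given interpreter should link against, here is a minimal sketch. It assumes the naming seen in the logs in this thread (`python313.lib` for the regular build, `python313t.lib` for the free-threaded one). The helper names `expected_import_lib` and `running_free_threaded` are made up for illustration; only `sysconfig.get_config_var("Py_GIL_DISABLED")` is a real CPython API.

```python
import sys
import sysconfig


def expected_import_lib(major: int, minor: int, free_threaded: bool) -> str:
    """Return the Windows import-library name for a CPython build.

    Hypothetical helper: it only encodes the 't' suffix convention seen
    in the build logs (python313.lib vs python313t.lib).
    """
    suffix = "t" if free_threaded else ""
    return f"python{major}{minor}{suffix}.lib"


def running_free_threaded() -> bool:
    # Py_GIL_DISABLED is 1 on free-threaded builds; 0 or None otherwise.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    print(expected_import_lib(major, minor, running_free_threaded()))
```

If the name printed by the build interpreter differs from the one in the LNK1104 message, the mismatch is coming from whatever generated the link line rather than from the interpreter itself.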

@mgorny
Contributor

mgorny commented Mar 18, 2025

That looks like https://gitlab.kitware.com/cmake/cmake/-/issues/26016, which is supposedly fixed but I'm not sure in which CMake version.

@mgorny
Contributor

mgorny commented Mar 18, 2025

That looks like https://gitlab.kitware.com/cmake/cmake/-/issues/26016, which is supposedly fixed but I'm not sure in which CMake version.

Ah, sorry, now I see 3.30.3 — so supposedly it should work here.

@mgorny
Contributor

mgorny commented Mar 18, 2025

Sorry for thinking out loud. I see that CMake is doing the right thing:

2025-03-17T20:15:06.0037191Z   -- Found PythonInterp: C:/hostedtoolcache/windows/Python/3.13.1/x64-freethreaded/python.exe (found suitable version "3.13.1", minimum required is "3.7")
2025-03-17T20:15:06.0038736Z   -- Found PythonLibs: C:/hostedtoolcache/windows/Python/3.13.1/x64-freethreaded/libs/python313t.lib

So looks like python313.lib is coming from elsewhere. pybind11 perhaps?

@lgray
Contributor Author

lgray commented Mar 18, 2025

No problem - think out loud all you want. I'm just not sure where to look :-)

@henryiii

You need to use the modern FindPython, not the old one. FindPythonLibs / FindPythonInterp were "removed" (sort of) in CMake 3.27, so that's not what 3.30.3 is referring to.

@henryiii

Right above here

include(FetchContent)
FetchContent_Declare(pybind11
  SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/pybind11
  CMAKE_ARGS "-DBUILD_TESTING=OFF -DPYBIND11_NOPYTHON=ON"
)
FetchContent_MakeAvailable(pybind11)

you should set(PYBIND11_FINDPYTHON ON). pybind11 3.0 will change the default.

@lgray
Contributor Author

lgray commented Mar 18, 2025

Thanks @henryiii, giving it a try.

@lgray
Contributor Author

lgray commented Mar 18, 2025

3.13t manylinux wheels are dying on tests because of missing awkward 3.13t wheel. Otherwise they build fine!

@lgray
Contributor Author

lgray commented Mar 18, 2025

@henryiii I guess the best way to test this in freethreaded mode for races and such would be to first just try pytest-parallel? Probably mark the dask tests to not be run in parallel?

@henryiii

Yes, I believe so.
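As a rough illustration of what running a test body "in parallel" means here, this is a minimal stdlib-only sketch, not how any pytest plugin is implemented. `run_in_threads` and `check_set_of_dict_keys` are hypothetical names, and the plain dict stands in for a real object such as a `CorrectionSet`.

```python
import threading


def run_in_threads(func, n_threads=4):
    """Call func() from n_threads threads released at the same moment.

    Returns the list of exceptions raised in the workers (empty if none).
    """
    barrier = threading.Barrier(n_threads)
    errors = []
    errors_lock = threading.Lock()

    def worker():
        barrier.wait()  # line the threads up so they enter func() together
        try:
            func()
        except Exception as exc:  # includes AssertionError from test bodies
            with errors_lock:
                errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


def check_set_of_dict_keys():
    # Stand-in for a real test body; a plain dict replaces the bound object.
    data = {"test corr": 1.234}
    assert set(data) == {"test corr"}


if __name__ == "__main__":
    print(run_in_threads(check_set_of_dict_keys))
```

The barrier is the important part: without it the threads tend to start one after another, and races on first use of an object are much less likely to show up.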

@lgray
Contributor Author

lgray commented Mar 18, 2025

Nice - parallel tests seem to go, except on windows. Looks like some issue with parallel processing on windows to begin with.

@lgray lgray closed this Mar 19, 2025
@lgray lgray reopened this Mar 19, 2025
@lgray lgray closed this Mar 19, 2025
@lgray lgray reopened this Mar 19, 2025
@lgray lgray closed this Apr 3, 2025
@lgray lgray reopened this Apr 3, 2025
@lgray
Contributor Author

lgray commented Apr 4, 2025

I don't really understand this failure in macos.

Must be a threadsafety thing, but then why not in ubuntu?

@lgray
Contributor Author

lgray commented Apr 4, 2025

Oh wow, it's definitely a thread safety issue (it passed in this latest commit!). Yikes!

@lgray
Contributor Author

lgray commented Apr 4, 2025

@nsmith- Any first ideas on what needs a mutex around it?

@lgray
Contributor Author

lgray commented Apr 4, 2025

For reference the error that crops up in the MT tests is:

self = <correctionlib.highlevel.CorrectionSet object at 0x2a9ea3d0150>

    def __iter__(self) -> Iterator[str]:
>       return iter(self._base)
E       TypeError: Object of type 'iterator' is not an instance of 'iterator'

/Library/Frameworks/PythonT.framework/Versions/3.13/lib/python3.13t/site-packages/correctionlib/highlevel.py:401: TypeError
________________________________ test_evaluator ________________________________

    def test_evaluator():
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string("{")
    
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string("{}")
    
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string('{"schema_version": "blah"}')
    
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string('{"schema_version": 2, "description": 3}')
    
        cset = core.CorrectionSet.from_string(
            '{"schema_version": 2, "description": "something", "corrections": []}'
        )
        assert cset.schema_version == 2
        assert cset.description == "something"
    
        cset = wrap(
            schema.Correction(
                name="test corr",
                version=2,
                inputs=[],
                output=schema.Variable(name="a scale", type="real"),
                data=1.234,
            )
        )
>       assert set(cset) == {"test corr"}
E       TypeError: Object of type 'iterator' is not an instance of 'iterator'

tests/test_core.py:47: TypeError

@nsmith-
Collaborator

nsmith- commented Apr 15, 2025

My best guess would be something related to this:

correctionlib/src/python.cc

Lines 106 to 108 in 093ce46

.def("__iter__", [](const CorrectionSet &v) {
    return py::make_key_iterator(v.begin(), v.end());
}, py::keep_alive<0, 1>())

is not threadsafe.

@lgray
Contributor Author

lgray commented Apr 18, 2025

Digging around, there seems to be something suspicious w.r.t. the py::keep_alive, but I've not found an accurate description. make_key_iterator itself seems to be fine.

Will continue digging.

@nsmith-
Collaborator

nsmith- commented Nov 1, 2025

@lgray there are a few unrelated improvements, would you be willing to spin those off to a new PR?

@ikrommyd
Contributor

ikrommyd commented Jan 21, 2026

With a thread-sanitized python build, you can see there is a data race. I don't understand yet whether this is a CPython bug or a pybind11 one. Here is the log:

❯ pytest tests/test_core.py --parallel-threads 4
========================================================================== test session starts ===========================================================================
platform darwin -- Python 3.14.2+, pytest-9.0.2, pluggy-1.6.0
rootdir: /Users/iason/Dropbox/work/pyhep_dev/correctionlib
configfile: pyproject.toml
plugins: run-parallel-0.8.2
collected 6 items
Collected 6 items to run in parallel

tests/test_core.py e·····                                                                                                                                          [100%]

================================================================================= ERRORS =================================================================================
____________________________________________________________________ ERROR at call of test_evaluator _____________________________________________________________________

    def test_evaluator():
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string("{")

        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string("{}")

        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string('{"schema_version": "blah"}')

        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string('{"schema_version": 2, "description": 3}')

        cset = core.CorrectionSet.from_string(
            '{"schema_version": 2, "description": "something", "corrections": []}'
        )
        assert cset.schema_version == 2
        assert cset.description == "something"

        cset = wrap(
            schema.Correction(
                name="test corr",
                version=2,
                inputs=[],
                output=schema.Variable(name="a scale", type="real"),
                data=1.234,
            )
        )
>       assert set(cset) == {"test corr"}
               ^^^^^^^^^
E       TypeError: Object of type 'iterator' is not an instance of 'iterator'

tests/test_core.py:47: TypeError
-------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------
==================
WARNING: ThreadSanitizer: data race (pid=13244)
  Write of size 8 at 0x00030a2a0100 by thread T3:
    #0 update_one_slot typeobject.c:11335 (libpython3.14td.dylib:arm64+0x2efaf8)
    #1 update_slots_callback typeobject.c:11348 (libpython3.14td.dylib:arm64+0x2ef2f4)
    #2 update_subclasses typeobject.c:11527 (libpython3.14td.dylib:arm64+0x2ef214)
    #3 update_slot typeobject.c:11386 (libpython3.14td.dylib:arm64+0x2eeda0)
    #4 type_update_dict typeobject.c:6152 (libpython3.14td.dylib:arm64+0x2ee708)
    #5 type_setattro typeobject.c:6238 (libpython3.14td.dylib:arm64+0x2e5ea0)
    #6 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x62930)
    #7 PyObject_SetAttr object.c:1453 (libpython3.14td.dylib:arm64+0x275b8c)
    #8 <null> <null> (_core.cpython-314td-darwin.so:arm64+0xf39c)
    #9 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x47a08)
    #10 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x144b8)
    #11 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x144b8)
    #12 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x144b8)
    #13 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x59324)
    #14 cfunction_call methodobject.c:564 (libpython3.14td.dylib:arm64+0x266588)
    #15 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #16 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x18e974)
    #17 method_vectorcall classobject.c:73 (libpython3.14td.dylib:arm64+0x18ca4c)
    #18 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x2ebeb8)
    #19 _PyObject_CallNoArgs pycore_call.h:185 (libpython3.14td.dylib:arm64+0x2ebd2c)
    #20 call_unbound_noarg typeobject.c:2904 (libpython3.14td.dylib:arm64+0x2eb870)
    #21 maybe_call_special_no_args typeobject.c:3014 (libpython3.14td.dylib:arm64+0x2de4e8)
    #22 slot_tp_iter typeobject.c:10410 (libpython3.14td.dylib:arm64+0x309654)
    #23 PyObject_GetIter abstract.c:2818 (libpython3.14td.dylib:arm64+0x14afec)
    #24 set_update_iterable_lock_held setobject.c:1028 (libpython3.14td.dylib:arm64+0x2c6830)
    #25 set_update_local setobject.c:1079 (libpython3.14td.dylib:arm64+0x2c6b00)
    #26 make_new_set setobject.c:1159 (libpython3.14td.dylib:arm64+0x2bf7ac)
    #27 set_vectorcall setobject.c:2483 (libpython3.14td.dylib:arm64+0x2bf128)
    #28 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x184f9c)
    #29 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #30 _PyEval_EvalFrameDefault generated_cases.c.h:1622 (libpython3.14td.dylib:arm64+0x4806e4)
    #31 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #32 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #33 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #34 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x18e994)
    #35 method_vectorcall classobject.c:73 (libpython3.14td.dylib:arm64+0x18ca4c)
    #36 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x576334)
    #37 context_run context.c:722 (libpython3.14td.dylib:arm64+0x575f0c)
    #38 method_vectorcall_FASTCALL_KEYWORDS descrobject.c:421 (libpython3.14td.dylib:arm64+0x1ac6ac)
    #39 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x184f9c)
    #40 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #41 _PyEval_EvalFrameDefault generated_cases.c.h:1622 (libpython3.14td.dylib:arm64+0x4806e4)
    #42 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #43 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #44 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #45 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x18e994)
    #46 method_vectorcall classobject.c:73 (libpython3.14td.dylib:arm64+0x18ca4c)
    #47 _PyVectorcall_Call call.c:273 (libpython3.14td.dylib:arm64+0x186560)
    #48 _PyObject_Call call.c:348 (libpython3.14td.dylib:arm64+0x18687c)
    #49 PyObject_Call call.c:373 (libpython3.14td.dylib:arm64+0x186998)
    #50 thread_run _threadmodule.c:359 (libpython3.14td.dylib:arm64+0x7b4b44)
    #51 pythread_wrapper thread_pthread.h:242 (libpython3.14td.dylib:arm64+0x6887a0)

  Previous read of size 8 at 0x00030a2a0100 by thread T4:
    #0 PyIter_Check abstract.c:2853 (libpython3.14td.dylib:arm64+0x14d1f0)
    #1 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x47e60)
    #2 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x46aa8)
    #3 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x144b8)
    #4 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x144b8)
    #5 <null> <null> (_core.cpython-314td-darwin.so:arm64+0x59324)
    #6 cfunction_call methodobject.c:564 (libpython3.14td.dylib:arm64+0x266588)
    #7 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #8 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x18e974)
    #9 method_vectorcall classobject.c:73 (libpython3.14td.dylib:arm64+0x18ca4c)
    #10 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x2ebeb8)
    #11 _PyObject_CallNoArgs pycore_call.h:185 (libpython3.14td.dylib:arm64+0x2ebd2c)
    #12 call_unbound_noarg typeobject.c:2904 (libpython3.14td.dylib:arm64+0x2eb870)
    #13 maybe_call_special_no_args typeobject.c:3014 (libpython3.14td.dylib:arm64+0x2de4e8)
    #14 slot_tp_iter typeobject.c:10410 (libpython3.14td.dylib:arm64+0x309654)
    #15 PyObject_GetIter abstract.c:2818 (libpython3.14td.dylib:arm64+0x14afec)
    #16 set_update_iterable_lock_held setobject.c:1028 (libpython3.14td.dylib:arm64+0x2c6830)
    #17 set_update_local setobject.c:1079 (libpython3.14td.dylib:arm64+0x2c6b00)
    #18 make_new_set setobject.c:1159 (libpython3.14td.dylib:arm64+0x2bf7ac)
    #19 set_vectorcall setobject.c:2483 (libpython3.14td.dylib:arm64+0x2bf128)
    #20 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x184f9c)
    #21 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #22 _PyEval_EvalFrameDefault generated_cases.c.h:1622 (libpython3.14td.dylib:arm64+0x4806e4)
    #23 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #24 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #25 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #26 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x18e994)
    #27 method_vectorcall classobject.c:73 (libpython3.14td.dylib:arm64+0x18ca4c)
    #28 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x576334)
    #29 context_run context.c:722 (libpython3.14td.dylib:arm64+0x575f0c)
    #30 method_vectorcall_FASTCALL_KEYWORDS descrobject.c:421 (libpython3.14td.dylib:arm64+0x1ac6ac)
    #31 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x184f9c)
    #32 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #33 _PyEval_EvalFrameDefault generated_cases.c.h:1622 (libpython3.14td.dylib:arm64+0x4806e4)
    #34 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #35 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #36 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #37 _PyObject_VectorcallTstate pycore_call.h:169 (libpython3.14td.dylib:arm64+0x18e994)
    #38 method_vectorcall classobject.c:73 (libpython3.14td.dylib:arm64+0x18ca4c)
    #39 _PyVectorcall_Call call.c:273 (libpython3.14td.dylib:arm64+0x186560)
    #40 _PyObject_Call call.c:348 (libpython3.14td.dylib:arm64+0x18687c)
    #41 PyObject_Call call.c:373 (libpython3.14td.dylib:arm64+0x186998)
    #42 thread_run _threadmodule.c:359 (libpython3.14td.dylib:arm64+0x7b4b44)
    #43 pythread_wrapper thread_pthread.h:242 (libpython3.14td.dylib:arm64+0x6887a0)

  Thread T3 (tid=38160489, running) created by main thread at:
    #0 pthread_create <null> (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x2f708)
    #1 do_start_joinable_thread thread_pthread.h:289 (libpython3.14td.dylib:arm64+0x686bb0)
    #2 PyThread_start_joinable_thread thread_pthread.h:331 (libpython3.14td.dylib:arm64+0x6868e0)
    #3 ThreadHandle_start _threadmodule.c:445 (libpython3.14td.dylib:arm64+0x7b4608)
    #4 do_start_new_thread _threadmodule.c:1868 (libpython3.14td.dylib:arm64+0x7b3da8)
    #5 thread_PyThread_start_joinable_thread _threadmodule.c:1991 (libpython3.14td.dylib:arm64+0x7b2ae0)
    #6 cfunction_call methodobject.c:564 (libpython3.14td.dylib:arm64+0x266588)
    #7 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #8 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x184f7c)
    #9 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #10 _PyEval_EvalFrameDefault generated_cases.c.h:3230 (libpython3.14td.dylib:arm64+0x49a174)
    #11 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #12 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #13 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #14 _PyObject_VectorcallDictTstate call.c:146 (libpython3.14td.dylib:arm64+0x185290)
    #15 _PyObject_Call_Prepend call.c:504 (libpython3.14td.dylib:arm64+0x18724c)
    #16 call_method typeobject.c:2937 (libpython3.14td.dylib:arm64+0x2f07e0)
    #17 slot_tp_call typeobject.c:10254 (libpython3.14td.dylib:arm64+0x2f0634)
    #18 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #19 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x184f7c)
    #20 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #21 _PyEval_EvalFrameDefault generated_cases.c.h:2962 (libpython3.14td.dylib:arm64+0x496170)
    #22 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #23 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #24 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #25 _PyObject_VectorcallDictTstate call.c:146 (libpython3.14td.dylib:arm64+0x185290)
    #26 _PyObject_Call_Prepend call.c:504 (libpython3.14td.dylib:arm64+0x18724c)
    #27 call_method typeobject.c:2937 (libpython3.14td.dylib:arm64+0x2f07e0)
    #28 slot_tp_call typeobject.c:10254 (libpython3.14td.dylib:arm64+0x2f0634)
    #29 _PyObject_Call call.c:361 (libpython3.14td.dylib:arm64+0x1868f8)
    #30 PyObject_Call call.c:373 (libpython3.14td.dylib:arm64+0x186998)
    #31 _PyEval_EvalFrameDefault generated_cases.c.h:2657 (libpython3.14td.dylib:arm64+0x48ff80)
    #32 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #33 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #34 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #35 _PyObject_VectorcallDictTstate call.c:146 (libpython3.14td.dylib:arm64+0x185290)
    #36 _PyObject_Call_Prepend call.c:504 (libpython3.14td.dylib:arm64+0x18724c)
    #37 call_method typeobject.c:2937 (libpython3.14td.dylib:arm64+0x2f07e0)
    #38 slot_tp_call typeobject.c:10254 (libpython3.14td.dylib:arm64+0x2f0634)
    #39 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #40 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x184f7c)
    #41 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #42 _PyEval_EvalFrameDefault generated_cases.c.h:2962 (libpython3.14td.dylib:arm64+0x496170)
    #43 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #44 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #45 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #46 _PyObject_VectorcallDictTstate call.c:146 (libpython3.14td.dylib:arm64+0x185290)
    #47 _PyObject_Call_Prepend call.c:504 (libpython3.14td.dylib:arm64+0x18724c)
    #48 call_method typeobject.c:2937 (libpython3.14td.dylib:arm64+0x2f07e0)
    #49 slot_tp_call typeobject.c:10254 (libpython3.14td.dylib:arm64+0x2f0634)
    #50 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #51 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x184f7c)
    #52 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #53 _PyEval_EvalFrameDefault generated_cases.c.h:2962 (libpython3.14td.dylib:arm64+0x496170)
    #54 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #55 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #56 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #57 _PyObject_VectorcallDictTstate call.c:146 (libpython3.14td.dylib:arm64+0x185290)
    #58 _PyObject_Call_Prepend call.c:504 (libpython3.14td.dylib:arm64+0x18724c)
    #59 call_method typeobject.c:2937 (libpython3.14td.dylib:arm64+0x2f07e0)
    #60 slot_tp_call typeobject.c:10254 (libpython3.14td.dylib:arm64+0x2f0634)
    #61 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #62 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x184f7c)
    #63 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #64 _PyEval_EvalFrameDefault generated_cases.c.h:2962 (libpython3.14td.dylib:arm64+0x496170)
    #65 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #66 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #67 PyEval_EvalCode ceval.c:975 (libpython3.14td.dylib:arm64+0x46805c)
    #68 run_eval_code_obj pythonrun.c:1365 (libpython3.14td.dylib:arm64+0x64e448)
    #69 run_mod pythonrun.c:1459 (libpython3.14td.dylib:arm64+0x64e040)
    #70 pyrun_file pythonrun.c:1293 (libpython3.14td.dylib:arm64+0x64af8c)
    #71 _PyRun_SimpleFileObject pythonrun.c:521 (libpython3.14td.dylib:arm64+0x649f1c)
    #72 _PyRun_AnyFileObject pythonrun.c:81 (libpython3.14td.dylib:arm64+0x6498ac)
    #73 pymain_run_file_obj main.c:410 (libpython3.14td.dylib:arm64+0x6b3dac)
    #74 pymain_run_file main.c:429 (libpython3.14td.dylib:arm64+0x6b30c8)
    #75 pymain_run_python main.c:694 (libpython3.14td.dylib:arm64+0x6b2214)
    #76 Py_RunMain main.c:775 (libpython3.14td.dylib:arm64+0x6b1ce4)
    #77 pymain_main main.c:805 (libpython3.14td.dylib:arm64+0x6b243c)
    #78 Py_BytesMain main.c:829 (libpython3.14td.dylib:arm64+0x6b24b8)
    #79 main python.c:15 (python3.14:arm64+0x100000738)

  Thread T4 (tid=38160490, running) created by main thread at:
    #0 pthread_create <null> (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x2f708)
    #1 do_start_joinable_thread thread_pthread.h:289 (libpython3.14td.dylib:arm64+0x686bb0)
    #2 PyThread_start_joinable_thread thread_pthread.h:331 (libpython3.14td.dylib:arm64+0x6868e0)
    #3 ThreadHandle_start _threadmodule.c:445 (libpython3.14td.dylib:arm64+0x7b4608)
    #4 do_start_new_thread _threadmodule.c:1868 (libpython3.14td.dylib:arm64+0x7b3da8)
    #5 thread_PyThread_start_joinable_thread _threadmodule.c:1991 (libpython3.14td.dylib:arm64+0x7b2ae0)
    #6 cfunction_call methodobject.c:564 (libpython3.14td.dylib:arm64+0x266588)
    #7 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #8 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x184f7c)
    #9 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #10 _PyEval_EvalFrameDefault generated_cases.c.h:3230 (libpython3.14td.dylib:arm64+0x49a174)
    #11 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #12 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #13 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #14 _PyObject_VectorcallDictTstate call.c:146 (libpython3.14td.dylib:arm64+0x185290)
    #15 _PyObject_Call_Prepend call.c:504 (libpython3.14td.dylib:arm64+0x18724c)
    #16 call_method typeobject.c:2937 (libpython3.14td.dylib:arm64+0x2f07e0)
    #17 slot_tp_call typeobject.c:10254 (libpython3.14td.dylib:arm64+0x2f0634)
    #18 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #19 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x184f7c)
    #20 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #21 _PyEval_EvalFrameDefault generated_cases.c.h:2962 (libpython3.14td.dylib:arm64+0x496170)
    #22 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #23 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #24 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #25 _PyObject_VectorcallDictTstate call.c:146 (libpython3.14td.dylib:arm64+0x185290)
    #26 _PyObject_Call_Prepend call.c:504 (libpython3.14td.dylib:arm64+0x18724c)
    #27 call_method typeobject.c:2937 (libpython3.14td.dylib:arm64+0x2f07e0)
    #28 slot_tp_call typeobject.c:10254 (libpython3.14td.dylib:arm64+0x2f0634)
    #29 _PyObject_Call call.c:361 (libpython3.14td.dylib:arm64+0x1868f8)
    #30 PyObject_Call call.c:373 (libpython3.14td.dylib:arm64+0x186998)
    #31 _PyEval_EvalFrameDefault generated_cases.c.h:2657 (libpython3.14td.dylib:arm64+0x48ff80)
    #32 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #33 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #34 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #35 _PyObject_VectorcallDictTstate call.c:146 (libpython3.14td.dylib:arm64+0x185290)
    #36 _PyObject_Call_Prepend call.c:504 (libpython3.14td.dylib:arm64+0x18724c)
    #37 call_method typeobject.c:2937 (libpython3.14td.dylib:arm64+0x2f07e0)
    #38 slot_tp_call typeobject.c:10254 (libpython3.14td.dylib:arm64+0x2f0634)
    #39 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #40 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x184f7c)
    #41 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #42 _PyEval_EvalFrameDefault generated_cases.c.h:2962 (libpython3.14td.dylib:arm64+0x496170)
    #43 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #44 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #45 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #46 _PyObject_VectorcallDictTstate call.c:146 (libpython3.14td.dylib:arm64+0x185290)
    #47 _PyObject_Call_Prepend call.c:504 (libpython3.14td.dylib:arm64+0x18724c)
    #48 call_method typeobject.c:2937 (libpython3.14td.dylib:arm64+0x2f07e0)
    #49 slot_tp_call typeobject.c:10254 (libpython3.14td.dylib:arm64+0x2f0634)
    #50 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #51 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x184f7c)
    #52 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #53 _PyEval_EvalFrameDefault generated_cases.c.h:2962 (libpython3.14td.dylib:arm64+0x496170)
    #54 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #55 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #56 _PyFunction_Vectorcall call.c:413 (libpython3.14td.dylib:arm64+0x186cac)
    #57 _PyObject_VectorcallDictTstate call.c:146 (libpython3.14td.dylib:arm64+0x185290)
    #58 _PyObject_Call_Prepend call.c:504 (libpython3.14td.dylib:arm64+0x18724c)
    #59 call_method typeobject.c:2937 (libpython3.14td.dylib:arm64+0x2f07e0)
    #60 slot_tp_call typeobject.c:10254 (libpython3.14td.dylib:arm64+0x2f0634)
    #61 _PyObject_MakeTpCall call.c:242 (libpython3.14td.dylib:arm64+0x185694)
    #62 _PyObject_VectorcallTstate pycore_call.h:167 (libpython3.14td.dylib:arm64+0x184f7c)
    #63 PyObject_Vectorcall call.c:327 (libpython3.14td.dylib:arm64+0x186704)
    #64 _PyEval_EvalFrameDefault generated_cases.c.h:2962 (libpython3.14td.dylib:arm64+0x496170)
    #65 _PyEval_EvalFrameDefault ceval.c:1119 (libpython3.14td.dylib:arm64+0x468974)
    #66 _PyEval_Vector ceval.c:2083 (libpython3.14td.dylib:arm64+0x4683fc)
    #67 PyEval_EvalCode ceval.c:975 (libpython3.14td.dylib:arm64+0x46805c)
    #68 run_eval_code_obj pythonrun.c:1365 (libpython3.14td.dylib:arm64+0x64e448)
    #69 run_mod pythonrun.c:1459 (libpython3.14td.dylib:arm64+0x64e040)
    #70 pyrun_file pythonrun.c:1293 (libpython3.14td.dylib:arm64+0x64af8c)
    #71 _PyRun_SimpleFileObject pythonrun.c:521 (libpython3.14td.dylib:arm64+0x649f1c)
    #72 _PyRun_AnyFileObject pythonrun.c:81 (libpython3.14td.dylib:arm64+0x6498ac)
    #73 pymain_run_file_obj main.c:410 (libpython3.14td.dylib:arm64+0x6b3dac)
    #74 pymain_run_file main.c:429 (libpython3.14td.dylib:arm64+0x6b30c8)
    #75 pymain_run_python main.c:694 (libpython3.14td.dylib:arm64+0x6b2214)
    #76 Py_RunMain main.c:775 (libpython3.14td.dylib:arm64+0x6b1ce4)
    #77 pymain_main main.c:805 (libpython3.14td.dylib:arm64+0x6b243c)
    #78 Py_BytesMain main.c:829 (libpython3.14td.dylib:arm64+0x6b24b8)
    #79 main python.c:15 (python3.14:arm64+0x100000738)

SUMMARY: ThreadSanitizer: data race typeobject.c:11335 in update_one_slot
==================
*********************************************************************** pytest-run-parallel report ***********************************************************************
All tests were run in parallel! 🎉
======================================================================== short test summary info =========================================================================
PARALLEL FAILED tests/test_core.py::test_evaluator - TypeError: Object of type 'iterator' is not an instance of 'iterator'
====================================================================== 5 passed, 1 error in 20.08s =======================================================================
ThreadSanitizer: reported 1 warnings
[1]    13244 abort      pytest tests/test_core.py --parallel-threads 4

@ikrommyd
Contributor

ikrommyd commented Jan 21, 2026

I have a feeling that it has something to do with when the iterator type is being created? Pybind11 creates the iterator lazily?
Adding this to force the iterator type creation at module init fixed it. cc @henryiii

diff --git a/src/python.cc b/src/python.cc
index f7ffb0d..71d9475 100644
--- a/src/python.cc
+++ b/src/python.cc
@@ -85,7 +85,7 @@ namespace {
     return output;
   }
 }
-PYBIND11_MODULE(_core, m) {
+PYBIND11_MODULE(_core, m, py::mod_gil_not_used()) {
     m.doc() = "python binding for corrections evaluator";
 
     py::class_<Variable>(m, "Variable")
@@ -183,4 +183,10 @@ PYBIND11_MODULE(_core, m) {
       .value("ACOSH", FormulaAst::UnaryOp::Acosh)
       .value("ASINH", FormulaAst::UnaryOp::Asinh)
       .value("ATANH", FormulaAst::UnaryOp::Atanh);
+
+    {
+        auto dummy = CorrectionSet::from_string(R"({"schema_version": 2, "corrections": []})");
+        auto it = py::make_key_iterator(dummy->begin(), dummy->end());
+        (void)it;
+    }
 }

@lgray
Copy link
Contributor Author

lgray commented Jan 21, 2026

I've added a static mutex and a lock guard here; it's not (or shouldn't be) performance-critical code. It's a bit of a sledgehammer, but maybe it works.
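
For readers following along, here is a minimal standard-library sketch of the static-mutex-plus-lock-guard idea. It is not the code in this PR: `g_registered_types`, `ensure_iterator_type_locked`, and `run_mutex_demo` are hypothetical names, and the set only stands in for whatever one-time registration the binding layer performs.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins: a lazily populated registry and a counter of
// how many times the one-time registration actually ran.
std::unordered_set<std::string> g_registered_types;
int g_registration_count = 0;

// Serialize the whole check-then-register sequence behind one static mutex.
void ensure_iterator_type_locked() {
    static std::mutex registry_mutex;  // shared by every caller
    std::lock_guard<std::mutex> guard(registry_mutex);
    if (g_registered_types.count("iterator") == 0) {
        g_registered_types.insert("iterator");
        ++g_registration_count;
    }
}

// Race several threads into the lazy path and report how often it registered.
int run_mutex_demo(int n_threads) {
    assert(n_threads > 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < n_threads; ++i)
        workers.emplace_back(ensure_iterator_type_locked);
    for (auto& worker : workers) worker.join();
    return g_registration_count;
}
```

Calling `run_mutex_demo(4)` should return 1 however the threads interleave, because only one thread at a time can be inside the guarded region. The cost is that every call takes the lock, even long after the registration has happened.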

@lgray
Copy link
Contributor Author

lgray commented Jan 21, 2026

A bunch of stuff to clean up too since this was a heavy rebasing of the changes. We'll see how far it gets.

@lgray
Copy link
Contributor Author

lgray commented Jan 21, 2026

Oh, it's some initialization thing? Weird! Let's see if the global mutex fixes it too.

@lgray
Copy link
Contributor Author

lgray commented Jan 21, 2026

@henryiii could you quickly explain what's going on here such that a global dummy initialization makes things work? I am happy to have fewer locks, but it's a bit too much faith-based programming for me.

@lgray
Copy link
Contributor Author

lgray commented Jan 21, 2026

The mutex approach also seems to fix it. I suppose, if it's a one-time initialization thing in pybind I could change it to use an std::atomic<bool> for the first time we do it and go completely lockless, while also avoiding the magic code. Then on subsequent calls it's only required that we read a memory fenced bool which should be quick.
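
A rough sketch of the flag idea, using only the standard library and hypothetical names (`g_type_ready`, `g_slow_path_mutex`, `ensure_initialized_once`, `run_atomic_demo`). One assumption differs from the comment above: the very first calls still take a mutex, because a bare flag cannot make a second thread wait while the first one is still initializing. Once the flag is published, every later call is a single acquire load.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical state: a published "done" flag, a mutex used only on the
// slow path, and a counter of how often the one-time work ran.
std::atomic<bool> g_type_ready{false};
std::mutex g_slow_path_mutex;
int g_init_runs = 0;

void ensure_initialized_once() {
    // Fast path: after initialization this is the only work done per call.
    if (g_type_ready.load(std::memory_order_acquire)) return;

    // Slow path: serialize the few callers that arrive before publication.
    std::lock_guard<std::mutex> guard(g_slow_path_mutex);
    if (!g_type_ready.load(std::memory_order_relaxed)) {
        ++g_init_runs;  // placeholder for the real one-time initialization
        g_type_ready.store(true, std::memory_order_release);
    }
}

// Race several threads through the function and report how often it initialized.
int run_atomic_demo(int n_threads) {
    assert(n_threads > 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < n_threads; ++i)
        workers.emplace_back(ensure_initialized_once);
    for (auto& worker : workers) worker.join();
    return g_init_runs;
}
```

`std::call_once` with a `std::once_flag` gives the same run-exactly-once guarantee with less hand-written synchronization, and may be the simpler choice if this route is taken.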

@ikrommyd
Copy link
Contributor

Yeah, you can also get other errors from this line, like:

>       assert set(cset) == {"test corr"}
               ^^^^^^^^^
E       RuntimeError: generic_type: type "iterator" is already registered!

tests/test_core.py:47: RuntimeError

That gave me the feeling that multiple threads call set(cset) in parallel and all try to register the iterator type at the same time. I still don't know whether this is a CPython problem or a pybind11 problem. I think it's at least not a correctionlib problem.
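
To make that suspicion concrete, here is a small single-threaded replay of the interleaving in question: two callers both check for the type before either registers it. `StrictRegistry` and `replay_check_then_register` are hypothetical and are not pybind11's API. Whether pybind11 actually follows this sequence internally is the hypothesis above, not something this sketch establishes.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_set>

// Hypothetical strict registry: registering the same name twice is an error.
class StrictRegistry {
public:
    bool is_registered(const std::string& name) const {
        return names_.count(name) != 0;
    }
    void register_type(const std::string& name) {
        if (!names_.insert(name).second)
            throw std::runtime_error("type \"" + name + "\" is already registered!");
    }
private:
    std::unordered_set<std::string> names_;
};

// Replay, on one thread, the interleaving where callers A and B both run
// the check before either one registers. Returns the resulting error text,
// or an empty string if no error occurred.
std::string replay_check_then_register() {
    StrictRegistry registry;
    const bool a_saw_missing = !registry.is_registered("iterator");
    const bool b_saw_missing = !registry.is_registered("iterator");
    assert(a_saw_missing && b_saw_missing);  // both checks ran too early
    try {
        if (a_saw_missing) registry.register_type("iterator");  // A registers first
        if (b_saw_missing) registry.register_type("iterator");  // B acts on a stale check
    } catch (const std::runtime_error& err) {
        return err.what();
    }
    return "";
}
```

The second registration throws because caller B's check result was stale by the time it acted on it. Any fix has to make the check and the registration a single atomic step, or move the registration to a point where only one thread can run.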

@ikrommyd
Copy link
Contributor

ikrommyd commented Jan 21, 2026

> The mutex approach also seems to fix it. I suppose, if it's a one-time initialization thing in pybind I could change it to use an std::atomic<bool> for the first time we do it and go completely lockless, while also avoiding the magic code. Then on subsequent calls it's only required that we read a memory fenced bool which should be quick.

I think you are missing the py::mod_gil_not_used() in PYBIND11_MODULE atm

@lgray
Copy link
Contributor Author

lgray commented Jan 21, 2026

I'm loading without the nogil checks in Python using the command-line args, so it's doing the right thing testing-wise, but let me mark it so it always loads properly.

@ikrommyd
Copy link
Contributor

ikrommyd commented Jan 21, 2026

I don't know if the mutex is safe. Even though it passes the tests for me locally too under a release build of CPython 3.14, I get a deadlock if I cherry-pick python/cpython#133177 (the PR that fixes the original TSAN report I was seeing) on top of 3.14. That means it will be a problem in Python 3.15, or in 3.14 if that PR gets backported.

@ikrommyd
Copy link
Contributor

I have created this reproducer: https://github.com/ikrommyd/pybind11_iterator_race
and opened an issue on pybind11: pybind/pybind11#5970

@lgray
Copy link
Contributor Author

lgray commented Jan 21, 2026

Sweet! thanks.

@lgray
Copy link
Contributor Author

lgray commented Jan 21, 2026

@nsmith- so we now have something that works, but it requires quite a few updates to the testing infrastructure.

I'll start making PRs to get everything else updated and then probably make a clean PR with the freethreading update alone.

- uses: pypa/[email protected]
  env:
    CIBW_ARCHS: ${{ matrix.arch }}
    CIBW_ENABLE: cpython-freethreading

You only need this for 3.13t; 3.14t is not experimental, so it's on by default.

@ikrommyd
Copy link
Contributor

@lgray @nsmith- once pybind/pybind11#5971 is merged, we can just update the pybind11 submodule to point to master and we don't need any workaround for thread-safety.

@nsmith-
Copy link
Collaborator

nsmith- commented Jan 23, 2026

Great! I'd slightly prefer to wait for a pybind11 release tag.

@lgray
Copy link
Contributor Author

lgray commented Jan 23, 2026

Same - I'll wait for the next pybind release with that fix in. Better way to do it, fewer required changes here.
