From ed37999bcc4078d5b47d04e252e844318ba26ae4 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 13:11:42 -0400 Subject: [PATCH 01/51] First draft. --- peps/pep-0788.rst | 234 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 234 insertions(+) create mode 100644 peps/pep-0788.rst diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst new file mode 100644 index 00000000000..77aeecf7d9a --- /dev/null +++ b/peps/pep-0788.rst @@ -0,0 +1,234 @@ +PEP: 788 +Title: Correct C APIs for native threads +Author: Peter Bierma +Sponsor: Victor Stinner +Discussions-To: Pending +Status: Draft +Type: Standards Track +Created: 19-Apr-2025 +Python-Version: 3.15 +Post-History: + + +Abstract +======== + +:c:func:`PyGILState_Ensure` (and friends) is the most common way to create native threads that interact with Python, but it's unfortunately quite broken, misleading, and most importantly limiting for users. In particular, :c:func:`PyGILState_Ensure` and related functions have the following issues: + +- They aren't safe for finalization, either hanging the calling thread or crashing it with a segmentation fault. +- Subinterpreters don't play nicely with them, because they all assume that the main interpreter is the only one that exists. +- The term "GIL" in the name is confusing for both users of free-threaded CPython and users of subinterpreters. + +This PEP intends to fix all of these issues by starting from scratch and providing new, semantically clear, and most importantly thread-safe APIs for interacting with Python from native threads. + +Motivation +========== + +Native threads will always hang during finalization +--------------------------------------------------- + +Many codebases might need to call Python code in highly-asynchronous situations where the interpreter is already finalizing, or might finalize, and want to continue running code after the Python call. This desire has been `brought up by users `_. 
For example, a callback that wants to call Python code might be invoked by: + +- A kernel has finished running on the GPU +- A network packet was received. +- A thread has quit, and the C++ library is executing static finalizers of thread local storage. + +In the current C API, any non-Python thread (*i.e.*, not created by :mod:`threading`) is considered to be "daemon," meaning that the interpreter won't wait on that thread to finalize. Instead, the interpreter will hang the thread when it goes to `attach `_ a `thread state`_, making it unusable past that point. Attaching a thread state can happen at any point when invoking Python, such as releasing the GIL in-between bytecode instructions, or when a C function exits a :c:macro:`Py_BEGIN_ALLOW_THREADS` block. + +This means that any non-Python thread may be terminated at any point, which is severely limiting for users who want to do more than just execute Python code in their stream of calls (for example, C++ executing finalizers in *addition* to calling Python). + +``Py_IsFinalizing`` is insufficient +*********************************** + +The `docs `_ currently recommend :c:func:`Py_IsFinalizing` to guard against termination of the thread: + + Calling this function from a thread when the runtime is finalizing will terminate the thread, even if the thread was not created by Python. You can use Py_IsFinalizing() or sys.is_finalizing() to check if the interpreter is in process of being finalized before calling this function to avoid unwanted termination. + +Unfortunately, this isn't correct, because of time-of-call to time-of-use issues; the interpreter might not be finalizing during the call to :c:func:`Py_IsFinalizing`, but it might start finalizing right after, which would cause the attachment of a thread state (typically via :c:func:`PyGILState_Ensure`) to hang the thread. 
+ +We can't change finalization behavior for ``PyGILState_Ensure`` +*************************************************************** + +There will always have to be a point in a Python programs where :c:func:`PyGILState_Ensure` can no longer acquire the GIL. If the interpreter is long dead, then Python obviously can't give a thread a way to invoke it. :c:func:`PyGILState_Ensure` doesn't have any meaningful way to return a failure, so it has no choice but to terminate the thread or emit a fatal error, as noted in `gh-124622 `_: + + I think a new GIL acquisition and release C API would be needed. The way the existing ones get used in existing C code is not amenible to suddenly bolting an error state onto; none of the existing C code is written that way. After the call they always just assume they have the GIL and can proceed. The API was designed as "it'll block and only return once it has the GIL" without any other option. + +``PyGILState`` is broken and misleading +--------------------------------------- + +There are currently two public ways for a user to create and attach their own `thread state`_: manual use of :c:func:`PyThreadState_New` / :c:func:`PyThreadState_Swap`, and :c:func:`PyGILState_Ensure`. The latter, :c:func:`PyGILState_Ensure`, is `significantly more common `_. + +``PyGILState`` generally crashes during finalization +**************************************************** + +As of this PEP, the current behavior of :c:func:`PyGILState_Ensure` does not match the documentation. Instead of hanging the thread during finalization as previously noted, it's extremely common for it to crash with a segmentation fault. This is a `known issue `_ that could, in theory, be fixed in CPython, but it's definitely worth noting here. Incidentally, acceptance and implementation of this PEP will likely fix the existing crashes caused by :c:func:`PyGILState_Ensure`. 
+ +``PyGILState`` is tricky for free-threading +******************************************* + +A large issue with the term "GIL" in the C API is that it's semantically misleading, as noted in `gh-127989 `_ (disclaimer: the author of this PEP also authored that issue): + + The biggest issue is that for free-threading, there is no GIL, so users erroneously call the C API inside ``Py_BEGIN_ALLOW_THREADS`` blocks or omit ``PyGILState_Ensure`` in fresh threads. + +Subinterpreters don't work with ``PyGILState`` +---------------------------------------------- + +As noted in the `documentation `_, ``PyGILState`` APIs aren't officially supported in subinterpreters: + + Note that the ``PyGILState_*`` functions assume there is only one global interpreter (created automatically by ``Py_Initialize()``). Python supports the creation of additional interpreters (using ``Py_NewInterpreter()``), but mixing multiple interpreters and the ``PyGILState_*`` API is unsupported. + +More technically, this is because ``PyGILState_Ensure`` doesn't have any way to know which interpreter created the thread, and as such, it has to assume that it was the main interpreter. There isn't any way to detect this at runtime, so spurious races are bound to come up in threads created by subinterpreters, because synchronization for the wrong interpreter will be used on objects shared between the threads. + +Interpreters can concurrently shut down +*************************************** + +The other way of creating a native thread that can invoke Python, :c:func:`PyThreadState_New` / :c:func:`PyThreadState_Swap`, is a lot better for supporting subinterpreters (because :c:func:`PyThreadState_New` takes an explicit interpreter, rather than assuming that the main interpreter was intended), but is still limited by the current API. 
+ +In particular, subinterpreters typically have a much shorter lifetime than the main interpreter, and as such, there's not necessarily a guarantee that a :c:type:`PyInterpreterState` (acquired by :c:func:`PyInterpreterState_Get`) passed to a fresh thread will still be alive. Similarly, a :c:type:`PyInterpreterState` pointer could have been replaced with a *new* interpreter, causing all sorts of unknown issues. + +Rationale +========= + +This PEP includes several new APIs that intend to fix all of the issues stated above. + +Bikeshedding and the ``PyThreadState`` namespace +------------------------------------------------ + +To solve the issue with "GIL" terminology, the new functions intended as replacements for ``PyGILState`` will go under the existing ``PyThreadState`` namespace. In Python 3.14, the documentation has been `updated `_ to switch over to terms using "thread state" instead of "global interpreter lock" or "GIL," so this namespace seems to fit well for the functions in this PEP. + +Full deprecation of ``PyGILState`` +---------------------------------- + +As made clear in the motivation, ``PyGILState`` is already pretty buggy, and even if it was magically fixed, the current behavior of hanging the thread is beyond repair. As such, this PEP intends to completely deprecate the existing ``PyGILState`` APIs. 
However, even if this PEP is rejected, all of the APIs can be replaced with more correct ``PyThreadState`` functions in the current C API: + +- :c:func:`PyGILState_Ensure`: :c:func:`PyThreadState_Swap` / :c:func:`PyThreadState_New` +- :c:func:`PyGILState_Release`: :c:func:`PyThreadState_Clear` / :c:func:`PyThreadState_Delete` +- :c:func:`PyGILState_GetThisThreadState`: :c:func:`PyThreadState_Get` +- :c:func:`PyGILState_Check`: ``PyThreadState_GetUnchecked() != NULL`` + +Hiding away thread state details +-------------------------------- + +This API intentionally has a layer of "magic" that is kept from the user, for simplicity's sake in the transition from ``PyGILState`` and for ease-of-use on those that wrap the C API, such as in Cython or PyO3. + +See also :ref:`Activate Deactivate Instead`. + +Specification +============= + +Interpreter reference counts +---------------------------- + +.. c:function:: PyInterpreterState *PyInterpreterState_Hold(void) + + Similar to :c:func:`PyInterpreterState_Get`, but returns a strong reference to the interpreter (meaning, it has its reference count incremented by one, temporarily preventing the interpreter from shutting down). + + This function is generally meant to be used in tandem with :c:func:`PyThreadState_Ensure`, and cannot fail. + +.. c:function:: void PyInterpreterState_Release(PyInterpreterState *interp) + + Decrement the reference count of the interpreter. This function mainly exists for completeness, and should rarely be used; nearly all references returned by :c:func:`PyInterpreterState_Hold` should be released by :c:func:`PyThreadState_Ensure`. + + This function cannot fail. + +Daemon and non-daemon threads +----------------------------- + +.. c:function:: int PyThreadState_PreventShutdown(void) + + Mark the `attached thread state`_ as "non-daemon," meaning the current interpreter will wait for this thread to call :c:func:`PyThreadState_Delete` before shutting down. 
+ The attached thread state must not be the main thread for the interpreter. + + Return zero on success, non-zero *without* an exception set on failure. Failure generally means that native threads have already finalized for the current interpreter. + +.. c:function:: void PyThreadState_AllowShutdown(void) + + Mark the `attached thread state`_ as "daemon," allowing the current interpreter to finalize without waiting for this thread to finish. The attached thread state must not be the main thread for the interpreter. Note that all thread states that aren't created by :c:func:`PyThreadState_Ensure` are daemon by default. + + This function cannot fail, but after calling this function, or while calling this function, Python may hang this thread. + +Ensuring and releasing thread states +------------------------------------ + +.. c:function:: int PyThreadState_Ensure(PyInterpreterState *interp) + + Ensure that the thread has an `attached thread state`_ for *interp*, and thus can safely invoke that interpreter. + It is OK to call this function if the thread already has an attached thread state, as long as there is a subsequent call to :c:func:`PyThreadState_Release` that matches this one. + + This function steals a reference to *interp*; as in, the interpreter's reference count is decremented by one. + + Thread states created by this function are automatically "non-daemon," and as such, they prevent the interpreter specified by *interp* from shutting down. + + Return zero on success, and non-zero with the old `attached thread state`_ restored (which may have been ``NULL``). + +.. c:function:: void PyThreadState_Release() + + Detach and destroy the `attached thread state`_ set by :c:func:`PyThreadState_Ensure`. + + This function cannot fail, but may hang the thread if the `attached thread state`_ prior to the original :c:func:`PyThreadState_Ensure` was daemon, and if its interpreter is finalizing. 
+ +Deprecation of ``PyGILState`` +----------------------------- + +This PEP deprecates all of the existing ``PyGILState`` APIs in favor of the new ``PyThreadState`` APIs for the reasons given in the motivation. Namely: + +- :c:func:`PyGILState_Ensure`: use :c:func:`PyThreadState_Ensure` instead. +- :c:func:`PyGILState_Release`: use :c:func:`PyThreadState_Release` instead. +- :c:func:`PyGILState_GetThisThreadState`: use :c:func:`PyThreadState_Get` or :c:func:`PyThreadState_GetUnchecked` instead. +- :c:func:`PyGILState_Check`: use ``PyThreadState_GetUnchecked() != NULL`` instead. + +All of the ``PyGILState`` APIs are to be removed from the non-limited C API in Python 3.25. They will remain available in the limited API for compatibility. + +Backwards Compatibility +======================= + +This PEP specifies a breaking change with the removal of all the ``PyGILState`` APIs from the non-limited C API in 10 years (Python 3.25). + +Reference Implementation +======================== + +TBD. + +Rejected Ideas +============== + +Using an interpreter ID instead of a interpreter state +------------------------------------------------------ + +Some iterations of this API took an ``int64_t interp_id`` parameter instead of ``PyInterpreterState *interp``, because interpreter IDs cannot be concurrently deleted and cause use-after-free violations. However, :c:type:`PyInterpreterState` pointers are a lot simpler to use, and :c:func:`PyInterpreterState_Hold` prevents the interpreter from finalizing until :c:func:`PyThreadState_Ensure` is called anyway. + +.. 
_Activate Deactivate Instead: + +Exposing an ``Activate``/``Deactivate`` API instead of ``Ensure``/``Clear`` +--------------------------------------------------------------------------- + +In prior discussions of this API, it was `suggested `_ to provide actual :c:type:`PyThreadState` pointers in the API in an attempt to make the ownership and lifetime of the thread state clearer: + + More importantly though, I think this makes it clearer who owns the thread state - a manually created one is controlled by the code that created it, and once it's deleted it can't be activated again. + +This was ultimately rejected for two reasons: + +1. The proposed API has closer usage to :c:func:`PyGILState_Ensure` / :c:func:`PyGILState_Release`, which helps ease the transition for old codebases. +2. It's `significantly easier `_ for code-generators like Cython to use, as there isn't any additional complexity with tracking :c:type:`PyThreadState` pointers around. + +Open Issues +=========== + +Use ``PyStatus`` for the return value of ``PyThreadState_Ensure``? +------------------------------------------------------------------ + +:c:func:`PyThreadState_Ensure` returns an integer to return failures, but some iterations have suggested the use of :c:type:`PyStatus` to denote failure, which has the benefit of providing an error message. The main hesitation for switching to ``PyStatus`` is that it's more difficult to use, as the ``PyStatus`` has to be stored and checked, whereas a simple integer can simply be used inline with an ``if`` clause. + +Additionally, it's `not clear `_ that an error message would be all that useful; all the conceived use-cases for this API wouldn't really care about a message indicating why Python can't be invoked. + +Footnotes +========= + +.. _Thread State: https://docs.python.org/3.14/glossary.html#term-thread-state +.. 
_Attached Thread State: https://docs.python.org/3.14/glossary.html#term-attached-thread-state + +Copyright +========= + +This document is placed in the public domain or under the +CC0-1.0-Universal license, whichever is more permissive. From bc24887e7ac9bfcaea7f58ce3079c7acc1e76308 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 15:47:39 -0400 Subject: [PATCH 02/51] Second draft. --- peps/pep-0788.rst | 213 ++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 186 insertions(+), 27 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 77aeecf7d9a..336ebacb774 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -1,26 +1,38 @@ PEP: 788 -Title: Correct C APIs for native threads +Title: Reimagining native threads Author: Peter Bierma Sponsor: Victor Stinner Discussions-To: Pending Status: Draft Type: Standards Track -Created: 19-Apr-2025 +Created: 23-Apr-2025 Python-Version: 3.15 -Post-History: +Post-History: `10-Mar-2025 `_ Abstract ======== -:c:func:`PyGILState_Ensure` (and friends) is the most common way to create native threads that interact with Python, but it's unfortunately quite broken, misleading, and most importantly limiting for users. In particular, :c:func:`PyGILState_Ensure` and related functions have the following issues: +:c:func:`PyGILState_Ensure`, :c:func:`PyGILState_Release`, and other related functions in the ``PyGILState`` family are the most common way to create native threads that interact with Python, and have been the standard for over twenty years (:pep:`311`). But, as Python has grown, these functions have become problematic: -- They aren't safe for finalization, either hanging the calling thread or crashing it with a segmentation fault. -- Subinterpreters don't play nicely with them, because they all assume that the main interpreter is the only one that exists. -- The term "GIL" in the name is confusing for both users of free-threaded CPython and users of subinterpreters. 
+- They aren't safe for finalization, either hanging the calling thread or crashing it with a segmentation fault, preventing further execution. +- When they're called before finalization, they force the thread to be "daemon," meaning that the interpreter won't wait for it to reach any point of execution. This is mostly frustrating for developers, but can lead to deadlocks! +- Subinterpreters don't play nicely with them, because they all assume that the main interpreter is the only one that exists. A fresh thread (*i.e.*, one that has never had a thread state) that calls :c:func:`PyGILState_Ensure` will always get a thread state for the main interpreter. +- The term "GIL" in the name is quite confusing for users of free-threaded Python. There isn't a GIL, so why do they still have to call it? -This PEP intends to fix all of these issues by starting from scratch and providing new, semantically clear, and most importantly thread-safe APIs for interacting with Python from native threads. +This PEP intends to fix all of these issues by providing :c:func:`PyThreadState_Ensure` and :c:func:`PyThreadState_Release` as a more correct and safer replacement for :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. For example: +.. code-block:: C + + if (PyThreadState_Ensure(interp) < 0) { + fputs("Python is shutting down", stderr); + return; + } + + /* Interact with Python, without worrying about finalization. */ + // ... + + PyThreadState_Release(); Motivation ========== @@ -46,10 +58,27 @@ The `docs `_ curr Unfortunately, this isn't correct, because of time-of-call to time-of-use issues; the interpreter might not be finalizing during the call to :c:func:`Py_IsFinalizing`, but it might start finalizing right after, which would cause the attachment of a thread state (typically via :c:func:`PyGILState_Ensure`) to hang the thread. 
+Daemon threads can cause finalization deadlocks +*********************************************** + +When acquiring locks, it's extremely important to detach the thread state to prevent deadlocks. This is true on both the GIL-ful and free-threaded builds. In a GIL-ful build, a deadlock can occur pretty easily when acquiring a lock if the GIL wasn't released, and lock-ordering deadlocks can still occur in free-threaded builds if the thread state wasn't detached. + +So, all code that needs to work with locks needs to detach the thread state. In C, this is almost always done via :c:macro:`Py_BEGIN_ALLOW_THREADS` and :c:macro:`Py_END_ALLOW_THREADS`, in a code block that looks something like this: + +.. code-block:: C + + Py_BEGIN_ALLOW_THREADS + acquire_lock(); + Py_END_ALLOW_THREADS + +Again, in a daemon thread, :c:macro:`Py_END_ALLOW_THREADS` will hang the thread if the interpreter is finalizing. But, :c:macro:`Py_BEGIN_ALLOW_THREADS` will *not* hang the thread; the lock will be acquired, and *then* the thread will be hung! Once that happens, nothing can try to acquire that lock without deadlocking. The main thread will continue to run finalizers past that point, though. If any of those finalizers try to acquire the lock, deadlock ensues. + +This affects the Python core itself, and there's not much that can be done to fix it. For example, `gh-129536 `_ remarks that the :mod:`ssl` module can emit a fatal error when used at finalization, because a daemon thread got hung while holding the lock. There are workarounds for this for pure-Python code, but native threads don't have such an option. + We can't change finalization behavior for ``PyGILState_Ensure`` *************************************************************** -There will always have to be a point in a Python programs where :c:func:`PyGILState_Ensure` can no longer acquire the GIL. If the interpreter is long dead, then Python obviously can't give a thread a way to invoke it. 
:c:func:`PyGILState_Ensure` doesn't have any meaningful way to return a failure, so it has no choice but to terminate the thread or emit a fatal error, as noted in `gh-124622 `_: +There will always have to be a point in a Python program where :c:func:`PyGILState_Ensure` can no longer acquire the GIL (or more correctly, attach a thread state). If the interpreter is long dead, then Python obviously can't give a thread a way to invoke it. :c:func:`PyGILState_Ensure` doesn't have any meaningful way to return a failure, so it has no choice but to terminate the thread or emit a fatal error, as noted in `gh-124622 `_: I think a new GIL acquisition and release C API would be needed. The way the existing ones get used in existing C code is not amenible to suddenly bolting an error state onto; none of the existing C code is written that way. After the call they always just assume they have the GIL and can proceed. The API was designed as "it'll block and only return once it has the GIL" without any other option. @@ -70,6 +99,8 @@ A large issue with the term "GIL" in the C API is that it's semantically mislead The biggest issue is that for free-threading, there is no GIL, so users erroneously call the C API inside ``Py_BEGIN_ALLOW_THREADS`` blocks or omit ``PyGILState_Ensure`` in fresh threads. +In reality, :c:func:`PyGILState_Ensure` doesn't just "acquire the GIL" in modern versions. It attaches a `thread state`_ for the current thread--*that's* what lets a thread invoke the C API. On GIL-ful builds, holding an `attached thread state`_ implies holding the GIL, so only one thread can have one at a time. Free-threaded builds achieve the effect of multi-core parallelism while remaining backwards-compatible by simply removing that limitation: threads still need a thread state (and thus need to call :c:func:`PyGILState_Ensure`), but they don't need to wait on one another to do so. 
+ Subinterpreters don't work with ``PyGILState`` ---------------------------------------------- @@ -84,7 +115,7 @@ Interpreters can concurrently shut down The other way of creating a native thread that can invoke Python, :c:func:`PyThreadState_New` / :c:func:`PyThreadState_Swap`, is a lot better for supporting subinterpreters (because :c:func:`PyThreadState_New` takes an explicit interpreter, rather than assuming that the main interpreter was intended), but is still limited by the current API. -In particular, subinterpreters typically have a much shorter lifetime than the main interpreter, and as such, there's not necessarily a guarantee that a :c:type:`PyInterpreterState` (acquired by :c:func:`PyInterpreterState_Get`) passed to a fresh thread will still be alive. Similarly, a :c:type:`PyInterpreterState` pointer could have been replaced with a *new* interpreter, causing all sorts of unknown issues. +In particular, subinterpreters typically have a much shorter lifetime than the main interpreter, and as such, there's not necessarily a guarantee that a :c:type:`PyInterpreterState` (acquired by :c:func:`PyInterpreterState_Get`) passed to a fresh thread will still be alive. Similarly, a :c:type:`PyInterpreterState` pointer could have been replaced with a *new* interpreter, causing all sorts of unknown issues. They are also subject to all the finalization related hanging mentioned previously. Rationale ========= @@ -116,48 +147,60 @@ See also :ref:`Activate Deactivate Instead`. Specification ============= -Interpreter reference counts ----------------------------- +Interpreter reference counting +------------------------------ + +Internally, the interpreter will have to keep track of a reference count field, which will determine when the interpreter state is actually deallocated. This is done to prevent use-after-free crashes in :c:func:`PyThreadState_Ensure` for interpreters with short lifetimes. 
+ +An interpreter state returned by :c:func:`Py_NewInterpreter` (or more technically, :c:func:`PyInterpreterState_New`) will start with a reference count of 1, and :c:func:`PyInterpreterState_Delete` will decrement the reference count. If the new reference count is zero, :c:func:`PyInterpreterState_Delete` will deallocate the interpreter state. However, the reference count will *not* prevent the interpreter from finalizing. .. c:function:: PyInterpreterState *PyInterpreterState_Hold(void) - Similar to :c:func:`PyInterpreterState_Get`, but returns a strong reference to the interpreter (meaning, it has its reference count incremented by one, temporarily preventing the interpreter from shutting down). + Similar to :c:func:`PyInterpreterState_Get`, but returns a strong reference to the interpreter (meaning, it has its reference count incremented by one, allowing the returned interpreter state to be safely accessed by another thread). + + This function is generally meant to be used in tandem with :c:func:`PyThreadState_Ensure`. + + The caller must have an `attached thread state`_. This function cannot return a failure. - This function is generally meant to be used in tandem with :c:func:`PyThreadState_Ensure`, and cannot fail. .. c:function:: void PyInterpreterState_Release(PyInterpreterState *interp) Decrement the reference count of the interpreter. This function mainly exists for completeness, and should rarely be used; nearly all references returned by :c:func:`PyInterpreterState_Hold` should be released by :c:func:`PyThreadState_Ensure`. - This function cannot fail. + This function cannot fail, other than with a fatal error. The caller must have an `attached thread state`_ for *interp*. + Daemon and non-daemon threads ----------------------------- -.. c:function:: int PyThreadState_PreventShutdown(void) +This PEP introduces the concept of non-daemon thread states. 
By default, all threads created without the :mod:`threading` module will hang when trying to attach a thread state for a finalizing interpreter (in fact, daemon threads that *are* created with the :mod:`threading` module will hang in the same way). This generally happens when a thread calls :c:func:`PyEval_RestoreThread` or in between bytecode instructions, based on :func:`sys.setswitchinterval`. - Mark the `attached thread state`_ as "non-daemon," meaning the current interpreter will wait for this thread to call :c:func:`PyThreadState_Delete` before shutting down. - The attached thread state must not be the main thread for the interpreter. +A new, internal field will be added to the ``PyThreadState`` structure that determines if the thread is daemon. If the thread is daemon, then it will hang during attachment as usual, but if it's not, then the interpreter will let the thread attach and continue execution. On GIL-ful builds, this again means handing off the GIL to the thread. During finalization, the interpreter will wait until all non-daemon threads call :c:func:`PyThreadState_Delete`. - Return zero on success, non-zero *without* an exception set on failure. Failure generally means that native threads have already finalized for the current interpreter. +For backwards compatibility, all thread states created by existing APIs will remain daemon by default. -.. c:function:: void PyThreadState_AllowShutdown(void) +.. c:function:: int PyThreadState_SetDaemon(int is_daemon) - Mark the `attached thread state`_ as "daemon," allowing the current interpreter to finalize without waiting for this thread to finish. The attached thread state must not be the main thread for the interpreter. Note that all thread states that aren't created by :c:func:`PyThreadState_Ensure` are daemon by default. + Set the `attached thread state`_ as non-daemon or daemon. The attached thread state must not be the main thread for the interpreter. 
+ All thread states created without :c:func:`PyThreadState_Ensure` are daemon by default. + + If the thread state is non-daemon, then the current interpreter will wait for this thread to finish before shutting down. See also :meth:`threading.Thread.setDaemon`. - This function cannot fail, but after calling this function, or while calling this function, Python may hang this thread. + Return zero on success, non-zero *without* an exception set on failure. Failure generally means that threads have already finalized for the current interpreter. Ensuring and releasing thread states ------------------------------------ +This proposal includes two new high-level threading APIs that intend to replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. + .. c:function:: int PyThreadState_Ensure(PyInterpreterState *interp) Ensure that the thread has an `attached thread state`_ for *interp*, and thus can safely invoke that interpreter. It is OK to call this function if the thread already has an attached thread state, as long as there is a subsequent call to :c:func:`PyThreadState_Release` that matches this one. - This function steals a reference to *interp*; as in, the interpreter's reference count is decremented by one. + This function always steals a reference to *interp*; as in, the interpreter's reference count is decremented by one. As such, *interp* should have been acquired by :c:func:`PyInterpreterState_Hold`. - Thread states created by this function are automatically "non-daemon," and as such, they prevent the interpreter specified by *interp* from shutting down. + Thread states created by this function are non-daemon by default. See :c:func:`PyThreadState_SetDaemon`. If the calling thread already has an `attached thread state`_ that matches *interp*, then this function will simply mark the existing thread state as non-daemon and return. It will be restored to its prior daemon status upon the next :c:func:`PyThreadState_Release` call. 
Return zero on success, and non-zero with the old `attached thread state`_ restored (which may have been ``NULL``). @@ -165,7 +208,7 @@ Ensuring and releasing thread states Detach and destroy the `attached thread state`_ set by :c:func:`PyThreadState_Ensure`. - This function cannot fail, but may hang the thread if the `attached thread state`_ prior to the original :c:func:`PyThreadState_Ensure` was daemon, and if its interpreter is finalizing. + This function cannot fail, but may hang the thread if the `attached thread state`_ prior to the original :c:func:`PyThreadState_Ensure` was daemon. Deprecation of ``PyGILState`` ----------------------------- @@ -177,12 +220,123 @@ This PEP deprecates all of the existing ``PyGILState`` APIs in favor of the new - :c:func:`PyGILState_GetThisThreadState`: use :c:func:`PyThreadState_Get` or :c:func:`PyThreadState_GetUnchecked` instead. - :c:func:`PyGILState_Check`: use ``PyThreadState_GetUnchecked() != NULL`` instead. -All of the ``PyGILState`` APIs are to be removed from the non-limited C API in Python 3.25. They will remain available in the limited API for compatibility. +All of the ``PyGILState`` APIs are to be removed from the non-limited C API in Python 3.25. They will remain available in the stable API for compatibility. Backwards Compatibility ======================= -This PEP specifies a breaking change with the removal of all the ``PyGILState`` APIs from the non-limited C API in 10 years (Python 3.25). +This PEP specifies a breaking change with the removal of all the ``PyGILState`` APIs from the public headers of the non-limited C API in 10 years (Python 3.25). + +How to Teach This +================= + +As with all C API functions, all the new APIs in this PEP will be documented in the C API documentation, ideally under the `Non-Python created threads `_ section. The existing `High level API `_ section, containing most of the ``PyGILState`` documentation, should be updated accordingly to point to the new APIs. 
+ +Examples +-------- + +These examples are here to help understand the APIs described in this PEP. Ideally, they could be reused in the documentation. + +Single-threaded example +*********************** + +.. code-block:: C + + static PyObject * + my_critical_operation(PyObject *self, PyObject *unused) + { + assert(PyThreadState_GetUnchecked() != NULL); + PyInterpreterState *interp = PyInterpreterState_Hold(); + if (PyThreadState_Ensure(interp) < 0) { + PyErr_SetString(PyExc_RuntimeError, "interpreter is shutting down"); + return NULL; + } + + Py_BEGIN_ALLOW_THREADS; + acquire_some_lock(); + /* If this were to be a daemon thread, then the interpreter could + hang the thread while reattaching the thread state, leaving us + with the lock held. Any future finalizer that wanted to acquire the + lock would be deadlocked! + */ + Py_END_ALLOW_THREADS; + + PyThreadState_Release(); + Py_RETURN_NONE; + } + +Transition from ``PyGILState`` example +************************************** + +The following code uses the old ``PyGILState`` APIs: + +.. code-block:: C + + static int + thread_func(void *arg) + { + PyGILState_STATE gstate = PyGILState_Ensure(); + /* It's not an issue in this example, but we just attached + a thread state for the main interpreter. If my_method() was + originally called in a subinterpreter, then we would be unable + to safely interact with any objects from it. */ + if (PyRun_SimpleString("print(42)") < 0) { + PyErr_Print(); + } + PyGILState_Release(gstate); + return 0; + } + + static PyObject * + my_method(PyObject *self, PyObject *unused) + { + PyThread_handle_t handle; + PyThread_ident_t ident; + + if (PyThread_start_joinable_thread(thread_func, NULL, &ident, &handle) < 0) { + return NULL; + } + Py_BEGIN_ALLOW_THREADS; + PyThread_join_thread(handle); + Py_END_ALLOW_THREADS; + Py_RETURN_NONE; + } + +This is the same code, updated to use the new functions: + +..
code-block:: C + + static int + thread_func(void *arg) + { + PyInterpreterState *interp = (PyInterpreterState *)arg; + if (PyThreadState_Ensure(interp) < 0) { + fputs("Cannot talk to Python", stderr); + return -1; + } + if (PyRun_SimpleString("print(42)") < 0) { + PyErr_Print(); + } + PyThreadState_Release(); + return 0; + } + + static PyObject * + my_method(PyObject *self, PyObject *unused) + { + PyThread_handle_t handle; + PyThread_ident_t ident; + + PyInterpreterState *interp = PyInterpreterState_Hold(); + if (PyThread_start_joinable_thread(thread_func, interp, &ident, &handle) < 0) { + return NULL; + } + Py_BEGIN_ALLOW_THREADS + PyThread_join_thread(handle); + Py_END_ALLOW_THREADS + Py_RETURN_NONE; + } + Reference Implementation ======================== @@ -221,6 +375,11 @@ Use ``PyStatus`` for the return value of ``PyThreadState_Ensure``? Additionally, it's `not clear `_ that an error message would be all that useful; all the conceived use-cases for this API wouldn't really care about a message indicating why Python can't be invoked. +When should ``PyGILState`` be removed? +-------------------------------------- + +:c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release` have been around for over two decades, and it's expected that the migration will be difficult. Currently, the plan is to remove them in 10 years (as opposed to the 5 years required by :pep:`387`), but this is subject to further discussion, as it's unclear if that's enough (or too much) time. + Footnotes ========= From 4aa0dd6384165ae59470669f91eebfd89f8df3ce Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 16:14:27 -0400 Subject: [PATCH 03/51] Wrap lines. --- peps/pep-0788.rst | 357 ++++++++++++++++++++++++++++++++++++---------- 1 file changed, 283 insertions(+), 74 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 336ebacb774..e43a64bc0b5 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -13,14 +13,29 @@ Post-History: `10-Mar-2025 `_.
For example, a callback that wants to call Python code might be invoked by: +Many codebases might need to call Python code in highly-asynchronous +situations where the interpreter is already finalizing, or might finalize, and want to continue +running code after the Python call. This desire has been +`brought up by users `_. +For example, a callback that wants to call Python code might be invoked by: - A kernel has finished running on the GPU - A network packet was received. -- A thread has quit, and the C++ library is executing static finalizers of thread local storage. - -In the current C API, any non-Python thread (*i.e.*, not created by :mod:`threading`) is considered to be "daemon," meaning that the interpreter won't wait on that thread to finalize. Instead, the interpreter will hang the thread when it goes to `attach `_ a `thread state`_, making it unusable past that point. Attaching a thread state can happen at any point when invoking Python, such as releasing the GIL in-between bytecode instructions, or when a C function exits a :c:macro:`Py_BEGIN_ALLOW_THREADS` block. - -This means that any non-Python thread may be terminated at any point, which is severely limiting for users who want to do more than just execute Python code in their stream of calls (for example, C++ executing finalizers in *addition* to calling Python). +- A thread has quit, and the C++ library is executing static finalizers of + thread local storage. + +In the current C API, any non-Python thread (*i.e.*, not created by +:mod:`threading`) is considered to be "daemon," meaning that the interpreter +won't wait on that thread to finalize. Instead, the interpreter will hang the +thread when it goes to `attach `_ a `thread state`_, +making it unusable past that point. Attaching a thread state can happen at +any point when invoking Python, such as releasing the GIL in-between bytecode +instructions, or when a C function exits a :c:macro:`Py_BEGIN_ALLOW_THREADS` +block. 
+ +This means that any non-Python thread may be terminated at any point, which +is severely limiting for users who want to do more than just execute Python +code in their stream of calls (for example, C++ executing finalizers in +*addition* to calling Python). ``Py_IsFinalizing`` is insufficient *********************************** -The `docs `_ currently recommend :c:func:`Py_IsFinalizing` to guard against termination of the thread: +The `docs `_ +currently recommend :c:func:`Py_IsFinalizing` to guard against termination of +the thread: - Calling this function from a thread when the runtime is finalizing will terminate the thread, even if the thread was not created by Python. You can use Py_IsFinalizing() or sys.is_finalizing() to check if the interpreter is in process of being finalized before calling this function to avoid unwanted termination. + Calling this function from a thread when the runtime is finalizing will + terminate the thread, even if the thread was not created by Python. You + can use Py_IsFinalizing() or sys.is_finalizing() to check if the + interpreter is in process of being finalized before calling this function + to avoid unwanted termination. -Unfortunately, this isn't correct, because of time-of-call to time-of-use issues; the interpreter might not be finalizing during the call to :c:func:`Py_IsFinalizing`, but it might start finalizing right after, which would cause the attachment of a thread state (typically via :c:func:`PyGILState_Ensure`) to hang the thread. +Unfortunately, this isn't correct, because of time-of-call to time-of-use +issues; the interpreter might not be finalizing during the call to +:c:func:`Py_IsFinalizing`, but it might start finalizing right after, which +would cause the attachment of a thread state (typically via +:c:func:`PyGILState_Ensure`) to hang the thread. 
Daemon threads can cause finalization deadlocks *********************************************** -When acquiring locks, it's extremely important to detach the thread state to prevent deadlocks. This is true on both the GIL-ful and free-threaded builds. In a GIL-icious build, a deadlock can occur pretty easily when acquiring a lock if the GIL wasn't released, and lock-ordering deadlocks can still occur free-threaded builds if the thread state wasn't detached. +When acquiring locks, it's extremely important to detach the thread state to +prevent deadlocks. This is true on both the GIL-ful and free-threaded builds. +In a GIL-ful build, a deadlock can occur pretty easily when acquiring a +lock if the GIL wasn't released, and lock-ordering deadlocks can still occur +on free-threaded builds if the thread state wasn't detached. -So, all code that needs to work with locks need to detach the thread state. In C, this is almost always done via :c:macro:`Py_BEGIN_ALLOW_THREADS` and :c:macro:`Py_END_ALLOW_THREADS`, in a code block that looks something like this: +So, all code that needs to work with locks needs to detach the thread state. +In C, this is almost always done via :c:macro:`Py_BEGIN_ALLOW_THREADS` and +:c:macro:`Py_END_ALLOW_THREADS`, in a code block that looks something like this: .. code-block:: C @@ -71,66 +118,140 @@ So, all code that needs to work with locks need to detach the thread state. In C acquire_lock(); Py_END_ALLOW_THREADS -Again, in a daemon thread, :c:macro:`Py_END_ALLOW_THREADS` will hang the thread if the interpreter is finalizing. But, :c:macro:`Py_BEGIN_ALLOW_THREADS` will *not* hang the thread; the lock will be acquired, and *then* hung! Once that happens, nothing can try to acquire that lock without deadlocking. The main thread will continue to run finalizers past that point, though. If any of those finalizers try to acquire the lock, deadlock ensues.
+Again, in a daemon thread, :c:macro:`Py_END_ALLOW_THREADS` will hang the thread +if the interpreter is finalizing. But, :c:macro:`Py_BEGIN_ALLOW_THREADS` will +*not* hang the thread; the lock will be acquired, and *then* hung! Once that +happens, nothing can try to acquire that lock without deadlocking. The main +thread will continue to run finalizers past that point, though. If any of +those finalizers try to acquire the lock, deadlock ensues. -This affects the Python core itself, and there's not much that can be done to fix it. For example, `gh-129536 `_ remarks that the :mod:`ssl` module can emit a fatal error when used at finalization, because a daemon thread got hung while holding the lock. There are workarounds for this for pure-Python code, but native threads don't have such an option. +This affects the Python core itself, and there's not much that can be done +to fix it. For example, `gh-129536 `_ +remarks that the :mod:`ssl` module can emit a fatal error when used at +finalization, because a daemon thread got hung while holding the lock. There +are workarounds for this for pure-Python code, but native threads don't have +such an option. We can't change finalization behavior for ``PyGILState_Ensure`` *************************************************************** -There will always have to be a point in a Python program where :c:func:`PyGILState_Ensure` can no longer acquire the GIL (or more correctly, attach a thread state). If the interpreter is long dead, then Python obviously can't give a thread a way to invoke it. :c:func:`PyGILState_Ensure` doesn't have any meaningful way to return a failure, so it has no choice but to terminate the thread or emit a fatal error, as noted in `gh-124622 `_: - - I think a new GIL acquisition and release C API would be needed. The way the existing ones get used in existing C code is not amenible to suddenly bolting an error state onto; none of the existing C code is written that way. 
After the call they always just assume they have the GIL and can proceed. The API was designed as "it'll block and only return once it has the GIL" without any other option. +There will always have to be a point in a Python program where +:c:func:`PyGILState_Ensure` can no longer acquire the GIL (or more correctly, +attach a thread state). If the interpreter is long dead, then Python +obviously can't give a thread a way to invoke it. +:c:func:`PyGILState_Ensure` doesn't have any meaningful way to return a +failure, so it has no choice but to terminate the thread or emit a fatal +error, as noted in `gh-124622 `_: + + I think a new GIL acquisition and release C API would be needed. The way + the existing ones get used in existing C code is not amenible to suddenly + bolting an error state onto; none of the existing C code is written that + way. After the call they always just assume they have the GIL and can + proceed. The API was designed as "it'll block and only return once it has + the GIL" without any other option. ``PyGILState`` is broken and misleading --------------------------------------- -There are currently two public ways for a user to create and attach their own `thread state`_; manual use of :c:func:`PyThreadState_New` / :c:func:`PyThreadState_Swap`, and :c:func:`PyGILState_Ensure`. The former, :c:func:`PyGILState_Ensure`, is `significantly more common `_. +There are currently two public ways for a user to create and attach their own +`thread state`_: manual use of :c:func:`PyThreadState_New` / :c:func:`PyThreadState_Swap`, +and :c:func:`PyGILState_Ensure`. The latter, :c:func:`PyGILState_Ensure`, +is `significantly more common `_. ``PyGILState`` generally crashes during finalization **************************************************** -As of this PEP, the current behavior of :c:func:`PyGILState_Ensure` does not match the documentation.
Instead of hanging the thread during finalization as previously noted, it's extremely common for it to crash with a segmentation fault. This is a `known issue `_ that could, in theory, be fixed in CPython, but it's definitely worth noting here. Incidentally, acceptance and implementation of this PEP will likely fix the existing crashes caused by :c:func:`PyGILState_Ensure`. +As of this PEP, the current behavior of :c:func:`PyGILState_Ensure` does not +match the documentation. Instead of hanging the thread during finalization +as previously noted, it's extremely common for it to crash with a segmentation +fault. This is a `known issue `_ +that could, in theory, be fixed in CPython, but it's definitely worth noting +here. Incidentally, acceptance and implementation of this PEP will likely fix +the existing crashes caused by :c:func:`PyGILState_Ensure`. ``PyGILState`` is tricky for free-threading ******************************************* -A large issue with the term "GIL" in the C API is that it's semantically misleading, as noted in `gh-127989 `_ (disclaimer: the author of this PEP also authored that issue): +A large issue with the term "GIL" in the C API is that it's semantically +misleading, as noted in `gh-127989 `_ +(disclaimer: the author of this PEP also authored that issue): - The biggest issue is that for free-threading, there is no GIL, so users erroneously call the C API inside ``Py_BEGIN_ALLOW_THREADS`` blocks or omit ``PyGILState_Ensure`` in fresh threads. + The biggest issue is that for free-threading, there is no GIL, so users + erroneously call the C API inside ``Py_BEGIN_ALLOW_THREADS`` blocks or + omit ``PyGILState_Ensure`` in fresh threads. -In reality, :c:func:`PyGILState_Ensure` doesn't just "acquire the GIL" in modern versions. It attaches a `thread state`_ for the current thread--*that's* what lets a thread invoke the C API. 
On GIL-ful builds, holding an `attached thread state`_ implies holding the GIL, so only one thread can have one at a time. Free-threaded builds achieve the effect of multi-core parallism while remaining backwards-compatible by simply removing that limitation: threads still need a thread state (and thus need to call :c:func:`PyGILState_Ensure`), but they don't need to wait on one another to do so. +In reality, :c:func:`PyGILState_Ensure` doesn't just "acquire the GIL" in +modern versions. It attaches a `thread state`_ for the current +thread--*that's* what lets a thread invoke the C API. On GIL-ful builds, +holding an `attached thread state`_ implies holding the GIL, so only one +thread can have one at a time. Free-threaded builds achieve the effect of +multi-core parallelism while remaining backwards-compatible by simply removing +that limitation: threads still need a thread state (and thus need to call +:c:func:`PyGILState_Ensure`), but they don't need to wait on one another to +do so. Subinterpreters don't work with ``PyGILState`` ---------------------------------------------- -As noted in the `documentation `_, ``PyGILState`` APIs aren't officially supported in subinterpreters: +As noted in the +`documentation `_, +``PyGILState`` APIs aren't officially supported in subinterpreters: - Note that the ``PyGILState_*`` functions assume there is only one global interpreter (created automatically by ``Py_Initialize()``). Python supports the creation of additional interpreters (using ``Py_NewInterpreter()``), but mixing multiple interpreters and the ``PyGILState_*`` API is unsupported. + Note that the ``PyGILState_*`` functions assume there is only one global + interpreter (created automatically by ``Py_Initialize()``). Python + supports the creation of additional interpreters (using + ``Py_NewInterpreter()``), but mixing multiple interpreters and the + ``PyGILState_*`` API is unsupported.
-More technically, this is because ``PyGILState_Ensure`` doesn't have any way to know which interpreter created the thread, and as such, it has to assume that it was the main interpreter. There isn't any way to detect this at runtime, so spurious races are bound to come up in threads created by subinterpreters, because synchronization for the wrong interpreter will be used on objects shared between the threads. +More technically, this is because ``PyGILState_Ensure`` doesn't have any way +to know which interpreter created the thread, and as such, it has to assume +that it was the main interpreter. There isn't any way to detect this at +runtime, so spurious races are bound to come up in threads created by +subinterpreters, because synchronization for the wrong interpreter will be +used on objects shared between the threads. Interpreters can concurrently shut down *************************************** -The other way of creating a native thread that can invoke Python, :c:func:`PyThreadState_New` / :c:func:`PyThreadState_Swap`, is a lot better for supporting subinterpreters (because :c:func:`PyThreadState_New` takes an explicit interpreter, rather than assuming that the main interpreter was intended), but is still limited by the current API. +The other way of creating a native thread that can invoke Python, +:c:func:`PyThreadState_New` / :c:func:`PyThreadState_Swap`, is a lot better +for supporting subinterpreters (because :c:func:`PyThreadState_New` takes an +explicit interpreter, rather than assuming that the main interpreter was intended), +but is still limited by the current API. -In particular, subinterpreters typically have a much shorter lifetime than the main interpreter, and as such, there's not necessarily a guarantee that a :c:type:`PyInterpreterState` (acquired by :c:func:`PyInterpreterState_Get`) passed to a fresh thread will still be alive. 
Similarly, a :c:type:`PyInterpreterState` pointer could have been replaced with a *new* interpreter, causing all sorts of unknown issues. They are also subject to all the finalization related hanging mentioned previously. +In particular, subinterpreters typically have a much shorter lifetime than the +main interpreter, and as such, there's not necessarily a guarantee that a +:c:type:`PyInterpreterState` (acquired by :c:func:`PyInterpreterState_Get`) +passed to a fresh thread will still be alive. Similarly, a +:c:type:`PyInterpreterState` pointer could have been replaced with a *new* +interpreter, causing all sorts of unknown issues. They are also subject to +all the finalization related hanging mentioned previously. Rationale ========= -This PEP includes several new APIs that intend to fix all of the issues stated above. +This PEP includes several new APIs that intend to fix all of the issues stated +above. Bikeshedding and the ``PyThreadState`` namespace ------------------------------------------------ -To solve the issue with "GIL" terminology, the new functions intended as replacements for ``PyGILState`` will go under the existing ``PyThreadState`` namespace. In Python 3.14, the documentation has been `updated `_ to switch over to terms using "thread state" instead of "global interpreter lock" or "GIL," so this namespace seems to fit well for the functions in this PEP. +To solve the issue with "GIL" terminology, the new functions intended as +replacements for ``PyGILState`` will go under the existing ``PyThreadState`` +namespace. In Python 3.14, the documentation has been +`updated `_ to switch over to +terms using "thread state" instead of "global interpreter lock" or "GIL," so +this namespace seems to fit well for the functions in this PEP. 
Full deprecation of ``PyGILState`` ---------------------------------- -As made clear in the motivation, ``PyGILState`` is already pretty buggy, and even if it was magically fixed, the current behavior of hanging the thread is beyond repair. As such, this PEP intends to completely deprecate the existing ``PyGILState`` APIs. However, even if this PEP is rejected, all of the APIs can be replaced with more correct ``PyThreadState`` functions in the current C API: +As made clear in the motivation, ``PyGILState`` is already pretty buggy, and +even if it was magically fixed, the current behavior of hanging the thread is +beyond repair. As such, this PEP intends to completely deprecate the existing +``PyGILState`` APIs. However, even if this PEP is rejected, all of the APIs +can be replaced with more correct ``PyThreadState`` functions in the current +C API: - :c:func:`PyGILState_Ensure`: :c:func:`PyThreadState_Swap` / :c:func:`PyThreadState_New` - :c:func:`PyGILState_Release`: :c:func:`PyThreadState_Clear` / :c:func:`PyThreadState_Delete` @@ -140,7 +261,9 @@ As made clear in the motivation, ``PyGILState`` is already pretty buggy, and eve Hiding away thread state details -------------------------------- -This API intentionally has a layer of "magic" that is kept from the user, for simplicity's sake in the transition from ``PyGILState`` and for ease-of-use on those that wrap the C API, such as in Cython or PyO3. +This API intentionally has a layer of "magic" that is kept from the user, for +simplicity's sake in the transition from ``PyGILState`` and for ease-of-use on +those that wrap the C API, such as in Cython or PyO3. See also :ref:`Activate Deactivate Instead`. @@ -150,43 +273,76 @@ Specification Interpreter reference counting ------------------------------ -Internally, the interpreter will have to keep track of a reference count field, which will determine when the interpreter state is actually deallocated. 
This is done to prevent use-after-free crashes in :c:func:`PyThreadState_Ensure` for interpreters with short lifetimes. +Internally, the interpreter will have to keep track of a reference count +field, which will determine when the interpreter state is actually +deallocated. This is done to prevent use-after-free crashes in +:c:func:`PyThreadState_Ensure` for interpreters with short lifetimes. -An interpreter state returned by :c:func:`Py_NewInterpreter` (or more technically, :c:func:`PyInterpreterState_New`) will start with a reference count of 1, and :c:func:`PyInterpreterState_Delete` will decrement the reference count. If the new reference count is zero, :c:func:`PyInterpreterState_Delete` will deallocate the interpreter state. However, the reference count will *not* prevent the interpreter from finalizing. +An interpreter state returned by :c:func:`Py_NewInterpreter` (or really, +:c:func:`PyInterpreterState_New`) will start with a reference count of 1, and +:c:func:`PyInterpreterState_Delete` will decrement the reference count. If the +new reference count is zero, :c:func:`PyInterpreterState_Delete` will +deallocate the interpreter state. However, the reference count will *not* +prevent the interpreter from finalizing. .. c:function:: PyInterpreterState *PyInterpreterState_Hold(void) - Similar to :c:func:`PyInterpreterState_Get`, but returns a strong reference to the interpreter (meaning, it has its reference count incremented by one, allowing the returned interpreter state to be safely accessed by another thread). + Similar to :c:func:`PyInterpreterState_Get`, but returns a strong + reference to the interpreter (meaning, it has its reference count + incremented by one, allowing the returned interpreter state to be safely + accessed by another thread). - This function is generally meant to be used in tandem with :c:func:`PyThreadState_Ensure`. + This function is generally meant to be used in tandem with + :c:func:`PyThreadState_Ensure`. 
- The caller must have an `attached thread state`_, and cannot return a failure. + The caller must have an `attached thread state`_. This function cannot + fail. .. c:function:: void PyInterpreterState_Release(PyInterpreterState *interp) - Decrement the reference count of the interpreter. This function mainly exists for completeness, and should rarely be used; nearly all references returned by :c:func:`PyInterpreterState_Hold` should be released by :c:func:`PyThreadState_Ensure`. + Decrement the reference count of the interpreter. This function mainly + exists for completeness, and should rarely be used; nearly all references + returned by :c:func:`PyInterpreterState_Hold` should be released by + :c:func:`PyThreadState_Ensure`. - This function cannot fail, other than with a fatal error. The caller must have an `attached thread state`_ for *interp*. + This function cannot fail, other than with a fatal error. The caller must + have an `attached thread state`_ for *interp*. Daemon and non-daemon threads ----------------------------- -This PEP introduces the concept of non-daemon thread states. By default, all threads created without the :mod:`threading` module will hang when trying to attach a thread state for a finalizing interpreter (in fact, daemon threads that *are* created with the :mod:`threading` module will hang in the same way). This generally happens when a thread calls :c:func:`PyEval_RestoreThread` or in between bytecode instructions, based on :func:`sys.setswitchinterval`. +This PEP introduces the concept of non-daemon thread states. By default, all +threads created without the :mod:`threading` module will hang when trying to +attach a thread state for a finalizing interpreter (in fact, daemon threads +that *are* created with the :mod:`threading` module will hang in the same +way). This generally happens when a thread calls :c:func:`PyEval_RestoreThread` +or in between bytecode instructions, based on :func:`sys.setswitchinterval`.
-A new, internal field will be added to the ``PyThreadState`` structure that determines if the thread is daemon. If the thread is daemon, then it will hang during attachment as usual, but if it's not, then the interpreter will let the thread attach and continue execution. On GIL-ful builds, this again means handing off the GIL to the thread. During finalization, the interpreter will wait until all non-daemon threads call :c:func:`PyThreadState_Delete`. +A new, internal field will be added to the ``PyThreadState`` structure that +determines if the thread is daemon. If the thread is daemon, then it will +hang during attachment as usual, but if it's not, then the interpreter will +let the thread attach and continue execution. On GIL-ful builds, this again +means handing off the GIL to the thread. During finalization, the interpreter +will wait until all non-daemon threads call :c:func:`PyThreadState_Delete`. -For backwards compatibility, all thread states created by existing APIs will remain daemon by default. +For backwards compatibility, all thread states created by existing APIs will +remain daemon by default. .. c:function:: int PyThreadState_SetDaemon(int is_daemon) - Set the `attached thread state`_ as non-daemon or daemon. The attached thread state must not be the main thread for the interpreter. - All thread states created without :c:func:`PyThreadState_Ensure` are daemon by default. + Set the `attached thread state`_ as non-daemon or daemon. The attached + thread state must not be the main thread for the interpreter. All thread + states created without :c:func:`PyThreadState_Ensure` are daemon by + default. - If the thread state is non-daemon, then the current interpreter will wait for this thread to finish before shutting down. See also :meth:`threading.Thread.setDaemon`. + If the thread state is non-daemon, then the current interpreter will wait + for this thread to finish before shutting down. See also :meth:`threading.Thread.setDaemon`. 
- Return zero on success, non-zero *without* an exception set on failure. Failure generally means that threads have already finalized for the current interpreter. + Return zero on success, non-zero *without* an exception set on failure. + Failure generally means that threads have already finalized for the + current interpreter. Ensuring and releasing thread states ------------------------------------ @@ -195,47 +351,73 @@ This proposal includes two new high-level threading APIs that intend to replace .. c:function:: int PyThreadState_Ensure(PyInterpreterState *interp) - Ensure that the thread has an `attached thread state`_ for *interp*, and thus can safely invoke that interpreter. - It is OK to call this function if the thread already has an attached thread state, as long as there is a subsequent call to :c:func:`PyThreadState_Release` that matches this one. + Ensure that the thread has an `attached thread state`_ for *interp*, and + thus can safely invoke that interpreter. It is OK to call this function if + the thread already has an attached thread state, as long as there is a + subsequent call to :c:func:`PyThreadState_Release` that matches this one. - This function always steals a reference to *interp*; as in, the interpreter's reference count is decremented by one. As such, *interp* should have been acquired by :c:func:`PyInterpreterState_Hold`. + This function always steals a reference to *interp*; as in, the + interpreter's reference count is decremented by one. As such, *interp* + should have been acquired by :c:func:`PyInterpreterState_Hold`. - Thread states created by this function are non-daemon by default. See :c:func:`PyThreadState_SetDaemon`. If the calling thread already has an `attached thread state`_ that matches *interp*, then this function will simply mark the existing thread state as non-daemon and return. It will be restored to its prior daemon status upon the next :c:func:`PyThreadState_Release` call. 
+ Thread states created by this function are non-daemon by default. See + :c:func:`PyThreadState_SetDaemon`. If the calling thread already has an + `attached thread state`_ that matches *interp*, then this function will + simply mark the existing thread state as non-daemon and return. It will + be restored to its prior daemon status upon the next + :c:func:`PyThreadState_Release` call. - Return zero on success, and non-zero with the old `attached thread state`_ restored (which may have been ``NULL``). + Return zero on success, and non-zero with the old `attached thread state`_ + restored (which may have been ``NULL``). .. c:function:: void PyThreadState_Release() - Detach and destroy the `attached thread state`_ set by :c:func:`PyThreadState_Ensure`. + Detach and destroy the `attached thread state`_ set by + :c:func:`PyThreadState_Ensure`. - This function cannot fail, but may hang the thread if the `attached thread state`_ prior to the original :c:func:`PyThreadState_Ensure` was daemon. + This function cannot fail, but may hang the thread if the + `attached thread state`_ prior to the original :c:func:`PyThreadState_Ensure` + was daemon. Deprecation of ``PyGILState`` ----------------------------- -This PEP deprecates all of the existing ``PyGILState`` APIs in favor of the new ``PyThreadState`` APIs for the reasons given in the motivation. Namely: +This PEP deprecates all of the existing ``PyGILState`` APIs in favor of the +new ``PyThreadState`` APIs for the reasons given in the motivation. Namely: - :c:func:`PyGILState_Ensure`: use :c:func:`PyThreadState_Ensure` instead. - :c:func:`PyGILState_Release`: use :c:func:`PyThreadState_Release` instead. -- :c:func:`PyGILState_GetThisThreadState`: use :c:func:`PyThreadState_Get` or :c:func:`PyThreadState_GetUnchecked` instead. -- :c:func:`PyGILState_Check`: use ``PyThreadState_GetUnchecked() != NULL`` instead. 
+- :c:func:`PyGILState_GetThisThreadState`: use :c:func:`PyThreadState_Get` or + :c:func:`PyThreadState_GetUnchecked` instead. +- :c:func:`PyGILState_Check`: use ``PyThreadState_GetUnchecked() != NULL`` + instead. -All of the ``PyGILState`` APIs are to be removed from the non-limited C API in Python 3.25. They will remain available in the stable API for compatibility. +All of the ``PyGILState`` APIs are to be removed from the non-limited C API in +Python 3.25. They will remain available in the stable API for compatibility. Backwards Compatibility ======================= -This PEP specifies a breaking change with the removal of all the ``PyGILState`` APIs from the public headers of the non-limited C API in 10 years (Python 3.25). +This PEP specifies a breaking change with the removal of all the +``PyGILState`` APIs from the public headers of the non-limited C API in 10 +years (Python 3.25). How to Teach This ================= -As with all C API functions, all the new APIs in this PEP will be documented in the C API documentation, ideally under the `Non-Python created threads `_ section. The existing `High level API `_ section, containing most of the ``PyGILState`` documentation, should be updated accordingly to point to the new APIs. +As with all C API functions, all the new APIs in this PEP will be documented +in the C API documentation, ideally under the +`Non-Python created threads `_ +section. The existing +`High level API `_ +section, containing most of the ``PyGILState`` documentation, should be +updated accordingly to point to the new APIs. Examples -------- -These examples are here to help understand the APIs described in this PEP. Ideally, they could be reused in the documentation. +These examples are here to help understand the APIs described in this PEP. +Ideally, they could be reused in the documentation. 
Single-threaded example *********************** @@ -349,21 +531,35 @@ Rejected Ideas Using an interpreter ID instead of a interpreter state ------------------------------------------------------ -Some iterations of this API took an ``int64_t interp_id`` parameter instead of ``PyInterpreterState *interp``, because interpreter IDs cannot be concurrently deleted and cause use-after-free violations. However, :c:type:`PyInterpreterState` pointers are a lot simpler to use, and :c:func:`PyInterpreterState_Hold` prevents the interpreter from finalizing until :c:func:`PyThreadState_Ensure` is called anyway. +Some iterations of this API took an ``int64_t interp_id`` parameter instead of +``PyInterpreterState *interp``, because interpreter IDs cannot be concurrently +deleted and cause use-after-free violations. However, +:c:type:`PyInterpreterState` pointers are a lot simpler to use, and +:c:func:`PyInterpreterState_Hold` prevents the interpreter from finalizing +until :c:func:`PyThreadState_Ensure` is called anyway. .. _Activate Deactivate Instead: Exposing an ``Activate``/``Deactivate`` API instead of ``Ensure``/``Clear`` --------------------------------------------------------------------------- -In prior discussions of this API, it was `suggested `_ to provide actual :c:type:`PyThreadState` pointers in the API in an attempt to make the ownership and lifetime of the thread state clearer: +In prior discussions of this API, it was +`suggested `_ +to provide actual :c:type:`PyThreadState` pointers in the API in an attempt to +make the ownership and lifetime of the thread state clearer: - More importantly though, I think this makes it clearer who owns the thread state - a manually created one is controlled by the code that created it, and once it's deleted it can't be activated again. 
+ More importantly though, I think this makes it clearer who owns the thread + state - a manually created one is controlled by the code that created it, + and once it's deleted it can't be activated again. This was ultimately rejected for two reasons: -1. The proposed API has closer usage to :c:func:`PyGILState_Ensure` / :c:func:`PyGILState_Release`, which helps ease the transition for old codebases. -2. It's `significantly easier `_ for code-generators like Cython to use, as there isn't any additional complexity with tracking :c:type:`PyThreadState` pointers around. +1. The proposed API has closer usage to + :c:func:`PyGILState_Ensure` / :c:func:`PyGILState_Release`, which helps + ease the transition for old codebases. +2. It's `significantly easier `_ + for code-generators like Cython to use, as there isn't any additional + complexity with tracking :c:type:`PyThreadState` pointers around. Open Issues =========== @@ -371,14 +567,27 @@ Open Issues Use ``PyStatus`` for the return value of ``PyThreadState_Ensure``? ------------------------------------------------------------------ -:c:func:`PyThreadState_Ensure` returns an integer to return failures, but some iterations have suggested the use of :c:type:`PyStatus` to denote failure, which has the benefit of providing an error message. The main hesitation for switching to ``PyStatus`` is that it's more difficult to use, as the ``PyStatus`` has to be stored and checked, whereas a simple integer can simply be used inline with an ``if`` clause. +:c:func:`PyThreadState_Ensure` returns an integer to return failures, but some +iterations have suggested the use of :c:type:`PyStatus` to denote failure, +which has the benefit of providing an error message. The main hesitation for +switching to ``PyStatus`` is that it's more difficult to use, as the +``PyStatus`` has to be stored and checked, whereas a simple integer can simply +be used inline with an ``if`` clause. 
-Additionally, it's `not clear `_ that an error message would be all that useful; all the conceived use-cases for this API wouldn't really care about a message indicating why Python can't be invoked. +Additionally, it's +`not clear `_ +that an error message would be all that useful; all the conceived use-cases +for this API wouldn't really care about a message indicating why Python can't +be invoked. When should ``PyGILState`` be removed? -------------------------------------- -:c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release` have been around for over two decades, and it's expected that the migration will be difficult. Currently, the plan is to remove them in 10 years (opposed to the 5 years required by :pep:`387`), but this is subject to further discussion, as it's unclear if that's enough (or too much) time. +:c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release` have been around +for over two decades, and it's expected that the migration will be difficult. +Currently, the plan is to remove them in 10 years (opposed to the 5 years +required by :pep:`387`), but this is subject to further discussion, as it's +unclear if that's enough (or too much) time. Footnotes ========= From d797f7c99a48eee743337699394facb2470e7ea3 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 16:18:14 -0400 Subject: [PATCH 04/51] Fix missing wrap. --- peps/pep-0788.rst | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index e43a64bc0b5..ca9215fae49 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -7,7 +7,7 @@ Status: Draft Type: Standards Track Created: 23-Apr-2025 Python-Version: 3.15 -Post-History: `10-Mar-2025 `_ +Post-History: `10-Mar-2025 `__ Abstract @@ -43,10 +43,10 @@ correct and safer replacement for :c:func:`PyGILState_Ensure` and fputs("Python is shutting down", stderr); return; } - + /* Interact with Python, without worrying about finalization. */ // ... 
- + PyThreadState_Release(); Motivation @@ -293,7 +293,7 @@ prevent the interpreter from finalizing. accessed by another thread). This function is generally meant to be used in tandem with - :c:func:`PyThreadState_Ensure`. + :c:func:`PyThreadState_Ensure`. The caller must have an `attached thread state`_, and cannot return a failure. @@ -336,7 +336,7 @@ remain daemon by default. thread state must not be the main thread for the interpreter. All thread states created without :c:func:`PyThreadState_Ensure` are daemon by default. - + If the thread state is non-daemon, then the current interpreter will wait for this thread to finish before shutting down. See also :meth:`threading.Thread.setDaemon`. @@ -347,7 +347,8 @@ remain daemon by default. Ensuring and releasing thread states ------------------------------------ -This proposal includes two new high-level threading APIs that intend to replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. +This proposal includes two new high-level threading APIs that intend to +replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. .. c:function:: int PyThreadState_Ensure(PyInterpreterState *interp) @@ -458,9 +459,9 @@ The following code uses the old ``PyGILState`` APIs: thread_func(void *arg) { PyGILState_STATE gstate = PyGILState_Ensure(); - /* It's not an issue in this example, but we just attached + /* It's not an issue in this example, but we just attached a thread state for the main interpreter. If my_method() was - originally called in a subinterpreter, then we would be unable + originally called in a subinterpreter, then we would be unable to safely interact with any objects from it. 
*/ if (PyRun_SimpleString("print(42)") < 0) { PyErr_Print(); From 3bb34b3f4b09bc2828b867670f19b57c64441822 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 16:19:00 -0400 Subject: [PATCH 05/51] Add codeowner --- .github/CODEOWNERS | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index fe91721736f..5230e1fab61 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -666,6 +666,7 @@ peps/pep-0784.rst @gpshead peps/pep-0785.rst @gpshead # ... peps/pep-0787.rst @ncoghlan +peps/pep-0788.rst @ZeroIntensity @vstinner # ... peps/pep-0789.rst @njsmith # ... From c362dbb376887e1e972953d9ec23d11792d9477e Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 19:20:41 -0400 Subject: [PATCH 06/51] Apply suggestions from code review Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> --- peps/pep-0788.rst | 45 +++++++++++++++++++++++---------------------- 1 file changed, 23 insertions(+), 22 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index ca9215fae49..7d060914be6 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -7,15 +7,15 @@ Status: Draft Type: Standards Track Created: 23-Apr-2025 Python-Version: 3.15 -Post-History: `10-Mar-2025 `__ +Post-History: `10-Mar-2025 `__ Abstract ======== :c:func:`PyGILState_Ensure`, :c:func:`PyGILState_Ensure`, and other related -functions in the ``PyGILState`` family, are the most common way to create -native threads that interact with Python, and have been the standard for over +functions in the ``PyGILState`` family are the most common way to create +native threads that interact with Python. They have been the standard for over twenty years (:pep:`311`). But, as Python has grown, these functions have become problematic: @@ -32,22 +32,22 @@ become problematic: - The term "GIL" in the name is quite confusing for users of free-threaded Python. There isn't a GIL, why do they still have to call it? 
-This PEP intends to fix all of these issues by providing -:c:func:`PyThreadState_Ensure` and :c:func:`PyThreadState_Release` as a more +This PEP intends to fix all of these issues by providing two new functions, +:c:func:`PyThreadState_Ensure` and :c:func:`PyThreadState_Release`, as a more correct and safer replacement for :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. For example: .. code-block:: C - if (PyThreadState_Ensure(interp) < 0) { - fputs("Python is shutting down", stderr); - return; - } + if (PyThreadState_Ensure(interp) < 0) { + fputs("Python is shutting down", stderr); + return; + } - /* Interact with Python, without worrying about finalization. */ - // ... + /* Interact with Python, without worrying about finalization. */ + // ... - PyThreadState_Release(); + PyThreadState_Release(); Motivation ========== @@ -58,7 +58,7 @@ Native threads will always hang during finalization Many codebases might need to call Python code in highly-asynchronous situations where the interpreter is already finalizing, or might finalize, and want to continue running code after the Python call. This desire has been -`brought up by users `_. +`brought up by users `_. For example, a callback that wants to call Python code might be invoked by: - A kernel has finished running on the GPU @@ -66,8 +66,8 @@ For example, a callback that wants to call Python code might be invoked by: - A thread has quit, and the C++ library is executing static finalizers of thread local storage. -In the current C API, any non-Python thread (*i.e.*, not created by -:mod:`threading`) is considered to be "daemon," meaning that the interpreter +In the current C API, any non-Python thread (one not created via the +:mod:`threading` module) is considered to be "daemon," meaning that the interpreter won't wait on that thread to finalize. Instead, the interpreter will hang the thread when it goes to `attach `_ a `thread state`_, making it unusable past that point. 
Attaching a thread state can happen at @@ -141,7 +141,7 @@ attach a thread state). If the interpreter is long dead, then Python obviously can't give a thread a way to invoke it. :c:func:`PyGILState_Ensure` doesn't have any meaningful way to return a failure, so it has no choice but to terminate the thread or emit a fatal -error, as noted in `gh-124622 `_: +error, as noted in `python/cpython#124622 `_: I think a new GIL acquisition and release C API would be needed. The way the existing ones get used in existing C code is not amenible to suddenly @@ -173,8 +173,9 @@ the existing crashes caused by :c:func:`PyGILState_Ensure`. ******************************************* A large issue with the term "GIL" in the C API is that it's semantically -misleading, as noted in `gh-127989 `_ -(disclaimer: the author of this PEP also authored that issue): +misleading, as noted in `python/cpython#127989 +`_, +created by the authors of this PEP: The biggest issue is that for free-threading, there is no GIL, so users erroneously call the C API inside ``Py_BEGIN_ALLOW_THREADS`` blocks or @@ -246,7 +247,7 @@ this namespace seems to fit well for the functions in this PEP. Full deprecation of ``PyGILState`` ---------------------------------- -As made clear in the motivation, ``PyGILState`` is already pretty buggy, and +As made clear in Motivation_, ``PyGILState`` is already pretty buggy, and even if it was magically fixed, the current behavior of hanging the thread is beyond repair. As such, this PEP intends to completely deprecate the existing ``PyGILState`` APIs. However, even if this PEP is rejected, all of the APIs @@ -394,7 +395,7 @@ new ``PyThreadState`` APIs for the reasons given in the motivation. Namely: instead. All of the ``PyGILState`` APIs are to be removed from the non-limited C API in -Python 3.25. They will remain available in the stable API for compatibility. +Python 3.25. They will remain available in the stable ABI for compatibility. 
Backwards Compatibility ======================= @@ -558,7 +559,7 @@ This was ultimately rejected for two reasons: 1. The proposed API has closer usage to :c:func:`PyGILState_Ensure` / :c:func:`PyGILState_Release`, which helps ease the transition for old codebases. -2. It's `significantly easier `_ +2. It's `significantly easier `_ for code-generators like Cython to use, as there isn't any additional complexity with tracking :c:type:`PyThreadState` pointers around. @@ -576,7 +577,7 @@ switching to ``PyStatus`` is that it's more difficult to use, as the be used inline with an ``if`` clause. Additionally, it's -`not clear `_ +`not clear `_ that an error message would be all that useful; all the conceived use-cases for this API wouldn't really care about a message indicating why Python can't be invoked. From 9b0d0ca1a5f3dcf833737c1a680b7a503ec43b45 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 19:43:30 -0400 Subject: [PATCH 07/51] Update pep-0788.rst Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> --- peps/pep-0788.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 7d060914be6..51be1e59aba 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -19,7 +19,7 @@ native threads that interact with Python. They have been the standard for over twenty years (:pep:`311`). But, as Python has grown, these functions have become problematic: -- They aren't safe for finalization, either hanging the calling thread or +- They aren't safe for finalization, either causing the calling thread to hang or crashing it with a segmentation fault, preventing further execution. 
- When they're called before finalization, they force the thread to be "daemon," meaning that the interpreter won't wait for it to reach any point From d4faacc0263c1000e496d83e7b8531bac23948f6 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 19:43:40 -0400 Subject: [PATCH 08/51] Update pep-0788.rst Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> --- peps/pep-0788.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 51be1e59aba..4ba0b72b756 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -83,7 +83,7 @@ code in their stream of calls (for example, C++ executing finalizers in ``Py_IsFinalizing`` is insufficient *********************************** -The `docs `_ +The :c:func:`docs ` currently recommend :c:func:`Py_IsFinalizing` to guard against termination of the thread: From 2b01e94a286724670935014636ed223c3bd2eebb Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 19:49:43 -0400 Subject: [PATCH 09/51] Fix GIL terms. --- peps/pep-0788.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 4ba0b72b756..8d6f5479086 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -103,8 +103,8 @@ Daemon threads can cause finalization deadlocks *********************************************** When acquiring locks, it's extremely important to detach the thread state to -prevent deadlocks. This is true on both the GIL-ful and free-threaded builds. -In a GIL-icious build, a deadlock can occur pretty easily when acquiring a +prevent deadlocks. This is true on both the with-GIL and free-threaded builds. +When the GIL is enabled, a deadlock can occur pretty easily when acquiring a lock if the GIL wasn't released, and lock-ordering deadlocks can still occur free-threaded builds if the thread state wasn't detached. 
@@ -183,7 +183,7 @@ created by the authors of this PEP: In reality, :c:func:`PyGILState_Ensure` doesn't just "acquire the GIL" in modern versions. It attaches a `thread state`_ for the current -thread--*that's* what lets a thread invoke the C API. On GIL-ful builds, +thread--*that's* what lets a thread invoke the C API. On with-GIL builds, holding an `attached thread state`_ implies holding the GIL, so only one thread can have one at a time. Free-threaded builds achieve the effect of multi-core parallism while remaining backwards-compatible by simply removing @@ -324,7 +324,7 @@ or in between bytecode instructions, based on :func:`sys.setswitchinterval`. A new, internal field will be added to the ``PyThreadState`` structure that determines if the thread is daemon. If the thread is daemon, then it will hang during attachment as usual, but if it's not, then the interpreter will -let the thread attach and continue execution. On GIL-ful builds, this again +let the thread attach and continue execution. On with-GIL builds, this again means handing off the GIL to the thread. During finalization, the interpreter will wait until all non-daemon threads call :c:func:`PyThreadState_Delete`. From ddd03fe6bccbb59c75f4f1e82942432d0710b7ea Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 19:53:28 -0400 Subject: [PATCH 10/51] Use term instead of link. --- peps/pep-0788.rst | 30 ++++++++++++------------------ 1 file changed, 12 insertions(+), 18 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 8d6f5479086..889242214db 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -69,7 +69,7 @@ For example, a callback that wants to call Python code might be invoked by: In the current C API, any non-Python thread (one not created via the :mod:`threading` module) is considered to be "daemon," meaning that the interpreter won't wait on that thread to finalize. 
Instead, the interpreter will hang the -thread when it goes to `attach `_ a `thread state`_, +thread when it goes to :term:`attach ` a :term:`thread state`, making it unusable past that point. Attaching a thread state can happen at any point when invoking Python, such as releasing the GIL in-between bytecode instructions, or when a C function exits a :c:macro:`Py_BEGIN_ALLOW_THREADS` @@ -154,7 +154,7 @@ error, as noted in `python/cpython#124622 `_. @@ -182,9 +182,9 @@ created by the authors of this PEP: omit ``PyGILState_Ensure`` in fresh threads. In reality, :c:func:`PyGILState_Ensure` doesn't just "acquire the GIL" in -modern versions. It attaches a `thread state`_ for the current +modern versions. It attaches a :term:`thread state` for the current thread--*that's* what lets a thread invoke the C API. On with-GIL builds, -holding an `attached thread state`_ implies holding the GIL, so only one +holding an :term:`attached thread state` implies holding the GIL, so only one thread can have one at a time. Free-threaded builds achieve the effect of multi-core parallism while remaining backwards-compatible by simply removing that limitation: threads still need a thread state (and thus need to call @@ -296,7 +296,7 @@ prevent the interpreter from finalizing. This function is generally meant to be used in tandem with :c:func:`PyThreadState_Ensure`. - The caller must have an `attached thread state`_, and cannot return a + The caller must have an :term:`attached thread state`, and cannot return a failure. @@ -308,7 +308,7 @@ prevent the interpreter from finalizing. :c:func:`PyThreadState_Ensure`. This function cannot fail, other than with a fatal error. The caller must - have an `attached thread state`_ for *interp*. + have an :term:`attached thread state` for *interp*. Daemon and non-daemon threads @@ -333,7 +333,7 @@ remain daemon by default. .. c:function:: int PyThreadState_SetDaemon(int is_daemon) - Set the `attached thread state`_ as non-daemon or daemon. 
The attached + Set the :term:`attached thread state` as non-daemon or daemon. The attached thread state must not be the main thread for the interpreter. All thread states created without :c:func:`PyThreadState_Ensure` are daemon by default. @@ -353,7 +353,7 @@ replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. .. c:function:: int PyThreadState_Ensure(PyInterpreterState *interp) - Ensure that the thread has an `attached thread state`_ for *interp*, and + Ensure that the thread has an :term:`attached thread state` for *interp*, and thus can safely invoke that interpreter. It is OK to call this function if the thread already has an attached thread state, as long as there is a subsequent call to :c:func:`PyThreadState_Release` that matches this one. @@ -364,21 +364,21 @@ replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. Thread states created by this function are non-daemon by default. See :c:func:`PyThreadState_SetDaemon`. If the calling thread already has an - `attached thread state`_ that matches *interp*, then this function will + :term:`attached thread state` that matches *interp*, then this function will simply mark the existing thread state as non-daemon and return. It will be restored to its prior daemon status upon the next :c:func:`PyThreadState_Release` call. - Return zero on success, and non-zero with the old `attached thread state`_ + Return zero on success, and non-zero with the old :term:`attached thread state` restored (which may have been ``NULL``). .. c:function:: void PyThreadState_Release() - Detach and destroy the `attached thread state`_ set by + Detach and destroy the :term:`attached thread state` set by :c:func:`PyThreadState_Ensure`. This function cannot fail, but may hang the thread if the - `attached thread state`_ prior to the original :c:func:`PyThreadState_Ensure` + :term:`attached thread state` prior to the original :c:func:`PyThreadState_Ensure` was daemon. 
Deprecation of ``PyGILState`` @@ -591,12 +591,6 @@ Currently, the plan is to remove them in 10 years (opposed to the 5 years required by :pep:`387`), but this is subject to further discussion, as it's unclear if that's enough (or too much) time. -Footnotes -========= - -.. _Thread State: https://docs.python.org/3.14/glossary.html#term-thread-state -.. _Attached Thread State: https://docs.python.org/3.14/glossary.html#term-attached-thread-state - Copyright ========= From 13fa4f7ca459efad4fb2e10d77d0ee9e7f8d38d2 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 19:56:02 -0400 Subject: [PATCH 11/51] Use references in titles. --- peps/pep-0788.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 889242214db..c96b10e9379 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -132,8 +132,8 @@ finalization, because a daemon thread got hung while holding the lock. There are workarounds for this for pure-Python code, but native threads don't have such an option. -We can't change finalization behavior for ``PyGILState_Ensure`` -*************************************************************** +We can't change finalization behavior for :c:func:`PyGILState_Ensure` +********************************************************************* There will always have to be a point in a Python program where :c:func:`PyGILState_Ensure` can no longer acquire the GIL (or more correctly, @@ -158,8 +158,8 @@ There are currently two public ways for a user to create and attach their own and :c:func:`PyGILState_Ensure`. The former, :c:func:`PyGILState_Ensure`, is `significantly more common `_. 
-``PyGILState`` generally crashes during finalization -**************************************************** +:c:func:`PyGILState_Ensure` generally crashes during finalization +***************************************************************** As of this PEP, the current behavior of :c:func:`PyGILState_Ensure` does not match the documentation. Instead of hanging the thread during finalization @@ -191,8 +191,8 @@ that limitation: threads still need a thread state (and thus need to call :c:func:`PyGILState_Ensure`), but they don't need to wait on one another to do so. -Subinterpreters don't work with ``PyGILState`` ----------------------------------------------- +Subinterpreters don't work with :c:func:`PyGILState_Ensure` +----------------------------------------------------------- As noted in the `documentation `_, @@ -566,8 +566,8 @@ This was ultimately rejected for two reasons: Open Issues =========== -Use ``PyStatus`` for the return value of ``PyThreadState_Ensure``? ------------------------------------------------------------------- +Use ``PyStatus`` for the return value of :c:func:`PyThreadState_Ensure`? +------------------------------------------------------------------------ :c:func:`PyThreadState_Ensure` returns an integer to return failures, but some iterations have suggested the use of :c:type:`PyStatus` to denote failure, From 8376d4ba6044e3aa063f91749eb8c622ffcbef91 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 20:01:33 -0400 Subject: [PATCH 12/51] Rename the reference. --- peps/pep-0788.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index c96b10e9379..dbc00f558b2 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -266,7 +266,7 @@ This API intentionally has a layer of "magic" that is kept from the user, for simplicity's sake in the transition from ``PyGILState`` and for ease-of-use on those that wrap the C API, such as in Cython or PyO3. 
-See also :ref:`Activate Deactivate Instead`. +See also :ref:`pep-788-activate-deactivate-instead`. Specification ============= @@ -540,7 +540,7 @@ deleted and cause use-after-free violations. However, :c:func:`PyInterpreterState_Hold` prevents the interpreter from finalizing until :c:func:`PyThreadState_Ensure` is called anyway. -.. _Activate Deactivate Instead: +.. _pep-788-activate-deactivate-instead: Exposing an ``Activate``/``Deactivate`` API instead of ``Ensure``/``Clear`` --------------------------------------------------------------------------- From 0268736fb7912c95f0f1d8be7095dfba8b241630 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 20:30:00 -0400 Subject: [PATCH 13/51] Some adjustments. --- peps/pep-0788.rst | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index dbc00f558b2..a1b6838124c 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -194,8 +194,7 @@ do so. Subinterpreters don't work with :c:func:`PyGILState_Ensure` ----------------------------------------------------------- -As noted in the -`documentation `_, +As noted in the `documentation `_, ``PyGILState`` APIs aren't officially supported in subinterpreters: Note that the ``PyGILState_*`` functions assume there is only one global @@ -211,6 +210,7 @@ runtime, so spurious races are bound to come up in threads created by subinterpreters, because synchronization for the wrong interpreter will be used on objects shared between the threads. + Interpreters can concurrently shut down *************************************** @@ -239,10 +239,9 @@ Bikeshedding and the ``PyThreadState`` namespace To solve the issue with "GIL" terminology, the new functions intended as replacements for ``PyGILState`` will go under the existing ``PyThreadState`` -namespace. 
In Python 3.14, the documentation has been -`updated `_ to switch over to -terms using "thread state" instead of "global interpreter lock" or "GIL," so -this namespace seems to fit well for the functions in this PEP. +namespace. In Python 3.14, the documentation has been updated to switch +over to terms using "thread state" instead of "global interpreter lock" +or "GIL," so this namespace seems to fit well for the functions in this PEP. Full deprecation of ``PyGILState`` ---------------------------------- @@ -266,7 +265,7 @@ This API intentionally has a layer of "magic" that is kept from the user, for simplicity's sake in the transition from ``PyGILState`` and for ease-of-use on those that wrap the C API, such as in Cython or PyO3. -See also :ref:`pep-788-activate-deactivate-instead`. +See also `pep-788-activate-deactivate-instead`_. Specification ============= @@ -339,7 +338,8 @@ remain daemon by default. default. If the thread state is non-daemon, then the current interpreter will wait - for this thread to finish before shutting down. See also :meth:`threading.Thread.setDaemon`. + for this thread to finish before shutting down. See also + :meth:`threading.Thread.setDaemon`. Return zero on success, non-zero *without* an exception set on failure. Failure generally means that threads have already finalized for the @@ -409,11 +409,11 @@ How to Teach This As with all C API functions, all the new APIs in this PEP will be documented in the C API documentation, ideally under the -`Non-Python created threads `_ -section. The existing -`High level API `_ -section, containing most of the ``PyGILState`` documentation, should be -updated accordingly to point to the new APIs. +`Non-Python created threads `_ section. +The existing `High-level API `_ section, containing most +of the ``PyGILState`` documentation, should be updated accordingly to point +to the new APIs. 
+ Examples -------- @@ -546,14 +546,16 @@ Exposing an ``Activate``/``Deactivate`` API instead of ``Ensure``/``Clear`` --------------------------------------------------------------------------- In prior discussions of this API, it was -`suggested `_ -to provide actual :c:type:`PyThreadState` pointers in the API in an attempt to +`suggested `_ to provide actual +:c:type:`PyThreadState` pointers in the API in an attempt to make the ownership and lifetime of the thread state clearer: More importantly though, I think this makes it clearer who owns the thread state - a manually created one is controlled by the code that created it, and once it's deleted it can't be activated again. +.. _using-activate-deactivate-dpo: https://discuss.python.org/t/a-new-api-for-ensuring-releasing-thread-states/83959/2 + This was ultimately rejected for two reasons: 1. The proposed API has closer usage to From 76d043609ac37a565084d35cfb9b777bef22ced7 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 20:41:44 -0400 Subject: [PATCH 14/51] Change vague phrase. --- peps/pep-0788.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index a1b6838124c..b4ea842c09a 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -56,8 +56,8 @@ Native threads will always hang during finalization --------------------------------------------------- Many codebases might need to call Python code in highly-asynchronous -situations where the interpreter is already finalizing, or might finalize, and want to continue -running code after the Python call. This desire has been +situations where the interpreter is already finalizing, or might finalize, and +want to continue running code after the Python call. This desire has been `brought up by users `_. 
For example, a callback that wants to call Python code might be invoked by: From 7607eaa92472dca5e57d5ef755c6aa1f5e6581ce Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 20:43:00 -0400 Subject: [PATCH 15/51] Update peps/pep-0788.rst Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> --- peps/pep-0788.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index b4ea842c09a..4a400282d04 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -59,11 +59,11 @@ Many codebases might need to call Python code in highly-asynchronous situations where the interpreter is already finalizing, or might finalize, and want to continue running code after the Python call. This desire has been `brought up by users `_. -For example, a callback that wants to call Python code might be invoked by: +For example, a callback that wants to call Python code might be invoked when: -- A kernel has finished running on the GPU +- A kernel has finished running on a GPU - A network packet was received. -- A thread has quit, and the C++ library is executing static finalizers of +- A thread has quit, and the native library is executing static finalizers of thread local storage. In the current C API, any non-Python thread (one not created via the From 16c53224d599673f51a8347c0baa109c17561516 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Wed, 23 Apr 2025 20:45:06 -0400 Subject: [PATCH 16/51] Fix wording. --- peps/pep-0788.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 4a400282d04..d90b172f931 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -61,9 +61,9 @@ want to continue running code after the Python call. This desire has been `brought up by users `_. For example, a callback that wants to call Python code might be invoked when: -- A kernel has finished running on a GPU +- A kernel has finished running on a GPU. 
- A network packet was received. -- A thread has quit, and the native library is executing static finalizers of +- A thread has quit, and a native library is executing static finalizers of thread local storage. In the current C API, any non-Python thread (one not created via the From d412342dba7ed0ef49d0cd5f2e8a2fc85b322495 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 06:54:52 -0400 Subject: [PATCH 17/51] Use a reference for Py_IsFinalizing() --- peps/pep-0788.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index d90b172f931..678070b8763 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -80,8 +80,8 @@ is severely limiting for users who want to do more than just execute Python code in their stream of calls (for example, C++ executing finalizers in *addition* to calling Python). -``Py_IsFinalizing`` is insufficient -*********************************** +:c:func:`Py_IsFinalizing`` is insufficient +****************************************** The :c:func:`docs ` currently recommend :c:func:`Py_IsFinalizing` to guard against termination of From 06c1c779251bfa098d08c58cf7d192e6affb9589 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 06:57:59 -0400 Subject: [PATCH 18/51] Fix typo and mention the daemon attribute instead of setDaemon. --- peps/pep-0788.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 678070b8763..86c8b97d844 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -80,8 +80,8 @@ is severely limiting for users who want to do more than just execute Python code in their stream of calls (for example, C++ executing finalizers in *addition* to calling Python). 
-:c:func:`Py_IsFinalizing`` is insufficient -****************************************** +:c:func:`Py_IsFinalizing` is insufficient +***************************************** The :c:func:`docs ` currently recommend :c:func:`Py_IsFinalizing` to guard against termination of @@ -339,7 +339,7 @@ remain daemon by default. If the thread state is non-daemon, then the current interpreter will wait for this thread to finish before shutting down. See also - :meth:`threading.Thread.setDaemon`. + :attr:`threading.Thread.daemon`. Return zero on success, non-zero *without* an exception set on failure. Failure generally means that threads have already finalized for the From d57e1eeace84afbbd066dee06688e8cafe61b3f7 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 07:01:40 -0400 Subject: [PATCH 19/51] Some heading changes. --- peps/pep-0788.rst | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 86c8b97d844..bc826db194b 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -80,8 +80,8 @@ is severely limiting for users who want to do more than just execute Python code in their stream of calls (for example, C++ executing finalizers in *addition* to calling Python). -:c:func:`Py_IsFinalizing` is insufficient -***************************************** +Using :c:func:`Py_IsFinalizing` is insufficient +*********************************************** The :c:func:`docs ` currently recommend :c:func:`Py_IsFinalizing` to guard against termination of @@ -243,8 +243,8 @@ namespace. In Python 3.14, the documentation has been updated to switch over to terms using "thread state" instead of "global interpreter lock" or "GIL," so this namespace seems to fit well for the functions in this PEP. 
-Full deprecation of ``PyGILState`` ----------------------------------- +Full deprecation of ``PyGILState`` APIs +--------------------------------------- As made clear in Motivation_, ``PyGILState`` is already pretty buggy, and even if it was magically fixed, the current behavior of hanging the thread is @@ -265,7 +265,7 @@ This API intentionally has a layer of "magic" that is kept from the user, for simplicity's sake in the transition from ``PyGILState`` and for ease-of-use on those that wrap the C API, such as in Cython or PyO3. -See also `pep-788-activate-deactivate-instead`_. +See also :ref:`pep-788-activate-deactivate-instead`. Specification ============= @@ -381,8 +381,8 @@ replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. :term:`attached thread state` prior to the original :c:func:`PyThreadState_Ensure` was daemon. -Deprecation of ``PyGILState`` ------------------------------ +Deprecation of ``PyGILState`` APIs +---------------------------------- This PEP deprecates all of the existing ``PyGILState`` APIs in favor of the new ``PyThreadState`` APIs for the reasons given in the motivation. Namely: From 92e575d574f49b970a16f1d6fbb5c062e147d126 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 07:10:28 -0400 Subject: [PATCH 20/51] Fix a hyperlink. 
--- peps/pep-0788.rst | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index bc826db194b..1331cad4c7d 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -546,7 +546,7 @@ Exposing an ``Activate``/``Deactivate`` API instead of ``Ensure``/``Clear`` --------------------------------------------------------------------------- In prior discussions of this API, it was -`suggested `_ to provide actual +`suggested `_ to provide actual :c:type:`PyThreadState` pointers in the API in an attempt to make the ownership and lifetime of the thread state clearer: @@ -554,8 +554,6 @@ make the ownership and lifetime of the thread state clearer: state - a manually created one is controlled by the code that created it, and once it's deleted it can't be activated again. -.. _using-activate-deactivate-dpo: https://discuss.python.org/t/a-new-api-for-ensuring-releasing-thread-states/83959/2 - This was ultimately rejected for two reasons: 1. The proposed API has closer usage to From 8da72571ab2230aa93d272612a74e93c26866237 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 07:34:22 -0400 Subject: [PATCH 21/51] Change a few things around. --- peps/pep-0788.rst | 73 ++++++++++++++++++++++++++++++----------------- 1 file changed, 47 insertions(+), 26 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 1331cad4c7d..5924201b7c5 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -150,8 +150,8 @@ error, as noted in `python/cpython#124622 ` instead of +:term:`"global interpreter lock" `, so this namespace +seems to fit well for the functions in this PEP. 
-Full deprecation of ``PyGILState`` APIs ---------------------------------------- +Full deprecation of old APIs +---------------------------- As made clear in Motivation_, ``PyGILState`` is already pretty buggy, and even if it was magically fixed, the current behavior of hanging the thread is @@ -258,14 +260,14 @@ C API: - :c:func:`PyGILState_GetThisThreadState`: :c:func:`PyThreadState_Get` - :c:func:`PyGILState_Check`: ``PyThreadState_GetUnchecked() != NULL`` -Hiding away thread state details --------------------------------- +A light layer of magic +---------------------- -This API intentionally has a layer of "magic" that is kept from the user, for -simplicity's sake in the transition from ``PyGILState`` and for ease-of-use on -those that wrap the C API, such as in Cython or PyO3. - -See also :ref:`pep-788-activate-deactivate-instead`. +The APIs proposed by this PEP intentionally have a layer of "magic" that is +kept from the user and offloads complexity onto CPython maintainers. This is +done primarily to help ease the transition from ``PyGILState`` for existing +codebases, and for ease-of-use to those who provide wrappers the C API, such +as Cython or PyO3. See also :ref:`pep-788-activate-deactivate-instead`. Specification ============= @@ -449,8 +451,8 @@ Single-threaded example Py_RETURN_NONE; } -Transition from ``PyGILState`` example -************************************** +Transitioning from old functions +******************************** The following code uses the old ``PyGILState`` APIs: @@ -535,10 +537,29 @@ Using an interpreter ID instead of a interpreter state Some iterations of this API took an ``int64_t interp_id`` parameter instead of ``PyInterpreterState *interp``, because interpreter IDs cannot be concurrently -deleted and cause use-after-free violations. 
However, -:c:type:`PyInterpreterState` pointers are a lot simpler to use, and -:c:func:`PyInterpreterState_Hold` prevents the interpreter from finalizing -until :c:func:`PyThreadState_Ensure` is called anyway. +deleted and cause use-after-free violations. :c:func:`PyInterpreterState_Hold` +fixes this issue anyway, but an interpreter ID does have the benefit of +requiring less magic in the implementation, but has several downsides: + +1. Nearly all existing APIs already return a :c:type:`PyInterpreterState` + pointer, not an interpreter ID. Functions like + :c:func:`PyThreadState_GetInterpreter` would have to be accompanied by + frustrating calls to :c:func:`PyInterpreterState_GetID`. There's also + no existing way to go from an ``int64_t`` back to a + :c:expr:`PyInterpreterState *`, and providing such an API would come + with its own set of design problems. +2. Threads typically take a ``void *arg`` parameter, not an ``int64_t arg``. + As such, passing an interpreter pointer requires much less boilerplate + for the user, because an additional structure definition or heap allocation + would be needed to store the interpreter ID. +3. To retain usability, interpreter ID APIs would still need to keep a + reference count, otherwise the interpreter could be finalizing before + the native thread gets a chance to attach. The problem with using an + interpreter ID is that the reference count has to be "invisible"; it + must be tracked elsewhere in the interpreter, likely being *more* + complex than :c:func:`PyInterpreterState_Hold`. There's also a lack + of intuition that a standalone integer could have such a thing as + a reference count. .. _pep-788-activate-deactivate-instead: @@ -582,8 +603,8 @@ that an error message would be all that useful; all the conceived use-cases for this API wouldn't really care about a message indicating why Python can't be invoked. -When should ``PyGILState`` be removed? 
--------------------------------------- +When should the legacy APIs be removed? +--------------------------------------- :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release` have been around for over two decades, and it's expected that the migration will be difficult. From 7472a913c2d742ce22b322fe75ae63ce2088df7f Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 07:37:59 -0400 Subject: [PATCH 22/51] Fix vague introduction. --- peps/pep-0788.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 5924201b7c5..7ba0c12a9df 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -16,7 +16,7 @@ Abstract :c:func:`PyGILState_Ensure`, :c:func:`PyGILState_Ensure`, and other related functions in the ``PyGILState`` family are the most common way to create native threads that interact with Python. They have been the standard for over -twenty years (:pep:`311`). But, as Python has grown, these functions have +twenty years (:pep:`311`). But, over time, these functions have become problematic: - They aren't safe for finalization, either causing the calling thread to hang or From e5ed56a4bf54dd9c4f630a20575d70ccd38286be Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 16:06:45 -0400 Subject: [PATCH 23/51] Add security implications and update example. --- peps/pep-0788.rst | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 7ba0c12a9df..fcbd408addc 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -406,6 +406,11 @@ This PEP specifies a breaking change with the removal of all the ``PyGILState`` APIs from the public headers of the non-limited C API in 10 years (Python 3.25). +Security Implications +===================== + +This PEP has no known security implications. + How to Teach This ================= @@ -426,6 +431,12 @@ Ideally, they could be reused in the documentation. 
Single-threaded example *********************** +This example shows acquiring a lock in a Python method. + +If this were to be called from a daemon thread, then the interpreter could +hang the thread while reattaching the thread state, leaving us with the lock +held. Any future finalizer that wanted to acquire the lock would be deadlocked! + .. code-block:: C static PyObject * @@ -440,13 +451,12 @@ Single-threaded example Py_BEGIN_ALLOW_THREADS; acquire_some_lock(); - /* If this were to be a daemon thread, then the interpreter could - hang the thread while reattaching the thread state, leaving us - with the lock held. Any future finalizer that wanted to acquire the - lock would be deadlocked! - */ Py_END_ALLOW_THREADS; + /* Do something with the lock */ + // ... + + release_some_lock(); PyThreadState_Release(); Py_RETURN_NONE; } From 28761f13c5a8a22c3b1d3840cd5f95f15fac8a34 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 16:09:58 -0400 Subject: [PATCH 24/51] Add some comments and fixes to the examples. --- peps/pep-0788.rst | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index fcbd408addc..b2b8243a726 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -444,8 +444,11 @@ held. Any future finalizer that wanted to acquire the lock would be deadlocked! { assert(PyThreadState_GetUnchecked() != NULL); PyInterpreterState *interp = PyInterpreterState_Hold(); + /* Temporarily make this thread non-daemon to ensure that the + lock is released. */ if (PyThreadState_Ensure(interp) < 0) { - PyErr_SetString(PyExc_RuntimeError, "interpreter is shutting down"); + PyErr_SetString(PyExc_PythonFinalizationError, + "interpreter is shutting down"); return NULL; } @@ -453,7 +456,7 @@ held. Any future finalizer that wanted to acquire the lock would be deadlocked! acquire_some_lock(); Py_END_ALLOW_THREADS; - /* Do something with the lock */ + /* Do something while holding the lock */ // ... 
release_some_lock(); @@ -525,6 +528,7 @@ This is the same code, updated to use the new functions: PyInterpreterState *interp = PyInterpreterState_Hold(); if (PyThread_start_joinable_thread(thread_func, interp, &ident, &handle) < 0) { + PyInterpreterState_Release(interp); return NULL; } Py_BEGIN_ALLOW_THREADS From a1ccd02702c8e891b08736c73303e847048d5e85 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 16:14:02 -0400 Subject: [PATCH 25/51] Apply suggestions from code review Co-authored-by: Victor Stinner --- peps/pep-0788.rst | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index b2b8243a726..3afbfd0a16b 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -297,16 +297,12 @@ prevent the interpreter from finalizing. This function is generally meant to be used in tandem with :c:func:`PyThreadState_Ensure`. - The caller must have an :term:`attached thread state`, and cannot return a - failure. + The caller must have an :term:`attached thread state`. .. c:function:: void PyInterpreterState_Release(PyInterpreterState *interp) - Decrement the reference count of the interpreter. This function mainly - exists for completeness, and should rarely be used; nearly all references - returned by :c:func:`PyInterpreterState_Hold` should be released by - :c:func:`PyThreadState_Ensure`. + Decrement the reference count of the interpreter. This function cannot fail, other than with a fatal error. The caller must have an :term:`attached thread state` for *interp*. @@ -334,7 +330,7 @@ remain daemon by default. .. c:function:: int PyThreadState_SetDaemon(int is_daemon) - Set the :term:`attached thread state` as non-daemon or daemon. The attached + Set the daemon status of the :term:`attached thread state`. The attached thread state must not be the main thread for the interpreter. All thread states created without :c:func:`PyThreadState_Ensure` are daemon by default. 
@@ -360,8 +356,7 @@ replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. the thread already has an attached thread state, as long as there is a subsequent call to :c:func:`PyThreadState_Release` that matches this one. - This function always steals a reference to *interp*; as in, the - interpreter's reference count is decremented by one. As such, *interp* + The interpreter's *interp* reference count is decremented by one. As such, *interp* should have been acquired by :c:func:`PyInterpreterState_Hold`. Thread states created by this function are non-daemon by default. See @@ -381,7 +376,7 @@ replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. This function cannot fail, but may hang the thread if the :term:`attached thread state` prior to the original :c:func:`PyThreadState_Ensure` - was daemon. + was daemon and the interpreter was finalized. Deprecation of ``PyGILState`` APIs ---------------------------------- From 7e65b8a381a0f99fbaba8d0b83f8d04ff44a2f15 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 16:14:54 -0400 Subject: [PATCH 26/51] Clarify heading. --- peps/pep-0788.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index b2b8243a726..73b24e08167 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -169,8 +169,8 @@ that could, in theory, be fixed in CPython, but it's definitely worth noting here. Incidentally, acceptance and implementation of this PEP will likely fix the existing crashes caused by :c:func:`PyGILState_Ensure`. 
-"GIL" is tricky for free-threading -********************************** +The term "GIL" is tricky for free-threading +******************************************* A large issue with the term "GIL" in the C API is that it's semantically misleading, as noted in `python/cpython#127989 From 4fc4957d5096e25499134a37155b25b48ee65f6b Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 16:17:49 -0400 Subject: [PATCH 27/51] Add a new paragraph. --- peps/pep-0788.rst | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 73b24e08167..565db13c748 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -334,10 +334,11 @@ remain daemon by default. .. c:function:: int PyThreadState_SetDaemon(int is_daemon) - Set the :term:`attached thread state` as non-daemon or daemon. The attached - thread state must not be the main thread for the interpreter. All thread - states created without :c:func:`PyThreadState_Ensure` are daemon by - default. + Set the :term:`attached thread state` as non-daemon or daemon. + + The attached thread state must not be the main thread for the + interpreter. All thread states created without + :c:func:`PyThreadState_Ensure` are daemon by default. If the thread state is non-daemon, then the current interpreter will wait for this thread to finish before shutting down. See also From e9290c219317b322eb58d1145fc4f5771f3e15b5 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 16:33:23 -0400 Subject: [PATCH 28/51] Move PyStatus to rejected ideas. --- peps/pep-0788.rst | 96 +++++++++++++++++++++++++---------------------- 1 file changed, 51 insertions(+), 45 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index baca4babf70..476d82e6641 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -234,26 +234,15 @@ Rationale This PEP includes several new APIs that intend to fix all of the issues stated above. 
-Bikeshedding and the ``PyThreadState`` namespace ------------------------------------------------- - -To solve the issue with "GIL" terminology, the new functions described by this -PEP intended as replacements for ``PyGILState`` will go under the existing -``PyThreadState`` namespace. In Python 3.14, the documentation has been -updated to switch over to terms like -:term:`"attached thread state" ` instead of -:term:`"global interpreter lock" `, so this namespace -seems to fit well for the functions in this PEP. - -Full deprecation of old APIs ----------------------------- +Replacing the old APIs +---------------------- As made clear in Motivation_, ``PyGILState`` is already pretty buggy, and even if it was magically fixed, the current behavior of hanging the thread is -beyond repair. As such, this PEP intends to completely deprecate the existing -``PyGILState`` APIs. However, even if this PEP is rejected, all of the APIs -can be replaced with more correct ``PyThreadState`` functions in the current -C API: +beyond repair. In turn, this PEP intends to completely deprecate the existing +``PyGILState`` APIs and provide better alternatives. However, even if this PEP +is rejected, all of the APIs can be replaced with more correct ``PyThreadState`` +functions in the current C API: - :c:func:`PyGILState_Ensure`: :c:func:`PyThreadState_Swap` / :c:func:`PyThreadState_New` - :c:func:`PyGILState_Release`: :c:func:`PyThreadState_Clear` / :c:func:`PyThreadState_Delete` @@ -263,12 +252,23 @@ C API: A light layer of magic ---------------------- -The APIs proposed by this PEP intentionally have a layer of "magic" that is -kept from the user and offloads complexity onto CPython maintainers. This is -done primarily to help ease the transition from ``PyGILState`` for existing +The APIs proposed by this PEP intentionally have a layer of abstraction that is +hidden from the user and offloads complexity onto CPython. 
This is done +primarily to help ease the transition from ``PyGILState`` for existing codebases, and for ease-of-use to those who provide wrappers the C API, such as Cython or PyO3. See also :ref:`pep-788-activate-deactivate-instead`. +Bikeshedding and the ``PyThreadState`` namespace +------------------------------------------------ + +To solve the issue with "GIL" terminology, the new functions described by this +PEP intended as replacements for ``PyGILState`` will go under the existing +``PyThreadState`` namespace. In Python 3.14, the documentation has been +updated to switch over to terms like +:term:`"attached thread state" ` instead of +:term:`"global interpreter lock" `, so this namespace +seems to fit well for this PEP. + Specification ============= @@ -297,12 +297,14 @@ prevent the interpreter from finalizing. This function is generally meant to be used in tandem with :c:func:`PyThreadState_Ensure`. - The caller must have an :term:`attached thread state`. + The caller must have an :term:`attached thread state`, and cannot return + ``NULL``. Failures are always a fatal error. .. c:function:: void PyInterpreterState_Release(PyInterpreterState *interp) - Decrement the reference count of the interpreter. + Decrement the reference count of the interpreter, as was incremented by + :c:func:`PyInterpreterState_Hold`. This function cannot fail, other than with a fatal error. The caller must have an :term:`attached thread state` for *interp*. @@ -352,18 +354,20 @@ replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. .. c:function:: int PyThreadState_Ensure(PyInterpreterState *interp) - Ensure that the thread has an :term:`attached thread state` for *interp*, and - thus can safely invoke that interpreter. It is OK to call this function if - the thread already has an attached thread state, as long as there is a - subsequent call to :c:func:`PyThreadState_Release` that matches this one. 
+ Ensure that the thread has an :term:`attached thread state` for *interp*, + and thus can safely invoke that interpreter. It is OK to call this + function if the thread already has an attached thread state, as long as + there is a subsequent call to :c:func:`PyThreadState_Release` that matches + this one. - The interpreter's *interp* reference count is decremented by one. As such, *interp* - should have been acquired by :c:func:`PyInterpreterState_Hold`. + The interpreter's *interp* reference count is decremented by one. + As such, *interp* should have been acquired by + :c:func:`PyInterpreterState_Hold`. Thread states created by this function are non-daemon by default. See :c:func:`PyThreadState_SetDaemon`. If the calling thread already has an - :term:`attached thread state` that matches *interp*, then this function will - simply mark the existing thread state as non-daemon and return. It will + :term:`attached thread state` that matches *interp*, then this function + will mark the existing thread state as non-daemon and return. It will be restored to its prior daemon status upon the next :c:func:`PyThreadState_Release` call. @@ -594,24 +598,26 @@ This was ultimately rejected for two reasons: for code-generators like Cython to use, as there isn't any additional complexity with tracking :c:type:`PyThreadState` pointers around. -Open Issues -=========== +Using ``PyStatus`` for the return value of :c:func:`PyThreadState_Ensure` +------------------------------------------------------------------------- -Use ``PyStatus`` for the return value of :c:func:`PyThreadState_Ensure`? ------------------------------------------------------------------------- +In prior iterations of this API, :c:func:`PyThreadState_Ensure` returned a +:c:type:`PyStatus` instead of an integer to denote failures, which had the +benefit of providing an error message. 
-:c:func:`PyThreadState_Ensure` returns an integer to return failures, but some -iterations have suggested the use of :c:type:`PyStatus` to denote failure, -which has the benefit of providing an error message. The main hesitation for -switching to ``PyStatus`` is that it's more difficult to use, as the -``PyStatus`` has to be stored and checked, whereas a simple integer can simply -be used inline with an ``if`` clause. - -Additionally, it's -`not clear `_ +This was rejected because it's `not clear `_ that an error message would be all that useful; all the conceived use-cases -for this API wouldn't really care about a message indicating why Python can't -be invoked. +for this API wouldn't really care about a message indicating why Python +can't be invoked. As such, the API would only be needlessly harder to use, +which in turn would hurt the transition from :c:func:`PyGILState_Ensure`. + +In addition, :c:type:`PyStatus` isn't commonly used in the C API. A few +functions related to interpreter initialization use it (simply because they +can't raise exceptions), and :c:func:`PyThreadState_Ensure` does not fall +under that category. + +Open Issues +=========== When should the legacy APIs be removed? --------------------------------------- From 4869af820f64c4b058685473191d977bc4642621 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 19:29:35 -0400 Subject: [PATCH 29/51] Fix Sphinx cross-references. --- peps/pep-0788.rst | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 476d82e6641..3d034248088 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -194,7 +194,7 @@ do so. 
Subinterpreters don't work with :c:func:`PyGILState_Ensure` ----------------------------------------------------------- -As noted in the `documentation `_, +As noted in the :ref:`documentation `, ``PyGILState`` APIs aren't officially supported in subinterpreters: Note that the ``PyGILState_*`` functions assume there is only one global @@ -415,13 +415,10 @@ How to Teach This ================= As with all C API functions, all the new APIs in this PEP will be documented -in the C API documentation, ideally under the -`Non-Python created threads `_ section. -The existing `High-level API `_ section, containing most -of the ``PyGILState`` documentation, should be updated accordingly to point +in the C API documentation, ideally under the :ref:`python:gilstate` section. +The existing ``PyGILState`` documentation should be updated accordingly to point to the new APIs. - Examples -------- From 734a6c328e357d79c94b848f144407f9c9b452a4 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 19:33:49 -0400 Subject: [PATCH 30/51] Fix Americanism. --- peps/pep-0788.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 3d034248088..75b41d3934e 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -22,7 +22,7 @@ become problematic: - They aren't safe for finalization, either causing the calling thread to hang or crashing it with a segmentation fault, preventing further execution. - When they're called before finalization, they force the thread to be - "daemon," meaning that the interpreter won't wait for it to reach any point + "daemon", meaning that the interpreter won't wait for it to reach any point of execution. This is mostly frustrating for developers, but can lead to deadlocks! - Subinterpreters don't play nicely with them, because they all assume that @@ -67,7 +67,7 @@ For example, a callback that wants to call Python code might be invoked when: thread local storage. 
In the current C API, any non-Python thread (one not created via the -:mod:`threading` module) is considered to be "daemon," meaning that the interpreter +:mod:`threading` module) is considered to be "daemon", meaning that the interpreter won't wait on that thread to finalize. Instead, the interpreter will hang the thread when it goes to :term:`attach ` a :term:`thread state`, making it unusable past that point. Attaching a thread state can happen at From 6155ed90d31c7c0bdbe768160adca197192f5025 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Thu, 24 Apr 2025 21:13:27 -0400 Subject: [PATCH 31/51] Add a link to the reference implementation. --- peps/pep-0788.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 75b41d3934e..13a02fe5aa7 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -538,7 +538,8 @@ This is the same code, updated to use the new functions: Reference Implementation ======================== -TBD. +A reference implementation of this PEP can be found +`here `_. Rejected Ideas ============== From 91fc9f537c6835e3a1886b335fe8801ff6cd4940 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 08:32:07 -0400 Subject: [PATCH 32/51] Apply suggestions from code review Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> --- peps/pep-0788.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 13a02fe5aa7..9f9d3e73e78 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -26,7 +26,7 @@ become problematic: of execution. This is mostly frustrating for developers, but can lead to deadlocks! - Subinterpreters don't play nicely with them, because they all assume that - the main interpreter is the only one that exists. A fresh thread (*i.e.*, + the main interpreter is the only one that exists. 
A fresh thread (that is, has never had a thread state) that calls :c:func:`PyGILState_Ensure` will always be for the main interpreter. - The term "GIL" in the name is quite confusing for users of free-threaded @@ -37,7 +37,7 @@ This PEP intends to fix all of these issues by providing two new functions, correct and safer replacement for :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. For example: -.. code-block:: C +.. code-block:: c if (PyThreadState_Ensure(interp) < 0) { fputs("Python is shutting down", stderr); @@ -89,7 +89,7 @@ the thread: Calling this function from a thread when the runtime is finalizing will terminate the thread, even if the thread was not created by Python. You - can use Py_IsFinalizing() or sys.is_finalizing() to check if the + can use ``Py_IsFinalizing()`` or ``sys.is_finalizing()`` to check if the interpreter is in process of being finalized before calling this function to avoid unwanted termination. @@ -112,7 +112,7 @@ So, all code that needs to work with locks need to detach the thread state. In C, this is almost always done via :c:macro:`Py_BEGIN_ALLOW_THREADS` and :c:macro:`Py_END_ALLOW_THREADS`, in a code block that looks something like this: -.. code-block:: C +.. code-block:: c Py_BEGIN_ALLOW_THREADS acquire_lock(); @@ -126,7 +126,7 @@ thread will continue to run finalizers past that point, though. If any of those finalizers try to acquire the lock, deadlock ensues. This affects the Python core itself, and there's not much that can be done -to fix it. For example, `gh-129536 `_ +to fix it. For example, `python/cpython#129536 `_ remarks that the :mod:`ssl` module can emit a fatal error when used at finalization, because a daemon thread got hung while holding the lock. 
There are workarounds for this for pure-Python code, but native threads don't have @@ -434,7 +434,7 @@ If this were to be called from a daemon thread, then the interpreter could hang the thread while reattaching the thread state, leaving us with the lock held. Any future finalizer that wanted to acquire the lock would be deadlocked! -.. code-block:: C +.. code-block:: c static PyObject * my_critical_operation(PyObject *self, PyObject *unused) @@ -466,7 +466,7 @@ Transitioning from old functions The following code uses the old ``PyGILState`` APIs: -.. code-block:: C +.. code-block:: c static int thread_func(void *arg) @@ -500,7 +500,7 @@ The following code uses the old ``PyGILState`` APIs: This is the same code, updated to use the new functions: -.. code-block:: C +.. code-block:: c static int thread_func(void *arg) From 8f3dbb479ca862adee484f3aa682c39f27560cee Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 12:42:07 +0000 Subject: [PATCH 33/51] Use bullets. --- peps/pep-0788.rst | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 9f9d3e73e78..a0c8bcd411f 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -553,18 +553,19 @@ deleted and cause use-after-free violations. :c:func:`PyInterpreterState_Hold` fixes this issue anyway, but an interpreter ID does have the benefit of requiring less magic in the implementation, but has several downsides: -1. Nearly all existing APIs already return a :c:type:`PyInterpreterState` +- Nearly all existing APIs already return a :c:type:`PyInterpreterState` pointer, not an interpreter ID. Functions like :c:func:`PyThreadState_GetInterpreter` would have to be accompanied by frustrating calls to :c:func:`PyInterpreterState_GetID`. There's also no existing way to go from an ``int64_t`` back to a :c:expr:`PyInterpreterState *`, and providing such an API would come with its own set of design problems. -2. 
Threads typically take a ``void *arg`` parameter, not an ``int64_t arg``. +- Threads typically take a ``void *arg`` parameter, not an ``int64_t arg``. As such, passing an interpreter pointer requires much less boilerplate for the user, because an additional structure definition or heap allocation - would be needed to store the interpreter ID. -3. To retain usability, interpreter ID APIs would still need to keep a + would be needed to store the interpreter ID. This is especially an issue + on 32-bit systems, where ``void *`` is too small for an ``int64_t``. +- To retain usability, interpreter ID APIs would still need to keep a reference count, otherwise the interpreter could be finalizing before the native thread gets a chance to attach. The problem with using an interpreter ID is that the reference count has to be "invisible"; it @@ -589,10 +590,10 @@ make the ownership and lifetime of the thread state clearer: This was ultimately rejected for two reasons: -1. The proposed API has closer usage to +- The proposed API has closer usage to :c:func:`PyGILState_Ensure` / :c:func:`PyGILState_Release`, which helps ease the transition for old codebases. -2. It's `significantly easier `_ +- It's `significantly easier `_ for code-generators like Cython to use, as there isn't any additional complexity with tracking :c:type:`PyThreadState` pointers around. From 5abdb417478d273dbae5ca8d972e7e0c1c642c81 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 12:43:47 +0000 Subject: [PATCH 34/51] Add note about hanging the thread being new. --- peps/pep-0788.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index a0c8bcd411f..675c1f953a6 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -73,7 +73,8 @@ thread when it goes to :term:`attach ` a :term:`thread st making it unusable past that point. 
Attaching a thread state can happen at any point when invoking Python, such as releasing the GIL in-between bytecode instructions, or when a C function exits a :c:macro:`Py_BEGIN_ALLOW_THREADS` -block. +block. (Note that hanging the thread is relatively new behavior; in prior +versions, the thread would terminate, but the issue is the same.) This means that any non-Python thread may be terminated at any point, which is severely limiting for users who want to do more than just execute Python From 9f9eb4cadd5618043dfdb317bb1477201f851aad Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 20:44:43 -0400 Subject: [PATCH 35/51] Clarify some things. --- peps/pep-0788.rst | 90 +++++++++++++++++++++++++++-------------------- 1 file changed, 51 insertions(+), 39 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 675c1f953a6..40409aed261 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -270,46 +270,23 @@ updated to switch over to terms like :term:`"global interpreter lock" `, so this namespace seems to fit well for this PEP. -Specification -============= - -Interpreter reference counting ------------------------------- - -Internally, the interpreter will have to keep track of a reference count -field, which will determine when the interpreter state is actually -deallocated. This is done to prevent use-after-free crashes in -:c:func:`PyThreadState_Ensure` for interpreters with short lifetimes. - -An interpreter state returned by :c:func:`Py_NewInterpreter` (or really, -:c:func:`PyInterpreterState_New`) will start with a reference count of 1, and -:c:func:`PyInterpreterState_Delete` will decrement the reference count. If the -new reference count is zero, :c:func:`PyInterpreterState_Delete` will -deallocate the interpreter state. However, the reference count will *not* -prevent the interpreter from finalizing. - -.. 
c:function:: PyInterpreterState *PyInterpreterState_Hold(void) - - Similar to :c:func:`PyInterpreterState_Get`, but returns a strong - reference to the interpreter (meaning, it has its reference count - incremented by one, allowing the returned interpreter state to be safely - accessed by another thread). - - This function is generally meant to be used in tandem with - :c:func:`PyThreadState_Ensure`. - - The caller must have an :term:`attached thread state`, and cannot return - ``NULL``. Failures are always a fatal error. - - -.. c:function:: void PyInterpreterState_Release(PyInterpreterState *interp) +Preventing interpreter finalization with references +--------------------------------------------------- - Decrement the reference count of the interpreter, as was incremented by - :c:func:`PyInterpreterState_Hold`. +Several iterations of this API have taken an approach where +:c:func:`PyThreadState_Ensure` can return a failure based on the state of +the interpreter. Instead, this PEP takes an approach where an interpreter +keeps track of the number of non-daemon threads, which inherently prevents +it from beginning finalization. - This function cannot fail, other than with a fatal error. The caller must - have an :term:`attached thread state` for *interp*. +The main upside with this approach is that there's more consistency with +attaching threads. Using an interpreter reference from the calling thread +keeps the interpreter from finalizing before the thread starts, ensuring +that it always works. An approach that were to return a failure based on +the start-time of the thread could cause spurious issues. +Specification +============= Daemon and non-daemon threads ----------------------------- @@ -344,8 +321,43 @@ remain daemon by default. :attr:`threading.Thread.daemon`. Return zero on success, non-zero *without* an exception set on failure. - Failure generally means that threads have already finalized for the - current interpreter. 
+ +Interpreter reference counting +------------------------------ + +Internally, the interpreter will have to keep track of the number of +non-daemon native threads, which will determine when the interpreter can +finalize. This is done to prevent use-after-free crashes in +:c:func:`PyThreadState_Ensure` for interpreters with short lifetimes, and +to remove needless layers of synchronization between the calling thread and +the started thread. + +An interpreter state returned by :c:func:`Py_NewInterpreter` (or really, +:c:func:`PyInterpreterState_New`) will start with a native thread countdown. +For simplicity's sake, this will be referred to as a reference count. +A non-zero reference count prevents the interpreter from finalizing. + +.. c:function:: PyInterpreterState *PyInterpreterState_Hold(void) + + Similar to :c:func:`PyInterpreterState_Get`, but returns a strong + reference to the interpreter (meaning, it has its reference count + incremented by one, allowing the returned interpreter state to be safely + accessed by another thread, because it will be prevented from finalizing). + + This function is generally meant to be used in tandem with + :c:func:`PyThreadState_Ensure`. + + The caller must have an :term:`attached thread state`, and cannot return + ``NULL``. Failures are always a fatal error. + + +.. c:function:: void PyInterpreterState_Release(PyInterpreterState *interp) + + Decrement the reference count of the interpreter, as was incremented by + :c:func:`PyInterpreterState_Hold`. + + This function cannot fail, other than with a fatal error. The caller must + have an :term:`attached thread state` for *interp*. Ensuring and releasing thread states ------------------------------------ From 0e897192cc8ee11782cabf1cf5721210f7ead6fa Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 21:14:47 -0400 Subject: [PATCH 36/51] Add an example for daemon threads. 
--- peps/pep-0788.rst | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 40409aed261..bbe065cc1b6 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -548,6 +548,45 @@ This is the same code, updated to use the new functions: } +Daemon thread example +********************* + +Native daemon threads are still a use-case, and as such, +they can still be used with this API: + +.. code-block:: c + + static int + thread_func(void *arg) + { + PyInterpreterState *interp = (PyInterpreterState *)arg; + if (PyThreadState_Ensure(interp) < 0) { + fputs("Cannot talk to Python", stderr); + return -1; + } + (void)PyThreadState_SetDaemon(1); + if (PyRun_SimpleString("print(42)") < 0) { + PyErr_Print(); + } + PyThreadState_Release(); + return 0; + } + + static PyObject * + my_method(PyObject *self, PyObject *unused) + { + PyThread_handle_t handle; + PyThread_ident_t ident; + + PyInterpreterState *interp = PyInterpreterState_Hold(); + if (PyThread_start_joinable_thread(thread_func, interp, &ident, &handle) < 0) { + PyInterpreterState_Release(interp); + return NULL; + } + Py_RETURN_NONE; + } + + Reference Implementation ======================== From 8f77194c8e02de2193d361fa2b9b273bd7d7182c Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 21:37:46 -0400 Subject: [PATCH 37/51] Apply suggestions from code review Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> --- peps/pep-0788.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index bbe065cc1b6..5687f107336 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -96,7 +96,7 @@ the thread: Unfortunately, this isn't correct, because of time-of-call to time-of-use issues; the interpreter might not be finalizing during the call to -:c:func:`Py_IsFinalizing`, but it might start finalizing right after, which +:c:func:`Py_IsFinalizing`, but it 
might start finalizing immediately afterwards, which would cause the attachment of a thread state (typically via :c:func:`PyGILState_Ensure`) to hang the thread. @@ -155,14 +155,14 @@ The existing APIs are broken and misleading ------------------------------------------- There are currently two public ways for a user to create and attach their own -:term:`thread state`; manual use of :c:func:`PyThreadState_New` / :c:func:`PyThreadState_Swap`, -and :c:func:`PyGILState_Ensure`. The former, :c:func:`PyGILState_Ensure`, +:term:`thread state`; manual use of :c:func:`PyThreadState_New` & :c:func:`PyThreadState_Swap`, +and :c:func:`PyGILState_Ensure`. The latter, :c:func:`PyGILState_Ensure`, is `significantly more common `_. :c:func:`PyGILState_Ensure` generally crashes during finalization ***************************************************************** -As of this PEP, the current behavior of :c:func:`PyGILState_Ensure` does not +At the time of writing, the current behavior of :c:func:`PyGILState_Ensure` does not match the documentation. Instead of hanging the thread during finalization as previously noted, it's extremely common for it to crash with a segmentation fault. This is a `known issue `_ From e7887b7420cbd4c902a64812b190473837e956e8 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 21:38:32 -0400 Subject: [PATCH 38/51] Use a reference for the Py_IsFinalizing() section. 
--- peps/pep-0788.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 5687f107336..8e47b7ab355 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -84,7 +84,7 @@ code in their stream of calls (for example, C++ executing finalizers in Using :c:func:`Py_IsFinalizing` is insufficient *********************************************** -The :c:func:`docs ` +The :ref:`docs ` currently recommend :c:func:`Py_IsFinalizing` to guard against termination of the thread: From 569ce3f6d201deb9d68a4f32ae761c91e8e3715d Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 21:40:08 -0400 Subject: [PATCH 39/51] Improve phrasing for 'hanging'. --- peps/pep-0788.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 8e47b7ab355..07eaa60eedc 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -121,10 +121,10 @@ In C, this is almost always done via :c:macro:`Py_BEGIN_ALLOW_THREADS` and Again, in a daemon thread, :c:macro:`Py_END_ALLOW_THREADS` will hang the thread if the interpreter is finalizing. But, :c:macro:`Py_BEGIN_ALLOW_THREADS` will -*not* hang the thread; the lock will be acquired, and *then* hung! Once that -happens, nothing can try to acquire that lock without deadlocking. The main -thread will continue to run finalizers past that point, though. If any of -those finalizers try to acquire the lock, deadlock ensues. +*not* hang the thread; the lock will be acquired, and *then* the thread will +be hung! Once that happens, nothing can try to acquire that lock without +deadlocking. The main thread will continue to run finalizers past that point, +though. If any of those finalizers try to acquire the lock, deadlock ensues. This affects the Python core itself, and there's not much that can be done to fix it. 
For example, `python/cpython#129536 `_ From 42b37d8b30d3458585610559861625f3eca1cc55 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 21:41:23 -0400 Subject: [PATCH 40/51] 'Python core' -> 'CPython' --- peps/pep-0788.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 07eaa60eedc..3a9839ccaf2 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -126,7 +126,7 @@ be hung! Once that happens, nothing can try to acquire that lock without deadlocking. The main thread will continue to run finalizers past that point, though. If any of those finalizers try to acquire the lock, deadlock ensues. -This affects the Python core itself, and there's not much that can be done +This affects CPython itself, and there's not much that can be done to fix it. For example, `python/cpython#129536 `_ remarks that the :mod:`ssl` module can emit a fatal error when used at finalization, because a daemon thread got hung while holding the lock. There From b7a60814b3aa61cc6795847a234e4f7f43da2e20 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 21:44:04 -0400 Subject: [PATCH 41/51] Clarify the abstraction. --- peps/pep-0788.rst | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 3a9839ccaf2..8e674a20788 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -257,7 +257,12 @@ The APIs proposed by this PEP intentionally have a layer of abstraction that is hidden from the user and offloads complexity onto CPython. This is done primarily to help ease the transition from ``PyGILState`` for existing codebases, and for ease-of-use to those who provide wrappers the C API, such -as Cython or PyO3. See also :ref:`pep-788-activate-deactivate-instead`. +as Cython or PyO3. + +In particular, the API hides details about the lifetime of the thread state +and most of the details with interpreter references. 
+ +See also :ref:`pep-788-activate-deactivate-instead`. Bikeshedding and the ``PyThreadState`` namespace ------------------------------------------------ From b918224f2fa705f2a8219343f9735ca71cfb4886 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 21:54:14 -0400 Subject: [PATCH 42/51] Clarify the deprecation. --- peps/pep-0788.rst | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 8e674a20788..3f7a19886eb 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -250,6 +250,12 @@ functions in the current C API: - :c:func:`PyGILState_GetThisThreadState`: :c:func:`PyThreadState_Get` - :c:func:`PyGILState_Check`: ``PyThreadState_GetUnchecked() != NULL`` +This PEP specifies a ten-year deprecation for these functions (while remaining +in the stable ABI), primarily because it's expected that the migration won't be +seamless, due to the new requirement of storing an interpreter state. The +exact details of this deprecation are currently unclear, see +:ref:`pep-788-deprecation`. + A light layer of magic ---------------------- @@ -675,6 +681,8 @@ under that category. Open Issues =========== +.. _pep-788-deprecation: + When should the legacy APIs be removed? --------------------------------------- @@ -684,6 +692,11 @@ Currently, the plan is to remove them in 10 years (opposed to the 5 years required by :pep:`387`), but this is subject to further discussion, as it's unclear if that's enough (or too much) time. +In addition, it's unclear whether to remove them at all. A +:term:`soft deprecation ` could reasonably fit for these +functions if it's determined that a full ``PyGILState`` removal would +be too disruptive for the ecosystem. 
+ Copyright ========= From 4e0c0fdad59e63ddfe12ff23e4392df3b40b5519 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 22:19:21 -0400 Subject: [PATCH 43/51] Add clarification for 'modern versions' --- peps/pep-0788.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 3f7a19886eb..42b6ea3951c 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -183,14 +183,14 @@ created by the authors of this PEP: omit ``PyGILState_Ensure`` in fresh threads. In reality, :c:func:`PyGILState_Ensure` doesn't just "acquire the GIL" in -modern versions. It attaches a :term:`thread state` for the current -thread--*that's* what lets a thread invoke the C API. On with-GIL builds, -holding an :term:`attached thread state` implies holding the GIL, so only one -thread can have one at a time. Free-threaded builds achieve the effect of -multi-core parallism while remaining backwards-compatible by simply removing -that limitation: threads still need a thread state (and thus need to call -:c:func:`PyGILState_Ensure`), but they don't need to wait on one another to -do so. +modern versions (that being since 3.12). It attaches a :term:`thread state` +for the current thread--*that's* what lets a thread invoke the C API. On +with-GIL builds, holding an :term:`attached thread state` implies holding the +GIL, so only one thread can have one at a time. Free-threaded builds achieve +the effect of multi-core parallism while remaining backwards-compatible by +simply removing that limitation: threads still need a thread state (and thus +need to call :c:func:`PyGILState_Ensure`), but they don't need to wait on one +another to do so. 
Subinterpreters don't work with :c:func:`PyGILState_Ensure` ----------------------------------------------------------- From 32c2c794a90186b94ae305d3d21370b6a498a6b0 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Fri, 25 Apr 2025 23:53:11 -0400 Subject: [PATCH 44/51] Reword the GIL term section a bit. --- peps/pep-0788.rst | 20 +++++++++----------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 42b6ea3951c..5b61b676022 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -173,8 +173,8 @@ the existing crashes caused by :c:func:`PyGILState_Ensure`. The term "GIL" is tricky for free-threading ******************************************* -A large issue with the term "GIL" in the C API is that it's semantically -misleading, as noted in `python/cpython#127989 +A large issue with the term "GIL" in the C API is that it is semantically +misleading. This was noted in `python/cpython#127989 `_, created by the authors of this PEP: @@ -182,15 +182,13 @@ created by the authors of this PEP: erroneously call the C API inside ``Py_BEGIN_ALLOW_THREADS`` blocks or omit ``PyGILState_Ensure`` in fresh threads. -In reality, :c:func:`PyGILState_Ensure` doesn't just "acquire the GIL" in -modern versions (that being since 3.12). It attaches a :term:`thread state` -for the current thread--*that's* what lets a thread invoke the C API. On -with-GIL builds, holding an :term:`attached thread state` implies holding the -GIL, so only one thread can have one at a time. Free-threaded builds achieve -the effect of multi-core parallism while remaining backwards-compatible by -simply removing that limitation: threads still need a thread state (and thus -need to call :c:func:`PyGILState_Ensure`), but they don't need to wait on one -another to do so. +Since Python 3.12, it is an :term:`attached thread state` that lets a thread +invoke the C API. 
On with-GIL builds, holding an :term:`attached thread state` +implies holding the GIL, so only one thread can have one at a time. Free-threaded +builds achieve the effect of multi-core parallelism while remaining +backwards-compatible by simply removing that limitation: threads still need a +thread state (and thus need to call :c:func:`PyGILState_Ensure`), but they +don't need to wait on one another to do so. Subinterpreters don't work with :c:func:`PyGILState_Ensure` ----------------------------------------------------------- From 3274da7a866ed8198d514a652d26807f8e70dbab Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Sat, 26 Apr 2025 00:02:02 -0400 Subject: [PATCH 45/51] Improve the abstract. --- peps/pep-0788.rst | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 5b61b676022..b5ed520561e 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -49,6 +49,40 @@ correct and safer replacement for :c:func:`PyGILState_Ensure` and PyThreadState_Release(); +This is achieved by introducing two concepts into the C API: + +- "Daemon" and "non-daemon" threads, similar to how it works in the + :mod:`threading` module. +- Interpreter reference counts which prevent the interpreter from finalizing. + +In :c:func:`PyThreadState_Ensure`, both of these ideas are applied. The +calling thread is to store a reference to the interpreter via +:c:func:`PyInterpreterState_Hold`. :c:func:`PyInterpreterState_Hold` +increases the reference count of the interpreter, requiring the thread +to finish (by eventually calling :c:func:`PyThreadState_Release`) before +beginning finalization. + +As such, creating a native thread with this API would look something +like this: + +.. 
code-block:: c + + static PyObject * + my_method(PyObject *self, PyObject *unused) + { + PyThread_handle_t handle; + PyThread_ident_t ident; + + PyInterpreterState *interp = PyInterpreterState_Hold(); + if (PyThread_start_joinable_thread(thread_func, interp, &ident, &handle) < 0) { + PyInterpreterState_Release(interp); + return NULL; + } + /* The thread will always attach and finish, because we increased + the reference count of the interpreter. */ + Py_RETURN_NONE; + } + Motivation ========== From a8e6672f25363280612f6307b914cba0fa6b654c Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Sat, 26 Apr 2025 00:03:33 -0400 Subject: [PATCH 46/51] Silly pre-commit --- peps/pep-0788.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index b5ed520561e..51ed8fe28db 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -51,7 +51,7 @@ correct and safer replacement for :c:func:`PyGILState_Ensure` and This is achieved by introducing two concepts into the C API: -- "Daemon" and "non-daemon" threads, similar to how it works in the +- "Daemon" and "non-daemon" threads, similar to how it works in the :mod:`threading` module. - Interpreter reference counts which prevent the interpreter from finalizing. @@ -219,7 +219,7 @@ created by the authors of this PEP: Since Python 3.12, it is an :term:`attached thread state` that lets a thread invoke the C API. On with-GIL builds, holding an :term:`attached thread state` implies holding the GIL, so only one thread can have one at a time. Free-threaded -builds achieve the effect of multi-core parallelism while remaining +builds achieve the effect of multi-core parallelism while remaining backwards-compatible by simply removing that limitation: threads still need a thread state (and thus need to call :c:func:`PyGILState_Ensure`), but they don't need to wait on one another to do so. 
From 1861753ffd8e9b269fd17815ae419a1b05e4ecd9 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Sat, 26 Apr 2025 00:11:50 -0400 Subject: [PATCH 47/51] Tiny wording nit. --- peps/pep-0788.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 51ed8fe28db..bad43ccdb29 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -62,7 +62,7 @@ increases the reference count of the interpreter, requiring the thread to finish (by eventually calling :c:func:`PyThreadState_Release`) before beginning finalization. -As such, creating a native thread with this API would look something +For example, creating a native thread with this API would look something like this: .. code-block:: c From b165f302e946f39b96b8af74423454739236bc6c Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Sat, 26 Apr 2025 11:41:51 -0400 Subject: [PATCH 48/51] Add PyInterpreterState_Lookup() --- peps/pep-0788.rst | 95 +++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 88 insertions(+), 7 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index bad43ccdb29..c724deec623 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -328,6 +328,11 @@ keeps the interpreter from finalizing before the thread starts, ensuring that it always works. An approach that were to return a failure based on the start-time of the thread could cause spurious issues. +In the case where it is useful to let the interpreter finalize, such as in +a signal handler where there's no guarantee that the thread will start, +strong references to an interpreter can be acquired through +:c:func:`PyInterpreterState_Lookup`. + Specification ============= @@ -393,14 +398,30 @@ A non-zero reference count prevents the interpreter from finalizing. The caller must have an :term:`attached thread state`, and cannot return ``NULL``. Failures are always a fatal error. +.. 
c:function:: PyInterpreterState *PyInterpreterState_Lookup(int64_t interp_id) + + Similar to :c:func:`PyInterpreterState_Hold`, but looks up an interpreter + based on an ID (see :c:func:`PyInterpreterState_GetID`). This has the + benefit of allowing the interpreter to finalize in cases where the thread + might not start, such as inside of a signal handler. + + This function will return ``NULL`` without an exception set on failure. + If the return value is non-``NULL``, then the returned interpreter will be + prevented from finalizing until the reference is released by + :c:func:`PyThreadState_Release` or :c:func:`PyInterpreterState_Release`. + + Returning ``NULL`` typically means that the interpreter is at a point + where threads cannot start, or no longer exists. + + The caller does not need to have an :term:`attached thread state`. .. c:function:: void PyInterpreterState_Release(PyInterpreterState *interp) Decrement the reference count of the interpreter, as was incremented by - :c:func:`PyInterpreterState_Hold`. + :c:func:`PyInterpreterState_Hold` or :c:func:`PyInterpreterState_Lookup`. - This function cannot fail, other than with a fatal error. The caller must - have an :term:`attached thread state` for *interp*. + This function cannot fail, other than with a fatal error. The caller does + not need to have an :term:`attached thread state` for *interp*. Ensuring and releasing thread states ------------------------------------ @@ -443,7 +464,7 @@ Deprecation of ``PyGILState`` APIs ---------------------------------- This PEP deprecates all of the existing ``PyGILState`` APIs in favor of the -new ``PyThreadState`` APIs for the reasons given in the motivation. Namely: +new ``PyThreadState`` APIs for the reasons given in the Motivation_. Namely: - :c:func:`PyGILState_Ensure`: use :c:func:`PyThreadState_Ensure` instead. - :c:func:`PyGILState_Release`: use :c:func:`PyThreadState_Release` instead. 
@@ -629,6 +650,64 @@ they can still be used with this API: Py_RETURN_NONE; } +Asynchronous callback example +***************************** + +As started in the Motivation_, there are many cases where it's desirable +to call Python in an asynchronous callback, such as a signal handler. In that +case, it's not safe to call :c:func:`PyInterpreterState_Hold`, because it's +not guaranteed that :c:func:`PyThreadState_Ensure` will ever be called, which +would deadlock finalization. + +This scenario requires :c:func:`PyInterpreterState_Lookup`, which only prevents +finalization when the lookup has been made. + +For example: + +.. code-block:: c + + typedef struct { + int64_t interp_id; + } pyrun_t; + + static int + async_callback(void *arg) + { + pyrun_t *data = (pyrun_t *)arg; + PyInterpreterState *interp = PyInterpreterState_Lookup(data->interp_id); + PyMem_RawFree(data); + if (interp == NULL) { + fputs("Python has shut down", stderr); + return -1; + } + if (PyThreadState_Ensure(interp) < 0) { + fputs("Cannot talk to Python", stderr); + return -1; + } + if (PyRun_SimpleString("print(42)") < 0) { + PyErr_Print(); + } + PyThreadState_Release(); + return 0; + } + + static PyObject * + setup_callback(PyObject *self, PyObject *unused) + { + PyThread_handle_t handle; + PyThread_ident_t ident; + + pyrun_t *data = PyMem_RawMalloc(sizeof(pyrun_t)); + if (data == NULL) { + return PyErr_NoMemory(); + } + // Weak reference to the interpreter. It won't wait on the callback + // to finalize. 
+ data->interp_id = PyInterpreterState_GetID(PyInterpreterState_Get()); + register_callback(async_callback, data); + + Py_RETURN_NONE; + } Reference Implementation ======================== @@ -639,8 +718,8 @@ A reference implementation of this PEP can be found Rejected Ideas ============== -Using an interpreter ID instead of a interpreter state ------------------------------------------------------- +Using an interpreter ID instead of a interpreter state for :c:func:`PyThreadState_Ensure` +----------------------------------------------------------------------------------------- Some iterations of this API took an ``int64_t interp_id`` parameter instead of ``PyInterpreterState *interp``, because interpreter IDs cannot be concurrently @@ -667,7 +746,9 @@ requiring less magic in the implementation, but has several downsides: must be tracked elsewhere in the interpreter, likely being *more* complex than :c:func:`PyInterpreterState_Hold`. There's also a lack of intuition that a standalone integer could have such a thing as - a reference count. + a reference count. :c:func:`PyInterpreterState_Lookup` sidesteps this + problem because the reference count is always associated with the returned + interpreter state, not the integer ID. .. _pep-788-activate-deactivate-instead: From 818b89e7bbbad891a1f5cd326e0597c1b299bc07 Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Sat, 26 Apr 2025 11:42:15 -0400 Subject: [PATCH 49/51] Adjust function pairs. --- peps/pep-0788.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index c724deec623..96b664dad55 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -277,8 +277,8 @@ beyond repair. 
In turn, this PEP intends to completely deprecate the existing is rejected, all of the APIs can be replaced with more correct ``PyThreadState`` functions in the current C API: -- :c:func:`PyGILState_Ensure`: :c:func:`PyThreadState_Swap` / :c:func:`PyThreadState_New` -- :c:func:`PyGILState_Release`: :c:func:`PyThreadState_Clear` / :c:func:`PyThreadState_Delete` +- :c:func:`PyGILState_Ensure`: :c:func:`PyThreadState_Swap` & :c:func:`PyThreadState_New` +- :c:func:`PyGILState_Release`: :c:func:`PyThreadState_Clear` & :c:func:`PyThreadState_Delete` - :c:func:`PyGILState_GetThisThreadState`: :c:func:`PyThreadState_Get` - :c:func:`PyGILState_Check`: ``PyThreadState_GetUnchecked() != NULL`` @@ -767,7 +767,7 @@ make the ownership and lifetime of the thread state clearer: This was ultimately rejected for two reasons: - The proposed API has closer usage to - :c:func:`PyGILState_Ensure` / :c:func:`PyGILState_Release`, which helps + :c:func:`PyGILState_Ensure` & :c:func:`PyGILState_Release`, which helps ease the transition for old codebases. - It's `significantly easier `_ for code-generators like Cython to use, as there isn't any additional From ab11dd3628a9c9d1b601eb5a13a072b108b9d94b Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Sun, 27 Apr 2025 06:53:23 -0400 Subject: [PATCH 50/51] Apply suggestions from code review Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> --- peps/pep-0788.rst | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 96b664dad55..9134f4defcb 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -115,8 +115,8 @@ is severely limiting for users who want to do more than just execute Python code in their stream of calls (for example, C++ executing finalizers in *addition* to calling Python). 
-Using :c:func:`Py_IsFinalizing` is insufficient -*********************************************** +Using ``Py_IsFinalizing`` is insufficient +***************************************** The :ref:`docs ` currently recommend :c:func:`Py_IsFinalizing` to guard against termination of @@ -167,8 +167,8 @@ finalization, because a daemon thread got hung while holding the lock. There are workarounds for this for pure-Python code, but native threads don't have such an option. -We can't change finalization behavior for :c:func:`PyGILState_Ensure` -********************************************************************* +We can't change finalization behavior for ``PyGILState_Ensure`` +*************************************************************** There will always have to be a point in a Python program where :c:func:`PyGILState_Ensure` can no longer acquire the GIL (or more correctly, @@ -193,8 +193,8 @@ There are currently two public ways for a user to create and attach their own and :c:func:`PyGILState_Ensure`. The latter, :c:func:`PyGILState_Ensure`, is `significantly more common `_. -:c:func:`PyGILState_Ensure` generally crashes during finalization -***************************************************************** +``PyGILState_Ensure`` generally crashes during finalization +*********************************************************** At the time of writing, the current behavior of :c:func:`PyGILState_Ensure` does not match the documentation. Instead of hanging the thread during finalization @@ -217,15 +217,15 @@ created by the authors of this PEP: omit ``PyGILState_Ensure`` in fresh threads. Since Python 3.12, it is an :term:`attached thread state` that lets a thread -invoke the C API. On with-GIL builds, holding an :term:`attached thread state` +invoke the C API. On with-GIL builds, holding an attached thread state implies holding the GIL, so only one thread can have one at a time. 
Free-threaded builds achieve the effect of multi-core parallelism while remaining backwards-compatible by simply removing that limitation: threads still need a thread state (and thus need to call :c:func:`PyGILState_Ensure`), but they don't need to wait on one another to do so. -Subinterpreters don't work with :c:func:`PyGILState_Ensure` ------------------------------------------------------------ +Subinterpreters don't work with ``PyGILState_Ensure`` +----------------------------------------------------- As noted in the :ref:`documentation `, ``PyGILState`` APIs aren't officially supported in subinterpreters: @@ -443,12 +443,12 @@ replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. Thread states created by this function are non-daemon by default. See :c:func:`PyThreadState_SetDaemon`. If the calling thread already has an - :term:`attached thread state` that matches *interp*, then this function + attached thread state that matches *interp*, then this function will mark the existing thread state as non-daemon and return. It will be restored to its prior daemon status upon the next :c:func:`PyThreadState_Release` call. - Return zero on success, and non-zero with the old :term:`attached thread state` + Return zero on success, and non-zero with the old attached thread state restored (which may have been ``NULL``). .. c:function:: void PyThreadState_Release() @@ -457,7 +457,7 @@ replace :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`. :c:func:`PyThreadState_Ensure`. This function cannot fail, but may hang the thread if the - :term:`attached thread state` prior to the original :c:func:`PyThreadState_Ensure` + attached thread state prior to the original :c:func:`!PyThreadState_Ensure` was daemon and the interpreter was finalized. 
Deprecation of ``PyGILState`` APIs @@ -653,7 +653,7 @@ they can still be used with this API: Asynchronous callback example ***************************** -As started in the Motivation_, there are many cases where it's desirable +As stated in the Motivation_, there are many cases where it's desirable to call Python in an asynchronous callback, such as a signal handler. In that case, it's not safe to call :c:func:`PyInterpreterState_Hold`, because it's not guaranteed that :c:func:`PyThreadState_Ensure` will ever be called, which @@ -718,8 +718,8 @@ A reference implementation of this PEP can be found Rejected Ideas ============== -Using an interpreter ID instead of a interpreter state for :c:func:`PyThreadState_Ensure` ------------------------------------------------------------------------------------------ +Using an interpreter ID instead of a interpreter state for ``PyThreadState_Ensure`` +----------------------------------------------------------------------------------- Some iterations of this API took an ``int64_t interp_id`` parameter instead of ``PyInterpreterState *interp``, because interpreter IDs cannot be concurrently @@ -773,8 +773,8 @@ This was ultimately rejected for two reasons: for code-generators like Cython to use, as there isn't any additional complexity with tracking :c:type:`PyThreadState` pointers around. 
-Using ``PyStatus`` for the return value of :c:func:`PyThreadState_Ensure` -------------------------------------------------------------------------- +Using ``PyStatus`` for the return value of ``PyThreadState_Ensure`` +------------------------------------------------------------------- In prior iterations of this API, :c:func:`PyThreadState_Ensure` returned a :c:type:`PyStatus` instead of an integer to denote failures, which had the From dbe158035fad3599ced5cf917475774ce231312e Mon Sep 17 00:00:00 2001 From: Peter Bierma Date: Sun, 27 Apr 2025 06:54:32 -0400 Subject: [PATCH 51/51] PyGILState_Ensure() -> PyGILState_Release() --- peps/pep-0788.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0788.rst b/peps/pep-0788.rst index 9134f4defcb..93ceb04e3c5 100644 --- a/peps/pep-0788.rst +++ b/peps/pep-0788.rst @@ -13,7 +13,7 @@ Post-History: `10-Mar-2025 `__ Abstract ======== -:c:func:`PyGILState_Ensure`, :c:func:`PyGILState_Ensure`, and other related +:c:func:`PyGILState_Ensure`, :c:func:`PyGILState_Release`, and other related functions in the ``PyGILState`` family are the most common way to create native threads that interact with Python. They have been the standard for over twenty years (:pep:`311`). But, over time, these functions have