PEP 768: Add some clarifications for buffer size (#4194)

pablogsal · web-flow · commit c67f3b539273 · 2025-01-09T21:37:30.000Z
diff --git a/peps/pep-0768.rst b/peps/pep-0768.rst
@@ -136,20 +136,19 @@ A new structure is added to PyThreadState to support remote debugging:
 
     typedef struct _remote_debugger_support {
         int debugger_pending_call;
-        char debugger_script[MAX_SCRIPT_SIZE];
+        char debugger_script_path[MAX_SCRIPT_PATH_SIZE];
     } _PyRemoteDebuggerSupport;
 
-
 This structure is appended to ``PyThreadState``, adding only a few fields that
 are **never accessed during normal execution**. The ``debugger_pending_call`` field
 indicates when a debugger has requested execution, while ``debugger_script``
 provides Python code to be executed when the interpreter reaches a safe point.
 
-The value for ``MAX_SCRIPT_SIZE`` will be a trade-off between binary size and
-how big debugging scripts can be. As most of the logic should be in libraries
-and arbitrary code can be executed with very short amount of Python we are
-proposing to start with 4kb initially. This value can be extended in the future
-if we ever need to.
+The value for ``MAX_SCRIPT_PATH_SIZE`` will be a trade-off between binary size
+and how long debugging scripts' paths can be. To limit the memory overhead per
+thread, we will limit this to 512 bytes. This size will also be provided as
+part of the debugger support structure so debuggers know how much they can
+write. This value can be extended in the future if we ever need to.
 
 
 Debug Offsets Table
@@ -168,18 +167,20 @@ debugger support:
 .. code-block:: C
 
     struct _debugger_support {
-        uint64_t eval_breaker;            // Location of the eval breaker flag
-        uint64_t remote_debugger_support; // Offset to our support structure
-        uint64_t debugger_pending_call;   // Where to write the pending flag
-        uint64_t debugger_script;         // Where to write the script path
+        uint64_t eval_breaker;              // Location of the eval breaker flag
+        uint64_t remote_debugger_support;   // Offset to our support structure
+        uint64_t debugger_pending_call;     // Where to write the pending flag
+        uint64_t debugger_script_path;      // Where to write the script path
+        uint64_t debugger_script_path_size; // Size of the script path buffer
     } debugger_support;
 
 These offsets allow debuggers to locate critical debugging control structures in
 the target process's memory space. The ``eval_breaker`` and ``remote_debugger_support``
 offsets are relative to each ``PyThreadState``, while the ``debugger_pending_call``
 and ``debugger_script`` offsets are relative to each ``_PyRemoteDebuggerSupport``
 structure, allowing the new structure and its fields to be found regardless of
-where they are in memory.
+where they are in memory. ``debugger_script_path_size`` informs the attaching
+tool of the size of the buffer.
 
 Attachment Protocol
 -------------------
@@ -201,7 +202,7 @@ When a debugger wants to attach to a Python process, it follows these steps:
 
    - Write a filename containing Python code to be executed into the
      ``debugger_script`` field in ``_PyRemoteDebuggerSupport``.
-   - Set ``debugger_pending_call`` flag in ``_PyRemoteDebuggerSupport``
+   - Set ``debugger_pending_call`` flag in ``_PyRemoteDebuggerSupport`` to 1
    - Set ``_PY_EVAL_PLEASE_STOP_BIT`` in the ``eval_breaker`` field
 
 Once the interpreter reaches the next safe point, it will execute the script
@@ -224,6 +225,9 @@ the interpreter will execute the provided debugging code at the next safe point.
 This all happens in a completely safe context, since the interpreter is
 guaranteed to be in a consistent state whenever the eval breaker is checked.
 
+The only valid values for ``debugger_pending_call`` will initially be 0 and 1;
+all other values are reserved for future use.
+
 An audit event will be raised before the code is executed, allowing this mechanism
 to be audited or disabled if desired by a system's administrator.
 
@@ -464,6 +468,21 @@ in the file path to point to somewhere attacker controlled, this would allow
 them to force their malicious code to be executed rather than the code the
 debugger intends to run.
 
+Using a Single Runtime Buffer
+-----------------------------
+
+During the review of this PEP, it was suggested that a single shared buffer at
+the runtime level be used for all debugger communications. While this appeared
+simpler and required less memory, we found that it would prevent scenarios
+where multiple debuggers need to coordinate operations across different threads,
+or where a single debugger needs to orchestrate complex debugging operations. A
+single shared buffer would force serialization of all debugging operations,
+making it impossible for debuggers to work independently on different threads.
+
+The per-thread buffer approach, despite its memory overhead in highly threaded
+applications, enables these important debugging scenarios by allowing each
+debugger to communicate independently with its target thread.
+
 Thanks
 ======
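
Reviewer sketch (not part of the patch): the hunks above add a fixed-size ``debugger_script_path`` buffer and publish its size through ``debugger_script_path_size`` so that an attaching tool knows how much it can write. The following is a minimal, in-process illustration of the bounds check a tool might perform before writing the path and setting the pending flag to 1. The helper name ``request_script`` is hypothetical, the struct is a local copy of the one shown in the diff, and the requirement that the path be NUL-terminated inside the buffer is an assumption. A real debugger would write into another process's memory and would also set ``_PY_EVAL_PLEASE_STOP_BIT``, neither of which is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SCRIPT_PATH_SIZE 512  /* value proposed in the PEP text */

/* Local copy of the structure from the diff, for illustration only. */
typedef struct _remote_debugger_support {
    int debugger_pending_call;
    char debugger_script_path[MAX_SCRIPT_PATH_SIZE];
} _PyRemoteDebuggerSupport;

/*
 * Hypothetical helper (not a CPython API): copy `path` into the support
 * structure and raise the pending flag. `buffer_size` stands in for the
 * value a tool would read from `debugger_script_path_size`.
 * Returns 0 on success, -1 if the path does not fit.
 */
static int
request_script(_PyRemoteDebuggerSupport *support, size_t buffer_size,
               const char *path)
{
    assert(support != NULL && path != NULL);

    size_t len = strlen(path);
    /* Assumption: the terminating NUL must fit inside the buffer too. */
    if (len + 1 > buffer_size) {
        return -1;
    }

    /* Write the path first, then the flag, mirroring the protocol order. */
    memcpy(support->debugger_script_path, path, len + 1);
    support->debugger_pending_call = 1;  /* 0 and 1 are the only valid values */
    return 0;
}
```

Passing the size as a parameter, rather than hard-coding 512, reflects the stated intent that the buffer size may change in the future and should be read from the offsets table.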