From 3bf2baf0790ce1dd86f59b196e51a7be2e600647 Mon Sep 17 00:00:00 2001
From: faizanoor3001
Date: Tue, 4 Mar 2025 13:31:55 +0200
Subject: [PATCH 1/6] Updated the interpreters.md with the how to add a new bytecode specialization steps

---
 InternalDocs/interpreter.md | 56 +++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/InternalDocs/interpreter.md b/InternalDocs/interpreter.md
index 7195d9c6de575c..3f2d947878b0ed 100644
--- a/InternalDocs/interpreter.md
+++ b/InternalDocs/interpreter.md
@@ -505,6 +505,62 @@ After the last `DEOPT_IF` has passed, a hit should be recorded with
 `STAT_INC(BASE_INSTRUCTION, hit)`.
 After an optimization has been deferred in the adaptive instruction,
 that should be recorded with `STAT_INC(BASE_INSTRUCTION, deferred)`.
+## How to add a new bytecode specialization
+
+Assuming you found an instruction that serves as a good specialization candidate.
+Let's use the example of [`CONTAINS_OP`](../Doc/library/dis.rst#contains_op):
+
+1. Update below in [Python/bytecodes.c](../Python/bytecodes.c)
+
+- Convert `CONTAINS_OP` to a micro-operation (uop) by renaming
+  it to `_CONTAINS_OP` and changing the instruction definition
+  from `inst` to `op`.
+
+  ```c
+  // Before
+  inst(CONTAINS_OP, ...);
+
+  // After
+  op(_CONTAINS_OP, ...);
+  ```
+
+- Add a uop that calls the specializing function `_SPECIALIZE_CONTAINS_OP`.
+  For example.
+
+  ```c
+  specializing op(_SPECIALIZE_CONTAINS_OP, (counter/1, left, right -- left, right)) {
+      #if ENABLE_SPECIALIZATION
+      if (ADAPTIVE_COUNTER_IS_ZERO(counter)) {
+          next_instr = this_instr;
+          _Py_Specialize_ContainsOp(right, next_instr);
+          DISPATCH_SAME_OPARG();
+      }
+      STAT_INC(CONTAINS_OP, deferred);
+      DECREMENT_ADAPTIVE_COUNTER(this_instr[1].cache);
+      #endif  /* ENABLE_SPECIALIZATION */
+  }
+  ```
+
+- The original `CONTAINS_OP` is now a new macro consisting of
+  `_SPECIALIZE_CONTAINS_OP` and `_CONTAINS_OP`.
+
+2. Define the cache structure in [Include/internal/pycore_code.h](../Include/internal/pycore_code.h),
+at the very least, a 16-bit counter is needed.
+
+  ```c
+  typedef struct {
+      uint16_t counter;
+  } _PyContainsOpCache;
+  ```
+
+3. Write the specializing function itself in [Python/specialize.c ](../Python/specialize.c).
+   Refer to any other function in that file for the format.
+4. Remember to update operation stats by calling add_stat_dict in
+   [Python/specialize.c ](../Python/specialize.c).
+5. Add the cache layout in [Lib/opcode.py](../Lib/opcode.py) so that Python's
+   dis module will know how to represent it properly.
+6. Bump magic number in [Include/core/pycore_magic_number.h](../Include/internal/pycore_magic_number.h).
+7. Run ``make regen-all`` on `*nix` or `build.bat --regen` on Windows.
 
 
 Additional resources

From a023a526beab3fdfc0e61f743b54aa49b675bacf Mon Sep 17 00:00:00 2001
From: faizanoor3001
Date: Tue, 4 Mar 2025 21:57:39 +0200
Subject: [PATCH 2/6] Addressed the review comments

---
 InternalDocs/interpreter.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/InternalDocs/interpreter.md b/InternalDocs/interpreter.md
index 3f2d947878b0ed..ef895228ac0a51 100644
--- a/InternalDocs/interpreter.md
+++ b/InternalDocs/interpreter.md
@@ -505,6 +505,7 @@ After the last `DEOPT_IF` has passed, a hit should be recorded with
 `STAT_INC(BASE_INSTRUCTION, hit)`.
 After an optimization has been deferred in the adaptive instruction,
 that should be recorded with `STAT_INC(BASE_INSTRUCTION, deferred)`.
+
 ## How to add a new bytecode specialization
 
 Assuming you found an instruction that serves as a good specialization candidate.
@@ -555,7 +556,7 @@ at the very least, a 16-bit counter is needed.
 
 3. Write the specializing function itself in [Python/specialize.c ](../Python/specialize.c).
    Refer to any other function in that file for the format.
-4. Remember to update operation stats by calling add_stat_dict in
+4. Remember to update operation stats by calling `add_stat_dict` in
    [Python/specialize.c ](../Python/specialize.c).
 5. Add the cache layout in [Lib/opcode.py](../Lib/opcode.py) so that Python's
    dis module will know how to represent it properly.

From cb03b20c0b9458804c1d993d2dde8b16e0b3e11f Mon Sep 17 00:00:00 2001
From: faizanoor3001
Date: Tue, 4 Mar 2025 23:01:48 +0200
Subject: [PATCH 3/6] Update section for Step1, reworded as per review comments for clarity

---
 InternalDocs/interpreter.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/InternalDocs/interpreter.md b/InternalDocs/interpreter.md
index ef895228ac0a51..c84e9a92bdc9c7 100644
--- a/InternalDocs/interpreter.md
+++ b/InternalDocs/interpreter.md
@@ -525,8 +525,7 @@ Let's use the example of [`CONTAINS_OP`](../Doc/library/dis.rst#contains_op):
   op(_CONTAINS_OP, ...);
   ```
 
-- Add a uop that calls the specializing function `_SPECIALIZE_CONTAINS_OP`.
-  For example.
+- Add a uop that calls the specializing function:
 
   ```c
   specializing op(_SPECIALIZE_CONTAINS_OP, (counter/1, left, right -- left, right)) {
@@ -542,8 +541,11 @@ Let's use the example of [`CONTAINS_OP`](../Doc/library/dis.rst#contains_op):
   }
   ```
 
-- The original `CONTAINS_OP` is now a new macro consisting of
-  `_SPECIALIZE_CONTAINS_OP` and `_CONTAINS_OP`.
+- Create a macro for the original bytecode name:
+
+  ```c
+  macro(CONTAINS_OP) = _SPECIALIZE_CONTAINS_OP + _CONTAINS_OP;
+  ```
 
 2. Define the cache structure in [Include/internal/pycore_code.h](../Include/internal/pycore_code.h),
 at the very least, a 16-bit counter is needed.

From d9783ca52b10307ded7ea1fa608a9878ca404e0b Mon Sep 17 00:00:00 2001
From: faizanoor3001
Date: Tue, 4 Mar 2025 23:16:33 +0200
Subject: [PATCH 4/6] Added new lines between steps for better readability

---
 InternalDocs/interpreter.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/InternalDocs/interpreter.md b/InternalDocs/interpreter.md
index c84e9a92bdc9c7..bc1a679bc6a68f 100644
--- a/InternalDocs/interpreter.md
+++ b/InternalDocs/interpreter.md
@@ -558,11 +558,15 @@ at the very least, a 16-bit counter is needed.
 
 3. Write the specializing function itself in [Python/specialize.c ](../Python/specialize.c).
    Refer to any other function in that file for the format.
+
 4. Remember to update operation stats by calling `add_stat_dict` in
    [Python/specialize.c ](../Python/specialize.c).
+
 5. Add the cache layout in [Lib/opcode.py](../Lib/opcode.py) so that Python's
    dis module will know how to represent it properly.
+
 6. Bump magic number in [Include/core/pycore_magic_number.h](../Include/internal/pycore_magic_number.h).
+
 7. Run ``make regen-all`` on `*nix` or `build.bat --regen` on Windows.
 
 

From 852cbaf9d98def8f8cb8041a5d10fa32410e16d4 Mon Sep 17 00:00:00 2001
From: faizanoor3001
Date: Tue, 4 Mar 2025 23:40:42 +0200
Subject: [PATCH 5/6] Updated the text for the steps 3,4,5 as per comment

---
 InternalDocs/interpreter.md | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/InternalDocs/interpreter.md b/InternalDocs/interpreter.md
index bc1a679bc6a68f..0e43fc1634cdcd 100644
--- a/InternalDocs/interpreter.md
+++ b/InternalDocs/interpreter.md
@@ -556,14 +556,13 @@ at the very least, a 16-bit counter is needed.
   } _PyContainsOpCache;
   ```
 
-3. Write the specializing function itself in [Python/specialize.c ](../Python/specialize.c).
-   Refer to any other function in that file for the format.
+3. Write the specializing function itself (`_Py_Specialize_ContainsOp`) in [Python/specialize.c ](../Python/specialize.c).
+Refer to other functions in that file for the pattern.
 
-4. Remember to update operation stats by calling `add_stat_dict` in
-   [Python/specialize.c ](../Python/specialize.c).
+4. Add a call to `add_stat_dict` in `_Py_GetSpecializationStats` which is in [Python/specialize.c ](../Python/specialize.c).
 
 5. Add the cache layout in [Lib/opcode.py](../Lib/opcode.py) so that Python's
-   dis module will know how to represent it properly.
+   `dis` module will know how to represent it properly.
 
 6. Bump magic number in [Include/core/pycore_magic_number.h](../Include/internal/pycore_magic_number.h).
 

From 947a2d4bf993ef9aea8683032bd3fb150b77f96b Mon Sep 17 00:00:00 2001
From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Date: Tue, 14 Oct 2025 11:33:25 +0300
Subject: [PATCH 6/6] Apply suggestions from code review

Co-authored-by: Ken Jin
Co-authored-by: Peter Bierma
Co-authored-by: Brandt Bucher
---
 InternalDocs/interpreter.md | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/InternalDocs/interpreter.md b/InternalDocs/interpreter.md
index 0e43fc1634cdcd..75050cf9c5b648 100644
--- a/InternalDocs/interpreter.md
+++ b/InternalDocs/interpreter.md
@@ -506,16 +506,15 @@ After the last `DEOPT_IF` has passed, a hit should be recorded with
 After an optimization has been deferred in the adaptive instruction,
 that should be recorded with `STAT_INC(BASE_INSTRUCTION, deferred)`.
 
+
 ## How to add a new bytecode specialization
 
-Assuming you found an instruction that serves as a good specialization candidate.
-Let's use the example of [`CONTAINS_OP`](../Doc/library/dis.rst#contains_op):
+Let's say you found an instruction that serves as a good specialization candidate, such as [`CONTAINS_OP`](../Doc/library/dis.rst#contains_op):
 
-1. Update below in [Python/bytecodes.c](../Python/bytecodes.c)
+1. Make necessary changes to the instruction in [Python/bytecodes.c](../Python/bytecodes.c)
 
-- Convert `CONTAINS_OP` to a micro-operation (uop) by renaming
-  it to `_CONTAINS_OP` and changing the instruction definition
-  from `inst` to `op`.
+- Convert the instruction (`CONTAINS_OP`, in our example) to a micro-operation (uop, formally μop) by renaming it to `_INSTRUCTION_NAME` (e.g., `_CONTAINS_OP`) and changing the instruction definition
+from `inst` to `op`.
 
   ```c
   // Before
@@ -532,7 +531,7 @@ Let's use the example of [`CONTAINS_OP`](../Doc/library/dis.rst#contains_op):
       #if ENABLE_SPECIALIZATION
       if (ADAPTIVE_COUNTER_IS_ZERO(counter)) {
           next_instr = this_instr;
-          _Py_Specialize_ContainsOp(right, next_instr);
+          _Py_Specialize_ContainsOp(left, right, next_instr);
           DISPATCH_SAME_OPARG();
       }
       STAT_INC(CONTAINS_OP, deferred);
@@ -541,14 +540,13 @@ Let's use the example of [`CONTAINS_OP`](../Doc/library/dis.rst#contains_op):
   }
   ```
 
-- Create a macro for the original bytecode name:
+- Create a macro for the original instruction name:
 
   ```c
   macro(CONTAINS_OP) = _SPECIALIZE_CONTAINS_OP + _CONTAINS_OP;
   ```
 
-2. Define the cache structure in [Include/internal/pycore_code.h](../Include/internal/pycore_code.h),
-at the very least, a 16-bit counter is needed.
+2. Define the cache structure in [Include/internal/pycore_code.h](../Include/internal/pycore_code.h). It needs to have at least a 16-bit counter field.
 
   ```c
   typedef struct {
@@ -556,15 +554,15 @@ at the very least, a 16-bit counter is needed.
   } _PyContainsOpCache;
   ```
 
-3. Write the specializing function itself (`_Py_Specialize_ContainsOp`) in [Python/specialize.c ](../Python/specialize.c).
+3. Write the specializing function itself (e.g., `_Py_Specialize_ContainsOp`) in [Python/specialize.c ](../Python/specialize.c).
 Refer to other functions in that file for the pattern.
 
-4. Add a call to `add_stat_dict` in `_Py_GetSpecializationStats` which is in [Python/specialize.c ](../Python/specialize.c).
+4. Add a call to `add_stat_dict` in `_Py_GetSpecializationStats` which is in [Python/specialize.c ](../Python/specialize.c).
 
-5. Add the cache layout in [Lib/opcode.py](../Lib/opcode.py) so that Python's
-   `dis` module will know how to represent it properly.
+5. Add the cache layout to [Lib/opcode.py](../Lib/opcode.py) so that the
+`dis` module will know how to represent it properly.
 
-6. Bump magic number in [Include/core/pycore_magic_number.h](../Include/internal/pycore_magic_number.h).
+6. Bump magic number in [Include/internal/pycore_magic_number.h](../Include/internal/pycore_magic_number.h).
 
 7. Run ``make regen-all`` on `*nix` or `build.bat --regen` on Windows.
 