@codeflash-ai codeflash-ai bot commented Jan 30, 2026

📄 17% (0.17x) speedup for bit_get_int in aerospike_helpers/operations/bitwise_operations.py

⏱️ Runtime: 608 microseconds → 520 microseconds (best of 5 runs)

📝 Explanation and details

The optimization achieves a 17% runtime improvement by eliminating repeated attribute lookups of aerospike.OP_BIT_GET_INT.

Key Change:
The constant aerospike.OP_BIT_GET_INT is cached at module level as _OP_BIT_GET_INT. Each call now resolves a single module-global name instead of resolving the global name aerospike and then looking up an attribute on it.
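
The change described above can be sketched as follows. This is a minimal illustration rather than the PR's actual diff: `aerospike` here is a stand-in namespace with an arbitrary constant value, the function names `bit_get_int_original` and `bit_get_int_cached` are hypothetical, and the dictionary keys are the ones the generated regression tests in this PR expect.

```python
from types import SimpleNamespace

# Stand-in for the real `aerospike` module (assumption: the value 1 is arbitrary).
aerospike = SimpleNamespace(OP_BIT_GET_INT=1)


def bit_get_int_original(bin_name, bit_offset, bit_size, sign):
    # Resolves the global `aerospike`, then its attribute, on every call.
    return {
        "op": aerospike.OP_BIT_GET_INT,
        "bin": bin_name,
        "bit_offset": bit_offset,
        "bit_size": bit_size,
        "sign": sign,
    }


# Read the constant once, when the module is imported.
_OP_BIT_GET_INT = aerospike.OP_BIT_GET_INT


def bit_get_int_cached(bin_name, bit_offset, bit_size, sign):
    # Resolves a single module-global name on every call.
    return {
        "op": _OP_BIT_GET_INT,
        "bin": bin_name,
        "bit_offset": bit_offset,
        "bit_size": bit_size,
        "sign": sign,
    }
```

Both variants build a fresh dictionary per call, so callers that mutate a returned operation are unaffected by the change.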

Why This Works:
In Python, an attribute lookup such as aerospike.OP_BIT_GET_INT first resolves the global name aerospike and then searches that module's __dict__ at runtime. Caching the constant once at module import time lets each call to bit_get_int() skip the attribute step. A single attribute lookup is cheap, but when the function is called repeatedly (as evidenced by the 1,314 hits in the profiler and the bulk operation tests with 100-500 iterations), the per-call savings of a few hundred nanoseconds add up.
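
One way to see the difference is to compare the bytecode of the two access patterns. The sketch below uses hypothetical names (`mod`, `_CONST`, `via_attribute`, `via_cached`) and a `SimpleNamespace` in place of a real module. Exact opcode names vary between CPython versions, so it only checks whether an attribute-load instruction is present.

```python
import dis
from types import SimpleNamespace

mod = SimpleNamespace(CONST=1)  # hypothetical stand-in for an imported module
_CONST = mod.CONST              # cached once at module level


def via_attribute():
    # Global name lookup for `mod`, then an attribute lookup for CONST.
    return mod.CONST


def via_cached():
    # Global name lookup for `_CONST` only.
    return _CONST


def opnames(func):
    """Return the list of opcode names in func's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]


attr_ops = opnames(via_attribute)
cached_ops = opnames(via_cached)
print("attribute access:", attr_ops)
print("cached access:   ", cached_ops)
```

On CPython, the attribute version should include a `LOAD_ATTR` instruction that the cached version lacks.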

Performance Impact:
The line profiler shows the dictionary construction line improved from 844,072 ns to 799,935 ns (5% faster on that line alone). The per-test timings show improvements of roughly 10% to 58% per call. The multi-call tests, which average over many invocations, show:

  • Bulk operations: 100-500 call sequences show 14-17% improvements
  • High-frequency scenarios: tests that make 20-50 sequential calls show 11-15% improvements
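
The headline figures can be checked with a little arithmetic. The sketch below assumes the overall speedup is computed as original / optimized - 1, which matches the reported 17%, and that the line-level figure is a plain time reduction. Both conventions are inferred from the numbers, and the helper names are hypothetical.

```python
def speedup(original, optimized):
    """Relative speedup, assuming the convention original / optimized - 1."""
    return original / optimized - 1


def time_reduction(original, optimized):
    """Fraction of the original time that was saved."""
    return 1 - optimized / original


overall = speedup(608, 520)                     # total runtime in microseconds
line_level = time_reduction(844_072, 799_935)   # profiler line time in ns

print(f"overall speedup: {overall:.1%}")          # 16.9%
print(f"line-level reduction: {line_level:.1%}")  # 5.2%
```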

Workload Suitability:
This optimization is particularly effective for:

  • Applications that generate many bitwise operations in tight loops
  • Batch processing scenarios where bit_get_int() is called hundreds of times
  • Performance-critical paths where microsecond savings matter at scale
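
To check whether a given workload benefits, a micro-benchmark along these lines may help. It is a sketch under stated assumptions: `aerospike` is a stand-in namespace, not the real client module, the builder functions are hypothetical, and the absolute timings depend on the interpreter and machine.

```python
import timeit
from types import SimpleNamespace

# Stand-in for the real `aerospike` module (assumption: arbitrary value).
aerospike = SimpleNamespace(OP_BIT_GET_INT=1)
_OP_BIT_GET_INT = aerospike.OP_BIT_GET_INT


def build_with_lookup(i):
    # Attribute lookup on every call.
    return {"op": aerospike.OP_BIT_GET_INT, "bin": "bin", "bit_offset": i,
            "bit_size": 8, "sign": False}


def build_with_cache(i):
    # Cached module-level constant.
    return {"op": _OP_BIT_GET_INT, "bin": "bin", "bit_offset": i,
            "bit_size": 8, "sign": False}


def run(builder, n=500):
    """Build n operations in a tight loop, as a batch workload might."""
    return [builder(i) for i in range(n)]


# Take the best of 5 repeats to reduce timer noise.
lookup_s = min(timeit.repeat(lambda: run(build_with_lookup), number=200, repeat=5))
cached_s = min(timeit.repeat(lambda: run(build_with_cache), number=200, repeat=5))

print(f"attribute lookup: {lookup_s:.4f} s")
print(f"cached constant:  {cached_s:.4f} s")
```

If the two timings are within noise of each other on the target interpreter, the caching is unlikely to matter for that workload.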

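The optimization maintains identical behavior (all dictionary values, types, and structure are preserved) while providing a measurable runtime reduction through simple constant caching. The one assumption is that aerospike.OP_BIT_GET_INT is not rebound after import, since the cached value is read only once.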

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 139 Passed |
| 🌀 Generated Regression Tests | 1305 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| `test_bitwise_operations.py::TestBitwiseOperations.test_bit_get_int` | 1.78μs | 1.41μs | 26.8%✅ |
| `test_bitwise_operations.py::TestBitwiseOperations.test_bit_get_int_accross_bytes` | 1.73μs | 1.31μs | 32.2%✅ |
| `test_bitwise_operations.py::TestBitwiseOperations.test_bit_get_int_bad_argument_type` | 1.61μs | 1.35μs | 19.1%✅ |
| `test_bitwise_operations.py::TestBitwiseOperations.test_bit_get_int_bad_bin_name` | 1.65μs | 1.16μs | 42.1%✅ |
| `test_bitwise_operations.py::TestBitwiseOperations.test_bit_get_int_bit_offset_out_of_range` | 1.72μs | 1.28μs | 33.9%✅ |
| `test_bitwise_operations.py::TestBitwiseOperations.test_bit_get_int_bit_size_too_large` | 1.66μs | 1.28μs | 29.5%✅ |
| `test_bitwise_operations.py::TestBitwiseOperations.test_bit_get_int_fraction_of_byte` | 1.72μs | 1.31μs | 30.9%✅ |
| `test_bitwise_operations.py::TestBitwiseOperations.test_bit_get_int_multiple_bytes` | 1.62μs | 1.31μs | 23.7%✅ |
| `test_bitwise_operations.py::TestBitwiseOperations.test_bit_get_int_signed` | 1.66μs | 1.30μs | 28.1%✅ |

🌀 Generated Regression Tests
import sys  # used to inject a lightweight 'aerospike' module if not present
from types import ModuleType  # used to create a minimal module object

# function to test
import aerospike
import pytest  # used for our unit tests
from aerospike_helpers.operations.bitwise_operations import bit_get_int

def test_basic_functionality_preserves_values_and_types():
    # Basic successful call with typical inputs.
    codeflash_output = bit_get_int("mybin", 5, 8, True); result = codeflash_output # 1.56μs -> 1.38μs (13.6% faster)

def test_mutation_isolation_between_calls():
    # Ensure each call returns a new dictionary object so mutation of one does not affect another.
    codeflash_output = bit_get_int("b", 0, 1, False); a = codeflash_output # 1.56μs -> 1.35μs (15.5% faster)
    codeflash_output = bit_get_int("b", 0, 1, False); b = codeflash_output # 655ns -> 509ns (28.7% faster)
    # Mutate one result and confirm the other is unaffected.
    a["bin"] = "changed"
    assert b["bin"] == "b"

def test_zero_and_negative_values_are_preserved():
    # The implementation does no validation, so zero and negative numbers should be preserved as provided.
    codeflash_output = bit_get_int("", 0, 0, 0); zero = codeflash_output # 1.60μs -> 1.12μs (42.5% faster)

    codeflash_output = bit_get_int("neg", -10, -5, True); negative = codeflash_output # 898ns -> 789ns (13.8% faster)

def test_non_string_bin_and_float_inputs_are_preserved():
    # The function signature suggests bin_name should be a str, but there is no enforcement.
    # Passing bytes should be preserved exactly.
    b = b"bytes_bin"
    codeflash_output = bit_get_int(b, 7, 3, False); r1 = codeflash_output # 1.64μs -> 1.26μs (30.0% faster)

    # Floating point offsets and sizes (even though semantically unusual) should be returned unchanged.
    codeflash_output = bit_get_int("floaty", 3.5, 2.0, False); r2 = codeflash_output # 1.00μs -> 870ns (15.3% faster)

def test_sign_accepts_non_boolean_values_and_preserves_them():
    # Passing non-boolean truthy/falsy values should be preserved as-is.
    codeflash_output = bit_get_int("s", 1, 1, 1); r_true_like = codeflash_output # 1.63μs -> 1.21μs (34.3% faster)
    codeflash_output = bit_get_int("s", 1, 1, 0); r_false_like = codeflash_output # 664ns -> 552ns (20.3% faster)

def test_many_calls_scalability_and_consistency():
    # Perform a larger number of calls to ensure no performance regressions or state leakage.
    results = []
    N = 500  # well under the 1000-call limit
    for i in range(N):
        # Alternate boolean sign to get variation; use different bin names.
        codeflash_output = bit_get_int(f"bin{i}", i, i % 64, bool(i % 2)); res = codeflash_output # 209μs -> 178μs (17.3% faster)
        results.append(res)

def test_exact_keys_no_extra_fields():
    # Confirms that the function returns exactly the expected fields and no extras.
    codeflash_output = bit_get_int("k", 1, 1, False); r = codeflash_output # 1.72μs -> 1.22μs (41.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import aerospike
import pytest
from aerospike_helpers.operations.bitwise_operations import bit_get_int

def test_basic_unsigned_integer_operation():
    """Test basic bit_get_int operation with unsigned integer."""
    codeflash_output = bit_get_int("test_bin", 0, 8, False); result = codeflash_output # 2.39μs -> 1.73μs (38.4% faster)

def test_basic_signed_integer_operation():
    """Test basic bit_get_int operation with signed integer."""
    codeflash_output = bit_get_int("my_bin", 16, 32, True); result = codeflash_output # 1.52μs -> 1.38μs (10.5% faster)

def test_operation_code_is_correct():
    """Test that the operation code is set to OP_BIT_GET_INT."""
    codeflash_output = bit_get_int("bin", 0, 8, False); result = codeflash_output # 1.61μs -> 1.13μs (42.8% faster)

def test_different_bit_offsets():
    """Test bit_get_int with various bit_offset values."""
    test_offsets = [0, 1, 8, 16, 32, 64, 128, 256, 512, 1024]
    
    for offset in test_offsets:
        codeflash_output = bit_get_int("bin", offset, 8, False); result = codeflash_output # 5.92μs -> 5.18μs (14.3% faster)

def test_different_bit_sizes():
    """Test bit_get_int with various bit_size values."""
    test_sizes = [1, 4, 8, 16, 32, 48, 64]
    
    for size in test_sizes:
        codeflash_output = bit_get_int("bin", 0, size, False); result = codeflash_output # 4.25μs -> 3.68μs (15.4% faster)

def test_empty_bin_name():
    """Test bit_get_int with empty string as bin name."""
    codeflash_output = bit_get_int("", 0, 8, False); result = codeflash_output # 1.56μs -> 1.26μs (23.7% faster)

def test_long_bin_name():
    """Test bit_get_int with a very long bin name."""
    long_name = "a" * 1000
    codeflash_output = bit_get_int(long_name, 0, 8, False); result = codeflash_output # 1.64μs -> 1.28μs (27.7% faster)

def test_special_characters_in_bin_name():
    """Test bit_get_int with special characters in bin name."""
    special_names = [
        "bin-with-dashes",
        "bin_with_underscores",
        "bin.with.dots",
        "bin:with:colons",
        "bin@with#special$chars",
        "日本語bin",
        "bin\twith\ttabs",
    ]
    
    for name in special_names:
        codeflash_output = bit_get_int(name, 0, 8, False); result = codeflash_output # 4.75μs -> 3.96μs (20.0% faster)

def test_sign_parameter_true():
    """Test bit_get_int with sign parameter set to True."""
    codeflash_output = bit_get_int("bin", 0, 8, True); result = codeflash_output # 1.65μs -> 1.30μs (26.4% faster)

def test_sign_parameter_false():
    """Test bit_get_int with sign parameter set to False."""
    codeflash_output = bit_get_int("bin", 0, 8, False); result = codeflash_output # 1.50μs -> 1.26μs (19.1% faster)

def test_zero_bit_offset():
    """Test bit_get_int with zero bit_offset."""
    codeflash_output = bit_get_int("bin", 0, 8, False); result = codeflash_output # 1.66μs -> 1.23μs (35.4% faster)

def test_large_bit_offset():
    """Test bit_get_int with large bit_offset value."""
    large_offset = 1000000000
    codeflash_output = bit_get_int("bin", large_offset, 8, False); result = codeflash_output # 1.68μs -> 1.24μs (35.4% faster)

def test_minimum_bit_size():
    """Test bit_get_int with minimum bit_size of 1."""
    codeflash_output = bit_get_int("bin", 0, 1, False); result = codeflash_output # 1.60μs -> 1.26μs (27.7% faster)

def test_large_bit_size():
    """Test bit_get_int with large bit_size value."""
    large_size = 1000000
    codeflash_output = bit_get_int("bin", 0, large_size, False); result = codeflash_output # 1.64μs -> 1.29μs (26.7% faster)

def test_combined_large_parameters():
    """Test bit_get_int with both large offset and size."""
    codeflash_output = bit_get_int("bin", 1000000, 100000, True); result = codeflash_output # 1.70μs -> 1.27μs (33.8% faster)

def test_negative_bit_offset():
    """Test bit_get_int with negative bit_offset."""
    # Negative offsets should still be accepted by the function
    # (server will validate if they're valid)
    codeflash_output = bit_get_int("bin", -1, 8, False); result = codeflash_output # 1.79μs -> 1.14μs (57.7% faster)

def test_negative_bit_size():
    """Test bit_get_int with negative bit_size."""
    # Negative sizes should still be accepted by the function
    # (server will validate if they're valid)
    codeflash_output = bit_get_int("bin", 0, -8, False); result = codeflash_output # 1.74μs -> 1.21μs (43.9% faster)

def test_zero_bit_size():
    """Test bit_get_int with zero bit_size."""
    codeflash_output = bit_get_int("bin", 0, 0, False); result = codeflash_output # 1.60μs -> 1.29μs (23.6% faster)

def test_return_dictionary_keys_exact():
    """Test that returned dictionary has exactly the expected keys."""
    codeflash_output = bit_get_int("bin", 0, 8, False); result = codeflash_output # 1.65μs -> 1.30μs (27.1% faster)
    
    expected_keys = {"op", "bin", "bit_offset", "bit_size", "sign"}
    actual_keys = set(result.keys())
    assert actual_keys == expected_keys

def test_return_value_immutability_of_parameters():
    """Test that the returned dictionary preserves original parameter values."""
    bin_name = "original_bin"
    offset = 42
    size = 16
    sign = True
    
    codeflash_output = bit_get_int(bin_name, offset, size, sign); result = codeflash_output # 1.69μs -> 1.28μs (32.4% faster)

def test_multiple_operations_independent():
    """Test that multiple bit_get_int calls produce independent results."""
    codeflash_output = bit_get_int("bin1", 0, 8, False); op1 = codeflash_output # 1.72μs -> 1.31μs (31.0% faster)
    codeflash_output = bit_get_int("bin2", 16, 32, True); op2 = codeflash_output # 628ns -> 568ns (10.6% faster)
    
    # Modify one and verify other is unchanged
    op1["bin"] = "modified"
    assert op2["bin"] == "bin2"

def test_bulk_operation_generation():
    """Test generating many bit_get_int operations in bulk."""
    operations = []
    
    # Generate 100 operations with different parameters
    for i in range(100):
        codeflash_output = bit_get_int(f"bin_{i}", i * 8, 8 + (i % 8), i % 2 == 0); op = codeflash_output # 42.4μs -> 37.1μs (14.3% faster)
        operations.append(op)

def test_sequential_operations_same_bin():
    """Test creating sequential operations on the same bin."""
    operations = []
    bin_name = "data_bin"
    
    # Create 50 operations on same bin with different offsets
    for i in range(50):
        codeflash_output = bit_get_int(bin_name, i * 16, 16, False); op = codeflash_output # 21.7μs -> 18.9μs (14.5% faster)
        operations.append(op)
    
    # Verify all reference the same bin
    for op in operations:
        assert op["bin"] == bin_name
    
    # Verify offsets are sequential
    for i, op in enumerate(operations):
        assert op["bit_offset"] == i * 16

def test_large_offset_and_size_combination():
    """Test with maximum reasonable bit offset and size values."""
    codeflash_output = bit_get_int("bin", 999999999, 999999, True); result = codeflash_output # 1.64μs -> 1.35μs (21.1% faster)

def test_single_bit_extraction():
    """Test extracting a single bit."""
    codeflash_output = bit_get_int("bin", 0, 1, False); result = codeflash_output # 1.49μs -> 1.18μs (26.1% faster)

def test_64_bit_integer_extraction():
    """Test extracting a full 64-bit integer."""
    codeflash_output = bit_get_int("bin", 0, 64, True); result = codeflash_output # 1.62μs -> 1.24μs (31.0% faster)

def test_operation_code_consistency():
    """Test that operation code is consistent across all calls."""
    expected_op = aerospike.OP_BIT_GET_INT
    
    for i in range(50):
        codeflash_output = bit_get_int("bin", i, i + 1, i % 2 == 0); result = codeflash_output # 20.8μs -> 18.6μs (11.8% faster)
        assert result["op"] == expected_op

def test_unicode_bin_names():
    """Test bit_get_int with unicode characters in bin names."""
    unicode_names = [
        "日本語",
        "中文",
        "한국어",
        "العربية",
        "עברית",
        "Ελληνικά",
        "Русский",
        "🔥bin🔥",
    ]
    
    for name in unicode_names:
        codeflash_output = bit_get_int(name, 0, 8, False); result = codeflash_output # 4.62μs -> 4.02μs (14.9% faster)

def test_very_long_operation_chain():
    """Test creating a very long chain of operations."""
    operations = []
    
    # Generate 500 operations
    for i in range(500):
        codeflash_output = bit_get_int(
            f"bin_{i % 10}",
            (i * 8) % 1000000,
            (i % 64) + 1,
            i % 2 == 0
        ); op = codeflash_output # 207μs -> 180μs (14.9% faster)
        operations.append(op)
    
    # Verify each has correct structure
    for op in operations:
        assert set(op) == {"op", "bin", "bit_offset", "bit_size", "sign"}

def test_parameter_types_preserved():
    """Test that parameter types are preserved in the result."""
    codeflash_output = bit_get_int("bin", 100, 32, True); result = codeflash_output # 1.60μs -> 1.23μs (29.5% faster)

def test_float_like_integers():
    """Test bit_get_int with integer values that could be floats."""
    # Python allows passing floats that are whole numbers
    codeflash_output = bit_get_int("bin", int(16.0), int(32.0), False); result = codeflash_output # 1.72μs -> 1.25μs (38.0% faster)

def test_boundary_bit_positions():
    """Test bit operations at byte boundaries."""
    # Byte boundaries: 0, 8, 16, 24, 32, etc.
    byte_boundaries = [0, 8, 16, 24, 32, 64, 128, 256, 512]
    
    for boundary in byte_boundaries:
        codeflash_output = bit_get_int("bin", boundary, 8, False); result = codeflash_output # 5.49μs -> 4.51μs (21.7% faster)

def test_non_byte_aligned_operations():
    """Test bit operations that don't align to byte boundaries."""
    non_aligned = [1, 3, 5, 7, 9, 11, 13, 15, 17, 31]
    
    for offset in non_aligned:
        codeflash_output = bit_get_int("bin", offset, 3, False); result = codeflash_output # 5.28μs -> 4.68μs (12.8% faster)

def test_mixed_sign_parameters_sequence():
    """Test a sequence of operations with alternating sign parameters."""
    operations = []
    
    for i in range(20):
        sign = i % 2 == 0
        codeflash_output = bit_get_int("bin", i * 8, 8, sign); op = codeflash_output # 9.22μs -> 8.30μs (11.1% faster)
        operations.append(op)
    
    # Verify alternating sign values
    for i, op in enumerate(operations):
        expected_sign = i % 2 == 0
        assert op["sign"] == expected_sign
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-bit_get_int-ml0ja6yl` and push.

@codeflash-ai codeflash-ai bot requested a review from aseembits93 January 30, 2026 07:00
@codeflash-ai codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) labels Jan 30, 2026