@codeflash-ai codeflash-ai bot commented Jan 30, 2026

📄 5% (0.05x) speedup for bit_rscan in aerospike_helpers/operations/bitwise_operations.py

⏱️ Runtime : 988 microseconds → 938 microseconds (best of 5 runs)

📝 Explanation and details

The optimization achieves a 5% runtime improvement by eliminating a repeated attribute lookup on every function call.

What changed:
The code now caches aerospike.OP_BIT_RSCAN as a module-level constant _OP_BIT_RSCAN instead of performing an attribute lookup through the aerospike module on every invocation of bit_rscan().
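As a rough illustration of the change described above, the sketch below contrasts the two forms. It is not the actual diff: the `aerospike` object is a hypothetical stand-in built with `types.SimpleNamespace`, the constant's value is a placeholder, and the `_before`/`_after` function names are invented for the comparison.

```python
import types

# Hypothetical stand-in for the real `aerospike` extension module.
# The value 0 is a placeholder, not the real constant.
aerospike = types.SimpleNamespace(OP_BIT_RSCAN=0)


def bit_rscan_before(bin_name, bit_offset, bit_size, value):
    # Resolves `aerospike` and then `.OP_BIT_RSCAN` on every call.
    return {
        "op": aerospike.OP_BIT_RSCAN,
        "bin": bin_name,
        "bit_offset": bit_offset,
        "bit_size": bit_size,
        "value": value,
    }


# Resolved once, when the module is imported.
_OP_BIT_RSCAN = aerospike.OP_BIT_RSCAN


def bit_rscan_after(bin_name, bit_offset, bit_size, value):
    # Reads a single module-level name on every call.
    return {
        "op": _OP_BIT_RSCAN,
        "bin": bin_name,
        "bit_offset": bit_offset,
        "bit_size": bit_size,
        "value": value,
    }
```

Both functions return equal dictionaries for the same arguments. One consequence of caching is that if `aerospike.OP_BIT_RSCAN` were reassigned after import (for example by a test that patches it), the cached name would keep the value it saw at import time.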

Why this is faster:
In Python, evaluating aerospike.OP_BIT_RSCAN inside the function takes two steps on every call: a global-name lookup for aerospike, then an attribute lookup on the module object for OP_BIT_RSCAN. Caching the value once at module import time leaves a single global-name lookup for _OP_BIT_RSCAN. The saving per call is small: line profiler data shows the operation assignment line going from 620.3ns to 617.1ns per hit (about 0.5%), while the overall function runtime decreased from 988μs to 938μs.
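The difference between the two lookups can be seen in CPython bytecode. The sketch below assumes CPython and uses a hypothetical `types.SimpleNamespace` stand-in for the `aerospike` module. The helper names (`op_before`, `op_after`, `opnames`) are made up for the example, and the exact instruction list varies between Python versions.

```python
import dis
import types

# Hypothetical stand-in for the real `aerospike` module (placeholder value).
aerospike = types.SimpleNamespace(OP_BIT_RSCAN=0)
_OP_BIT_RSCAN = aerospike.OP_BIT_RSCAN


def op_before():
    # Global lookup of `aerospike`, then attribute lookup of OP_BIT_RSCAN.
    return aerospike.OP_BIT_RSCAN


def op_after():
    # Single global lookup of the cached name.
    return _OP_BIT_RSCAN


def opnames(func):
    """Return the list of bytecode instruction names for a function."""
    return [ins.opname for ins in dis.get_instructions(func)]


before_ops = opnames(op_before)
after_ops = opnames(op_after)
print("before:", before_ops)
print("after: ", after_ops)
```

The uncached version contains a `LOAD_ATTR` instruction that the cached version does not, which is the per-call work the optimization removes.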

Performance characteristics:
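The test results show improvements in most test cases, with gains ranging from 0.6% to 32.6% depending on the specific parameter combinations. A handful of individual measurements are slower (by 0.847% to 7.61%), mostly on the second call within a test, where the measured time is well under a microsecond and timing noise may be of the same order as the saving. The optimization is particularly effective for: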

  • Functions called repeatedly in loops (as seen in test_batch_creation_of_many_operations_is_correct_and_efficient: 4% faster, and test_bit_rscan_bulk_creation_performance: 5% faster)
  • High-frequency calls with varied parameters (multiple 15-20% improvements in individual test cases)
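The percentages in this report appear to be computed relative to the optimized time, that is (original - optimized) / optimized. This is inferred from the published figures rather than taken from Codeflash's documentation, and the helper name `speedup_pct` is made up for the example. Small mismatches against the report are expected because the displayed timings are rounded.

```python
def speedup_pct(original, optimized):
    """Percentage speedup relative to the optimized time.

    Positive means the optimized code was faster, negative means slower.
    Both arguments must use the same time unit.
    """
    return (original - optimized) / optimized * 100.0


# test_bit_rscan_value_false: 1.71μs -> 1.29μs, reported as 32.6% faster
print(f"{speedup_pct(1.71, 1.29):.1f}%")   # 32.6%

# Overall runtime: 988μs -> 938μs, reported as a 5% speedup
print(f"{speedup_pct(988, 938):.1f}%")     # 5.3%

# A repeat call that got slower: 585ns -> 590ns, reported as 0.847% slower
print(f"{speedup_pct(585, 590):.3f}%")     # -0.847%
```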

Why this matters:
The bit_rscan() function is a factory function that builds operation dictionaries. When it is called in tight loops to build batch operations (common in Aerospike workflows), the roughly 5% saving applies to every call, so the absolute time saved grows linearly with the number of operations. The returned dictionaries are unchanged; the only difference is that the constant is read once at import time rather than on each call.
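To check how much the change matters for a given workload, a micro-benchmark along the following lines can be run locally. It is a sketch under stated assumptions: the `aerospike` object is a hypothetical `types.SimpleNamespace` stand-in with a placeholder value, the function and helper names are invented, and the timings it prints will differ from the figures in this report.

```python
import timeit
import types

# Hypothetical stand-in for the real `aerospike` module (placeholder value).
aerospike = types.SimpleNamespace(OP_BIT_RSCAN=0)
_OP_BIT_RSCAN = aerospike.OP_BIT_RSCAN


def bit_rscan_before(bin_name, bit_offset, bit_size, value):
    return {"op": aerospike.OP_BIT_RSCAN, "bin": bin_name,
            "bit_offset": bit_offset, "bit_size": bit_size, "value": value}


def bit_rscan_after(bin_name, bit_offset, bit_size, value):
    return {"op": _OP_BIT_RSCAN, "bin": bin_name,
            "bit_offset": bit_offset, "bit_size": bit_size, "value": value}


def build_batch(factory, n=1000):
    """Build n operation dicts, similar to the bulk-creation test."""
    return [factory(f"bin_{i}", i, i + 8, i % 2 == 0) for i in range(n)]


def best_of(factory, repeat=5, number=20):
    """Best time in seconds for building one batch, over several repeats."""
    timings = timeit.repeat(lambda: build_batch(factory),
                            repeat=repeat, number=number)
    return min(timings) / number


before_s = best_of(bit_rscan_before)
after_s = best_of(bit_rscan_after)
print(f"before: {before_s * 1e6:.1f} us per 1000 ops")
print(f"after:  {after_s * 1e6:.1f} us per 1000 ops")
```

Taking the minimum over several repeats follows the "best of 5 runs" convention used in the report and reduces the influence of scheduling noise.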

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 139 Passed |
| 🌀 Generated Regression Tests | 2367 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| test_bitwise_operations.py::TestBitwiseOperations.test_bit_rscan | 1.74μs | 1.53μs | 13.9%✅ |
| test_bitwise_operations.py::TestBitwiseOperations.test_bit_rscan_across_bytes | 1.62μs | 1.48μs | 9.30%✅ |
| test_bitwise_operations.py::TestBitwiseOperations.test_bit_rscan_bad_bin_name | 1.60μs | 1.43μs | 11.9%✅ |
| test_bitwise_operations.py::TestBitwiseOperations.test_bit_rscan_bit_size_too_large | 1.52μs | 1.39μs | 9.29%✅ |
| test_bitwise_operations.py::TestBitwiseOperations.test_bit_rscan_offset_out_of_range | 1.55μs | 1.38μs | 12.4%✅ |
| test_bitwise_operations.py::TestBitwiseOperations.test_bit_rscan_value_not_found | 1.51μs | 1.36μs | 11.5%✅ |

🌀 Generated Regression Tests
# imports
import aerospike
import pytest  # used for our unit tests

# function to test
# Content taken exactly from: aerospike_helpers/operations/bitwise_operations.py
from aerospike_helpers.operations.bitwise_operations import bit_rscan

BIN_KEY = "bin"
BIT_OFFSET_KEY = "bit_offset"
BIT_SIZE_KEY = "bit_size"
VALUE_KEY = "value"
OP_KEY = "op"

@pytest.mark.parametrize(
    "bin_name,bit_offset,bit_size,value",
    [
        # Typical use: look for 1s starting at offset 0 for 32 bits.
        ("mybin", 0, 32, True),
        # Typical use: look for 0s starting at offset 10 for 16 bits.
        ("other_bin", 10, 16, False),
    ],
)
def test_basic_returns_correct_structure_and_values(bin_name, bit_offset, bit_size, value):
    # Create the operation using the function under test.
    codeflash_output = bit_rscan(bin_name, bit_offset, bit_size, value); op = codeflash_output # 3.20μs -> 2.84μs (12.5% faster)

    # The dictionary must contain exactly the expected keys (no extras, no missing keys).
    expected_keys = {OP_KEY, BIN_KEY, BIT_OFFSET_KEY, BIT_SIZE_KEY, VALUE_KEY}
    assert set(op) == expected_keys

def test_basic_true_and_false_are_distinguished():
    # Ensure True and False are not conflated: produce two ops and check VALUE_KEY distinguishes them.
    codeflash_output = bit_rscan("b", 1, 1, True); op_true = codeflash_output # 1.52μs -> 1.36μs (11.8% faster)
    codeflash_output = bit_rscan("b", 1, 1, False); op_false = codeflash_output # 585ns -> 590ns (0.847% slower)

def test_empty_bin_name_and_zero_size_allowed_and_preserved():
    # An empty string for bin_name is unusual but should be preserved by the helper.
    codeflash_output = bit_rscan("", 5, 0, True); op = codeflash_output # 1.44μs -> 1.29μs (11.2% faster)

def test_negative_and_large_negative_offsets_are_preserved():
    # Negative offsets are not validated by the helper; they should be carried through.
    neg_offset = -123
    codeflash_output = bit_rscan("bin", neg_offset, 8, False); op = codeflash_output # 1.57μs -> 1.33μs (17.7% faster)

def test_non_boolean_value_is_preserved_int_and_string():
    # The function's documentation suggests bool, but implementation does not enforce type.
    # Passing non-bool values should be preserved as-is in the returned dict.
    codeflash_output = bit_rscan("b", 0, 1, 1); op_int = codeflash_output # 1.35μs -> 1.31μs (3.36% faster)
    codeflash_output = bit_rscan("b", 0, 1, "1"); op_str = codeflash_output # 571ns -> 618ns (7.61% slower)

def test_extremely_large_numbers_are_preserved():
    # Test very large integer values for offset and size are preserved (no overflow/coercion).
    big_offset = 10 ** 18
    big_size = 2 ** 50
    codeflash_output = bit_rscan("bigbin", big_offset, big_size, True); op = codeflash_output # 1.64μs -> 1.34μs (22.6% faster)

def test_returned_dicts_are_independent_instances():
    # Calling the factory twice should yield separate dict objects so mutation of one does not affect the other.
    codeflash_output = bit_rscan("mut", 1, 1, True); op1 = codeflash_output # 1.50μs -> 1.29μs (16.5% faster)
    codeflash_output = bit_rscan("mut", 1, 1, True); op2 = codeflash_output # 587ns -> 628ns (6.53% slower)

    # Mutate op1 and verify op2 remains unchanged.
    op1["extra_key"] = "changed"
    assert "extra_key" not in op2

def test_batch_creation_of_many_operations_is_correct_and_efficient():
    # Create many operations (under 1000 as required) to exercise potential performance/scalability issues.
    n = 500  # well under 1000, but large enough to detect silly O(n^2) behaviors in naive code
    ops = []
    for i in range(n):
        # vary parameters so each op is distinct; this also helps ensure no accidental re-use of references.
        ops.append(bit_rscan(f"bin_{i}", i, i + 1, (i % 2 == 0))) # 186μs -> 179μs (4.04% faster)
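    # Ensure all ops have correct op key and preserved parameters
    for i, op in enumerate(ops):
        assert op[OP_KEY] == aerospike.OP_BIT_RSCAN
        assert op[BIN_KEY] == f"bin_{i}"
        assert (op[BIT_OFFSET_KEY], op[BIT_SIZE_KEY], op[VALUE_KEY]) == (i, i + 1, i % 2 == 0)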

def test_no_unexpected_mutation_between_batch_calls():
    # Ensure that creating a batch of operations, then creating another batch, does not share mutable state.
    batch1 = [bit_rscan("b", i, 1, True) for i in range(10)]
    # Mutate every dict in batch1
    for d in batch1:
        d["marker"] = "x"

    # Create a new batch and ensure new dicts do not have the marker
    batch2 = [bit_rscan("b", i, 1, True) for i in range(10)]
    for d in batch2:
        assert "marker" not in d

def test_function_is_pure_factory_no_side_effects_on_parameters():
    # Pass in mutable objects as parameters and ensure they are not mutated by the helper.
    mutable_bin = ["name"]  # unusual type, but used to check for accidental mutation
    mutable_value = {"a": 1}
    codeflash_output = bit_rscan(mutable_bin, 3, 4, mutable_value); op = codeflash_output # 1.48μs -> 1.34μs (10.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import aerospike
import pytest
from aerospike_helpers.operations.bitwise_operations import bit_rscan

def test_bit_rscan_basic_structure():
    """Test that bit_rscan returns a dictionary with the correct structure."""
    codeflash_output = bit_rscan("test_bin", 0, 8, True); result = codeflash_output # 1.62μs -> 1.65μs (2.00% slower)

def test_bit_rscan_correct_operation_type():
    """Test that bit_rscan sets the correct operation type."""
    codeflash_output = bit_rscan("test_bin", 0, 8, True); result = codeflash_output # 1.64μs -> 1.47μs (11.5% faster)

def test_bit_rscan_bin_name_string():
    """Test that bit_rscan correctly stores the bin name."""
    bin_name = "my_binary_bin"
    codeflash_output = bit_rscan(bin_name, 0, 8, True); result = codeflash_output # 1.65μs -> 1.38μs (20.0% faster)

def test_bit_rscan_bit_offset_zero():
    """Test bit_rscan with bit_offset of 0."""
    codeflash_output = bit_rscan("test_bin", 0, 16, True); result = codeflash_output # 1.52μs -> 1.39μs (9.68% faster)

def test_bit_rscan_bit_offset_positive():
    """Test bit_rscan with positive bit_offset."""
    codeflash_output = bit_rscan("test_bin", 10, 16, False); result = codeflash_output # 1.51μs -> 1.32μs (14.4% faster)

def test_bit_rscan_bit_size_standard():
    """Test bit_rscan with standard bit_size values."""
    codeflash_output = bit_rscan("test_bin", 0, 8, True); result = codeflash_output # 1.62μs -> 1.35μs (20.2% faster)

def test_bit_rscan_bit_size_large():
    """Test bit_rscan with large bit_size value."""
    codeflash_output = bit_rscan("test_bin", 0, 1024, True); result = codeflash_output # 1.56μs -> 1.40μs (11.7% faster)

def test_bit_rscan_value_true():
    """Test bit_rscan searching for 1 bits (value=True)."""
    codeflash_output = bit_rscan("test_bin", 0, 8, True); result = codeflash_output # 1.45μs -> 1.36μs (6.54% faster)

def test_bit_rscan_value_false():
    """Test bit_rscan searching for 0 bits (value=False)."""
    codeflash_output = bit_rscan("test_bin", 0, 8, False); result = codeflash_output # 1.71μs -> 1.29μs (32.6% faster)

def test_bit_rscan_various_bin_names():
    """Test bit_rscan with different bin names to ensure flexibility."""
    bin_names = ["bin1", "data", "bits", "binary_data", "x"]
    for bin_name in bin_names:
        codeflash_output = bit_rscan(bin_name, 0, 8, True); result = codeflash_output # 3.38μs -> 3.36μs (0.625% faster)

def test_bit_rscan_multiple_offsets():
    """Test bit_rscan with various bit_offset values."""
    offsets = [0, 5, 10, 32, 100]
    for offset in offsets:
        codeflash_output = bit_rscan("test_bin", offset, 8, True); result = codeflash_output # 3.50μs -> 3.23μs (8.07% faster)

def test_bit_rscan_multiple_sizes():
    """Test bit_rscan with various bit_size values."""
    sizes = [1, 4, 8, 16, 32, 64, 128]
    for size in sizes:
        codeflash_output = bit_rscan("test_bin", 0, size, True); result = codeflash_output # 4.17μs -> 4.00μs (4.15% faster)

def test_bit_rscan_bit_offset_very_large():
    """Test bit_rscan with very large bit_offset value."""
    codeflash_output = bit_rscan("test_bin", 999999, 8, True); result = codeflash_output # 1.56μs -> 1.42μs (9.79% faster)

def test_bit_rscan_bit_size_one():
    """Test bit_rscan with minimum bit_size of 1."""
    codeflash_output = bit_rscan("test_bin", 0, 1, True); result = codeflash_output # 1.62μs -> 1.37μs (17.9% faster)

def test_bit_rscan_bit_size_very_large():
    """Test bit_rscan with very large bit_size value."""
    codeflash_output = bit_rscan("test_bin", 0, 1000000, False); result = codeflash_output # 1.57μs -> 1.33μs (18.4% faster)

def test_bit_rscan_offset_and_size_combined():
    """Test bit_rscan where offset and size result in large ending position."""
    codeflash_output = bit_rscan("test_bin", 500, 500, True); result = codeflash_output # 1.57μs -> 1.33μs (17.5% faster)

def test_bit_rscan_empty_bin_name():
    """Test bit_rscan with empty string bin name."""
    codeflash_output = bit_rscan("", 0, 8, True); result = codeflash_output # 1.54μs -> 1.28μs (20.4% faster)

def test_bit_rscan_bin_name_with_special_characters():
    """Test bit_rscan with special characters in bin name."""
    bin_name = "bin_with-special.chars@123"
    codeflash_output = bit_rscan(bin_name, 0, 8, False); result = codeflash_output # 1.52μs -> 1.44μs (5.28% faster)

def test_bit_rscan_bin_name_unicode():
    """Test bit_rscan with unicode characters in bin name."""
    bin_name = "bin_\u00e9\u00fc\u00f1"
    codeflash_output = bit_rscan(bin_name, 0, 8, True); result = codeflash_output # 1.49μs -> 1.27μs (17.3% faster)

def test_bit_rscan_offset_zero_size_zero():
    """Test bit_rscan with both offset and size as zero."""
    codeflash_output = bit_rscan("test_bin", 0, 0, True); result = codeflash_output # 1.59μs -> 1.31μs (21.1% faster)

def test_bit_rscan_negative_offset():
    """Test bit_rscan with negative bit_offset (edge case - may be error case)."""
    codeflash_output = bit_rscan("test_bin", -1, 8, True); result = codeflash_output # 1.49μs -> 1.24μs (20.9% faster)

def test_bit_rscan_negative_size():
    """Test bit_rscan with negative bit_size (edge case - may be error case)."""
    codeflash_output = bit_rscan("test_bin", 0, -8, False); result = codeflash_output # 1.51μs -> 1.36μs (11.4% faster)

def test_bit_rscan_zero_offset_large_size():
    """Test bit_rscan with zero offset and very large size."""
    codeflash_output = bit_rscan("test_bin", 0, 10000, True); result = codeflash_output # 1.50μs -> 1.35μs (11.1% faster)

def test_bit_rscan_large_offset_small_size():
    """Test bit_rscan with large offset and small size."""
    codeflash_output = bit_rscan("test_bin", 999, 1, False); result = codeflash_output # 1.56μs -> 1.35μs (15.5% faster)

def test_bit_rscan_boolean_value_type_true():
    """Test that value parameter correctly accepts boolean True."""
    codeflash_output = bit_rscan("test_bin", 0, 8, True); result = codeflash_output # 1.62μs -> 1.33μs (21.2% faster)

def test_bit_rscan_boolean_value_type_false():
    """Test that value parameter correctly accepts boolean False."""
    codeflash_output = bit_rscan("test_bin", 0, 8, False); result = codeflash_output # 1.47μs -> 1.34μs (9.44% faster)

def test_bit_rscan_integer_one_as_value():
    """Test bit_rscan with integer 1 instead of True (edge case)."""
    codeflash_output = bit_rscan("test_bin", 0, 8, 1); result = codeflash_output # 1.51μs -> 1.40μs (8.45% faster)

def test_bit_rscan_integer_zero_as_value():
    """Test bit_rscan with integer 0 instead of False (edge case)."""
    codeflash_output = bit_rscan("test_bin", 0, 8, 0); result = codeflash_output # 1.62μs -> 1.35μs (20.1% faster)

def test_bit_rscan_all_keys_present():
    """Test that all required keys are present in the returned dictionary."""
    codeflash_output = bit_rscan("test_bin", 0, 8, True); result = codeflash_output # 1.46μs -> 1.37μs (6.20% faster)
    required_keys = {"op", "bin", "bit_offset", "bit_size", "value"}
    assert required_keys.issubset(result)

def test_bit_rscan_no_extra_keys():
    """Test that the returned dictionary has no unexpected extra keys."""
    codeflash_output = bit_rscan("test_bin", 0, 8, True); result = codeflash_output # 1.55μs -> 1.30μs (19.0% faster)
    expected_keys = {"op", "bin", "bit_offset", "bit_size", "value"}
    assert set(result) == expected_keys

def test_bit_rscan_immutability_of_values():
    """Test that modifying returned dict doesn't affect function behavior on next call."""
    codeflash_output = bit_rscan("test_bin", 0, 8, True); result1 = codeflash_output # 1.51μs -> 1.30μs (15.9% faster)
    result1["op"] = "modified"
    codeflash_output = bit_rscan("test_bin", 0, 8, True); result2 = codeflash_output # 542ns -> 551ns (1.63% slower)

def test_bit_rscan_maximum_offset():
    """Test bit_rscan with maximum practical offset value."""
    max_offset = 2**31 - 1  # 32-bit signed integer max
    codeflash_output = bit_rscan("test_bin", max_offset, 1, True); result = codeflash_output # 1.63μs -> 1.42μs (15.0% faster)

def test_bit_rscan_maximum_size():
    """Test bit_rscan with maximum practical size value."""
    max_size = 2**31 - 1  # 32-bit signed integer max
    codeflash_output = bit_rscan("test_bin", 0, max_size, False); result = codeflash_output # 1.56μs -> 1.41μs (10.8% faster)

def test_bit_rscan_large_offset_and_size():
    """Test bit_rscan with both large offset and size values."""
    large_offset = 100000
    large_size = 100000
    codeflash_output = bit_rscan("test_bin", large_offset, large_size, True); result = codeflash_output # 1.55μs -> 1.43μs (8.30% faster)

def test_bit_rscan_bulk_creation_performance():
    """Test creating many bit_rscan operations to verify performance."""
    operations = []
    for i in range(1000):
        codeflash_output = bit_rscan(f"bin_{i}", i, i + 8, i % 2 == 0); op = codeflash_output # 403μs -> 384μs (5.00% faster)
        operations.append(op)

def test_bit_rscan_different_parameters_large_sample():
    """Test bit_rscan with varied parameters across large sample."""
    results = []
    for offset in range(0, 100, 10):
        for size in range(1, 101, 10):
            for value in [True, False]:
                codeflash_output = bit_rscan("test_bin", offset, size, value); result = codeflash_output
                results.append(result)

def test_bit_rscan_consistency_across_calls():
    """Test that bit_rscan produces consistent results across multiple calls."""
    results = []
    for _ in range(100):
        codeflash_output = bit_rscan("test_bin", 42, 100, True); result = codeflash_output # 39.8μs -> 38.2μs (4.09% faster)
        results.append(result)
    
    # All results should be identical
    first = results[0]
    assert all(result == first for result in results)

def test_bit_rscan_long_bin_name():
    """Test bit_rscan with very long bin name."""
    long_bin_name = "bin_" + "x" * 1000
    codeflash_output = bit_rscan(long_bin_name, 0, 8, True); result = codeflash_output # 1.56μs -> 1.31μs (18.9% faster)

def test_bit_rscan_various_parameter_combinations():
    """Test bit_rscan with many different parameter combinations."""
    test_cases = [
        ("bin1", 0, 1, True),
        ("bin2", 0, 8, False),
        ("bin3", 10, 16, True),
        ("bin4", 100, 32, False),
        ("bin5", 1000, 64, True),
        ("bin6", 10000, 128, False),
        ("bin7", 100000, 256, True),
    ]
    
    results = [bit_rscan(bin_name, offset, size, value) 
               for bin_name, offset, size, value in test_cases]
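    for i, (bin_name, offset, size, value) in enumerate(test_cases):
        assert results[i]["bin"] == bin_name
        assert results[i]["bit_offset"] == offset
        assert results[i]["bit_size"] == size
        assert results[i]["value"] == value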

def test_bit_rscan_operation_independence():
    """Test that multiple bit_rscan operations are independent."""
    codeflash_output = bit_rscan("bin1", 0, 8, True); op1 = codeflash_output # 1.47μs -> 1.43μs (2.95% faster)
    codeflash_output = bit_rscan("bin2", 10, 16, False); op2 = codeflash_output # 613ns -> 582ns (5.33% faster)
    codeflash_output = bit_rscan("bin3", 20, 32, True); op3 = codeflash_output # 483ns -> 442ns (9.28% faster)

def test_bit_rscan_stress_test_value_alternation():
    """Stress test with alternating boolean values."""
    for i in range(500):
        value = i % 2 == 0
        codeflash_output = bit_rscan("test_bin", i, i + 1, value); result = codeflash_output # 190μs -> 179μs (5.91% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-bit_rscan-ml0kfod0 and push.

@codeflash-ai codeflash-ai bot requested a review from aseembits93 January 30, 2026 07:32
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash labels Jan 30, 2026