⚡️ Speed up function read_key by 10% #71

Open
codeflash-ai[bot] wants to merge 1 commit into main from codeflash/optimize-read_key-mgltemzh
@codeflash-ai codeflash-ai bot commented Oct 11, 2025

📄 10% (0.10x) speedup for read_key in graphrag/config/environment_reader.py

⏱️ Runtime : 1.09 milliseconds → 992 microseconds (best of 117 runs)

📝 Explanation and details

The optimization replaces isinstance(value, str) with type(value) is str for the type check. The checked line becomes about 9.6% faster per hit, and the benchmark as a whole about 10% faster.

Key Change:

  • Changed if not isinstance(value, str): to if type(value) is str: and inverted the logic accordingly
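
As a rough sketch of the change described above: the function body is not shown in this PR description, so the Enum branch (`value.value.lower()`) is an assumption, and `read_key_before`, `read_key_after` and `Color` are hypothetical names used only for illustration.

```python
from enum import Enum


class Color(Enum):
    # Hypothetical enum, used only for this illustration.
    RED = "RED"


def read_key_before(value):
    # Original check: anything that is not a str is assumed to be an Enum.
    if not isinstance(value, str):
        return value.value.lower()  # assumed Enum branch
    return value.lower()


def read_key_after(value):
    # Optimized check: exact-type test, with the two branches swapped.
    if type(value) is str:
        return value.lower()
    return value.value.lower()  # assumed Enum branch


# Both shapes agree on plain strings and on plain Enum members.
for sample in ("HeLLo", "", Color.RED):
    assert read_key_before(sample) == read_key_after(sample)
```

Under these assumptions the two versions return the same result for every input type exercised by the generated tests.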

Why This is Faster:
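isinstance(value, str) has to accept str and any subclass of str, so it goes through a more general check, while type(value) is str is a single identity comparison against the exact type (is compares identity and never calls __eq__). For a plain str the saving is small, about 25 ns per hit in the line profile. The two checks are not equivalent: an instance of a str subclass passes isinstance() but fails type(value) is str, so such a value would now take the non-str branch.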
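
The difference between the two checks can be seen directly with a small script. This is only a sketch: `Tag` is a hypothetical str subclass introduced for the example, and the timings it prints will vary with the machine and Python version, so they are not expected to match the figures in this report.

```python
import timeit


class Tag(str):
    """Hypothetical str subclass, used only for this illustration."""


plain = "HELLO"
tagged = Tag("HELLO")

# Both checks agree on an exact str.
both_accept_plain = isinstance(plain, str) and type(plain) is str

# They disagree on a subclass instance: isinstance() accepts it,
# the identity comparison does not.
isinstance_accepts_tag = isinstance(tagged, str)
identity_accepts_tag = type(tagged) is str

# Rough timing of each check on a plain str.
n = 200_000
t_isinstance = timeit.timeit("isinstance(v, str)", globals={"v": plain}, number=n)
t_identity = timeit.timeit("type(v) is str", globals={"v": plain}, number=n)
print(f"isinstance: {t_isinstance / n * 1e9:.1f} ns per check")
print(f"type is:    {t_identity / n * 1e9:.1f} ns per check")
```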

Performance Impact:
The line profiler shows the type check line improved from 261 ns per hit to 236 ns per hit (9.6% faster per check). With 4,054 function calls in the benchmark, that is roughly 0.1 ms saved in total, in line with the overall drop from 1.09 ms to 992 µs.
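
The arithmetic behind these figures can be reproduced as below. The inputs are the numbers quoted above; the variable names are illustrative. The line-profiler timings and the benchmark runtime are measured separately, so the agreement is only approximate.

```python
before_ns = 261   # per-hit time of the type-check line, original
after_ns = 236    # per-hit time of the type-check line, optimized
hits = 4054       # function calls in the benchmark

saved_per_hit_ns = before_ns - after_ns
time_reduction = saved_per_hit_ns / before_ns   # share of the old time removed
speedup = before_ns / after_ns - 1              # old/new - 1
total_saved_us = saved_per_hit_ns * hits / 1000

print(f"saved per hit:  {saved_per_hit_ns} ns")
print(f"time reduction: {time_reduction:.1%}")
print(f"speedup:        {speedup:.1%}")
print(f"total saved:    {total_saved_us:.0f} µs")
```

The quoted 9.6% is the time reduction (25 ns out of 261 ns). Expressed as old/new - 1, which is the convention the per-test figures in this report appear to follow (for example 607ns -> 556ns listed as 9.17%), the same change is about 10.6%. The total of roughly 101 µs is close to the 98 µs difference between 1.09 ms and 992 µs.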

Test Case Performance:
The optimization works particularly well for:

  • Simple string inputs (mostly 7-35% faster; read_key("He lLo") and read_key("HeLLo123") measured 3.55% faster and 1.52% slower)
  • Empty strings (26-28% faster)
  • Unicode strings (4-9% faster)
  • Large strings with repeated processing (10% faster)

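Since this function appears to be called frequently in configuration processing, the faster type check adds up to a measurable gain for most string inputs. A few individual measurements, such as read_key("HeLLo123") at 1.52% slower and two of the non-string error cases, are flat or slightly slower, which is likely measurement noise.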

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 2025 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 1 Passed |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from enum import Enum

# imports
import pytest  # used for our unit tests
from graphrag.config.environment_reader import read_key

# function to test
# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License

KeyValue = str | Enum

# unit tests

# --- Basic Test Cases ---

def test_read_key_with_basic_string():
    # Test with a simple lowercase string
    codeflash_output = read_key("hello") # 607ns -> 556ns (9.17% faster)
    # Test with a simple uppercase string
    codeflash_output = read_key("HELLO") # 299ns -> 221ns (35.3% faster)
    # Test with a mixed-case string
    codeflash_output = read_key("HeLLo") # 176ns -> 154ns (14.3% faster)
    # Test with a string containing spaces
    codeflash_output = read_key("He lLo") # 204ns -> 197ns (3.55% faster)
    # Test with a string containing digits
    codeflash_output = read_key("HeLLo123") # 194ns -> 197ns (1.52% slower)
    # Test with a string containing special characters
    codeflash_output = read_key("!@#Hello$%^") # 180ns -> 168ns (7.14% faster)



def test_read_key_with_empty_string():
    # Test with an empty string
    codeflash_output = read_key("") # 614ns -> 485ns (26.6% faster)


def test_read_key_with_string_with_only_whitespace():
    # Test with a string of only spaces
    codeflash_output = read_key("   ") # 619ns -> 462ns (34.0% faster)
    # Test with a string of tabs and newlines
    codeflash_output = read_key("\t\n") # 346ns -> 290ns (19.3% faster)






def test_read_key_with_non_enum_non_str_type():
    # Should raise AttributeError: an int is neither a str nor an Enum
    with pytest.raises(AttributeError):
        read_key(123) # 1.26μs -> 1.20μs (5.00% faster)
    # Should raise AttributeError: a list is neither a str nor an Enum
    with pytest.raises(AttributeError):
        read_key(["HELLO"]) # 773ns -> 806ns (4.09% slower)


def test_read_key_with_large_string():
    # Test with a large string (1000 characters, mixed case)
    s = "AbC" * 333 + "Z"
    expected = s.lower()
    codeflash_output = read_key(s) # 962ns -> 928ns (3.66% faster)



def test_read_key_with_long_unicode_string():
    # Test with a long unicode string
    s = "你好世界" * 250
    codeflash_output = read_key(s) # 3.17μs -> 3.03μs (4.42% faster)


#------------------------------------------------
from enum import Enum

# imports
import pytest  # used for our unit tests
from graphrag.config.environment_reader import read_key

# function to test
# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License

KeyValue = str | Enum

# unit tests

# --- Basic Test Cases ---

def test_read_key_with_simple_string():
    # Should lowercase a simple string
    codeflash_output = read_key("Hello") # 600ns -> 558ns (7.53% faster)

def test_read_key_with_lowercase_string():
    # Should return the same string if already lowercase
    codeflash_output = read_key("world") # 552ns -> 503ns (9.74% faster)

def test_read_key_with_mixed_case_string():
    # Should lowercase mixed case
    codeflash_output = read_key("PyThOn") # 619ns -> 506ns (22.3% faster)

def test_read_key_with_empty_string():
    # Should return empty string for empty input
    codeflash_output = read_key("") # 650ns -> 508ns (28.0% faster)




def test_read_key_with_unicode_string():
    # Should lowercase unicode characters correctly
    codeflash_output = read_key("Äpfel") # 905ns -> 827ns (9.43% faster)


def test_read_key_with_numeric_string():
    # Should return the same numeric string (no change)
    codeflash_output = read_key("12345") # 628ns -> 519ns (21.0% faster)

def test_read_key_with_special_characters():
    # Should lowercase only alphabetic, leave others unchanged
    codeflash_output = read_key("!@#ABC$%^") # 594ns -> 512ns (16.0% faster)




def test_read_key_with_long_string():
    # Should lowercase a long string
    long_str = "A" * 500 + "B" * 500
    expected = "a" * 500 + "b" * 500
    codeflash_output = read_key(long_str) # 1.30μs -> 1.18μs (10.1% faster)



def test_read_key_with_non_str_non_enum():
    # Should raise AttributeError if input is not str or Enum
    with pytest.raises(AttributeError):
        read_key(12345) # 1.28μs -> 1.29μs (0.465% slower)


def test_read_key_with_large_list_of_strings():
    # Should correctly lowercase many strings in a loop
    data = [f"Test{i}ABC" for i in range(1000)]
    expected = [f"test{i}abc" for i in range(1000)]
    for s, exp in zip(data, expected):
        codeflash_output = read_key(s) # 164μs -> 148μs (10.5% faster)


def test_read_key_performance_large_string():
    # Should handle a very large string efficiently
    large_str = "AbC" * 1000  # 3000 characters
    expected = "abc" * 1000
    codeflash_output = read_key(large_str) # 1.98μs -> 1.96μs (1.02% faster)


#------------------------------------------------
from graphrag.config.environment_reader import read_key

def test_read_key():
    read_key('')
🔎 Concolic Coverage Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_3eu3lmds/tmpiphwep05/test_concolic_coverage.py::test_read_key | 781ns | 652ns | 19.8% ✅ |

To edit these changes git checkout codeflash/optimize-read_key-mgltemzh and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 11, 2025 05:08
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 11, 2025