FIX: aggregating profiling data from duplicate code objects #349

TTsangSC · 2025-06-08T02:09:38Z

Closes #348.

Code changes

line_profiler/_line_profiler.pyx::LineProfiler.get_stats():
Refactored to aggregate profiling data for duplicate (but distinct) code objects, which have separate entries in .c_code_map and .code_hash_map but is hashed to the same key by label() in the output LineStats object

Test changes

tests/test_line_profiler.py::test_duplicate_code_objects():
New test that the bug described in the issue is fixed

codecov · 2025-06-08T02:25:16Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.36%. Comparing base (97ecd5c) to head (9bcc961).
Report is 19 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #349   +/-   ##
=======================================
  Coverage   87.36%   87.36%           
=======================================
  Files          18       18           
  Lines        1630     1630           
  Branches      347      347           
=======================================
  Hits         1424     1424           
  Misses        151      151           
  Partials       55       55

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7ffc0f1...9bcc961. Read the comment docs.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

TTsangSC · 2025-06-08T02:30:32Z

Hi @Theelx, when working on this I noticed that there were comments that you (or your collaborator @ichichkine ?) left in get_stats() back in #178:

# NOTE: v4.x may produce more than one entry per line. For example:
#   1:  for x in range(10):
#   2:      pass
#  will produce a 1-hit entry on line 1, and 10-hit entries on lines 1 and 2
#  This doesn't affect `print_stats`, because it uses the last entry for a given line (line number is
#  used a dict key so earlier entries are overwritten), but to keep compatibility with other tools,
#  let's only keep the last entry for each line

However, by the time of #214 they seem to have gone obsolete after switching to data aggregation with the dicts nhits_by_lineno and total_time_by_lineno. Notably, this is probably why we always end up with n + 1 hits on the for ...: line (#215).

The latter behavior is preserved by this PR so I've gone ahead and removed the earlier comment, but I wonder if you or @Erotemic have anything to add regarding which behavior is the preferred one...

Erotemic · 2025-06-08T03:22:22Z

The question is if we should count for _ in range(N) as N or N + 1 hits where the +1 is basically representing you have a branch entrypoint? I think there is an argument for either.

Consider:

kernprof -lvr  -c "if 1:
    from time import sleep
    def bar(): 
        1 + 2
    def foo():
        sleep(1); sleep(2)
        1 + 2 + 3; 4 + 5 + 6; bar(); bar(); bar()
    foo()
"

Which gives:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                               def bar(): 
     3         3          0.5      0.2    100.0          1 + 2  

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     5                                               def foo():                                   
     6         1    3000126.1    3e+06    100.0          sleep(1); sleep(2)                       
     7         1         27.2     27.2      0.0          1 + 2 + 3; 4 + 5 + 6; bar(); bar(); bar()

We have 3 hits for the lines in bar, because its called 3 times, but only 1 hit on each line in foo, even though are a multiple expressions on the same line. However, it looks like the time for that line is correctly summed across all expressions on that line.

The notion of a "hit" is a bit ill-defined, but in most cases it will correspond to a single expression on a single line.

I think it makes sense to define a "hit" as the number of times you "enter" a line and then "leave" a line, regardless of how many expressions are on it, and for a loop there will always be an extra time that you enter/exit it because the condition fails and you break the loop. E.g. in the following code the range(0) will still get one hit.

kernprof -lvr  -c "if 1:
    def foo(): 
       for i in range(10):
          ...
       for i in range(0):
          ...    
    foo()
"

to get:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                               def foo():            
     3        11          1.7      0.2     45.6         for i in range(10):
     4        10          1.5      0.2     40.6            ...             
     5         1          0.5      0.5     13.8         for i in range(0): 
     6                                                     ...

Another interesting example is:

kernprof -lvr  -c "if 1:
    def foo(): 
       for i in range(1):
          if i == 0:
              break
       for i in range(1):
           ...
    foo()
"

to get

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                               def foo():           
     3         1          0.8      0.8     30.4         for i in range(1):
     4         1          0.3      0.3     10.6            if i == 0:     
     5         1          0.6      0.6     20.9                break      
     6         2          0.8      0.4     29.1         for i in range(1):
     7         1          0.2      0.2      8.9             ...

where the second loop gets 2 hits, but the first only gets 1, because the first never comes back to the loop to realize the iterable is empty as the body does the job of breaking out of the loop. So I think the +1 behavior is what we want.

Erotemic · 2025-07-08T01:16:38Z

Is it the case that this PR needs merging? Or is the existing behavior effectively correct?

TTsangSC · 2025-07-08T02:57:07Z

The specific behavior (# of line hits in a loop) that we discussed above is correct, but that was more like something that I came across when writing this and not what the PR is about.

The PR itself is more about some data being collected (in .co_code_map) but not showing up in .get_stats(), because they are hashed to the same LineStats.timings key in one way or another. One example is in the linked issue #348, and another that I can think of off the top of my head is for function decorators:

Code example (click to show)

from functools import wraps


def decorator(func):
    @wraps(func)
    def wrapper(*a, **k):
        print('Calling {0.__module__}.{0.__qualname__}({1})'
              .format(func,
                      ', '.join([repr(aa) for aa in a]
                      + [f'{kk}={v!r}' for kk, v in k.items()])))
        try:
            return func(*a, **k)
        finally:
            print(f'Call ended for {func.__module__}.{func.__qualname__}()')

    return wrapper


@profile
@decorator
def sum_of_ints(n):
    x = 0
    for n in range(1, n + 1):
        x += n
    return x


@profile
@decorator
def sum_of_squares(n):
    x = 0
    for n in range(1, n + 1):
        x += n * n
    return x


assert sum_of_ints(7) == 28
assert sum_of_squares(8) == 204
assert sum_of_ints(100) == 5050

print('\n.c_code_map:')
line_12_data = {h: {l: item for l, item in data.items() if l == 12}
        for h, data in profile.c_code_map.items()}
line_12_data = {key: value for key, value in line_12_data.items() if value}
print(line_12_data)

print('\n.get_stats():')
line_12_stats = {key: [(l, *others) for l, *others in values if l == 12]
                 for key, values in profile.get_stats().timings.items()}
line_12_stats = {key: value for key, value in line_12_stats.items() if value}
print(line_12_stats)

Output (click to show)

(py3.13)  $ kernprof -lv test.py
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/line_profiler.py:245: UserWarning: Adding a function with a __wrapped__ attribute. You may want to profile
 the wrapped function by adding sum_of_ints.__wrapped__ instead.
  self.add_function(impl)
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/line_profiler.py:245: UserWarning: Adding a function with a __wrapped__ attribute. You may want to profile
 the wrapped function by adding sum_of_squares.__wrapped__ instead.
  self.add_function(impl)
Calling __main__.sum_of_ints(7)
Call ended for __main__.sum_of_ints()
Calling __main__.sum_of_squares(8)
Call ended for __main__.sum_of_squares()
Calling __main__.sum_of_ints(100)
Call ended for __main__.sum_of_ints()

.c_code_map:
{-7714615230386940376: {12: {'code': -7714615230386940376, 'lineno': 12, 'total_time': 3000, 'nhits': 1}}, -2261749638165304748: {12: {'code': -2261749638165304748, 'lineno': 12,
 'total_time': 23000, 'nhits': 2}}}

.get_stats():
{('test.py', 5, 'decorator.<locals>.wrapper'): [(12, 1, 3000)]}
Wrote profile results to 'test.py.lprof'
Timer unit: 1e-06 s

Total time: 8e-06 s
File: test.py
Function: decorator.<locals>.wrapper at line 5

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     5                                               @wraps(func)
     6                                               def wrapper(*a, **k):
     7         2          2.0      1.0     25.0          print('Calling {0.__module__}.{0.__qualname__}({1})'
     8         2          1.0      0.5     12.5                .format(func,
     9         2          0.0      0.0      0.0                        ', '.join([repr(aa) for aa in a]
    10         1          1.0      1.0     12.5                        + [f'{kk}={v!r}' for kk, v in k.items()])))
    11         1          0.0      0.0      0.0          try:
    12         1          3.0      3.0     37.5              return func(*a, **k)
    13                                                   finally:
    14         1          1.0      1.0     12.5              print(f'Call ended for {func.__module__}.{func.__qualname__}()')

The decorated functions are run thrice in total (as shown in the .c_code_map output), but only 1 or 2 calls are reported in the profiling results (as shown in the .get_stats() output and the profiling output) because both wrappers are hashed to the same key.

This wrong behavior (fixed by the PR) is particularly detrimental to pre-imports, because we don't currently perform function unwrapping (maybe we should add a switch for that in #335 (or later, as you've prudently suggested in a comment)). I'd say that this and #351 are even higher priority than #335 because they have to do with correctness.

TTsangSC · 2025-07-08T02:59:34Z

Will try to rebase this and #351 on main and resolve the conflicts, and get back to you in a hour.

line_profiler/_line_profiler.pyx::LineProfiler.get_stats() Updated/simplified implementation to aggregate profiling data between code objects with duplicate hashes

This was referenced Jun 9, 2025

BUG: dupe-code resolution on multiple profilers results in loss of profiling data #350

Closed

FIX: Update hash tables of affected profiler instances when rewriting code objects #351

Merged

TTsangSC mentioned this pull request Jun 19, 2025

FIX: restore Cython compatibility + pivot to sys.monitoring #352

Merged

TTsangSC added 3 commits July 8, 2025 05:02

Added failing test for issue pyutils#348

1e9491e

Refactored .get_stats() to fix the bug

e5afa63

line_profiler/_line_profiler.pyx::LineProfiler.get_stats() Updated/simplified implementation to aggregate profiling data between code objects with duplicate hashes

CHANGELOG entry

9bcc961

TTsangSC force-pushed the dup-code-fix branch from b4d5d5c to 9bcc961 Compare July 8, 2025 03:07

TTsangSC mentioned this pull request Jul 8, 2025

ENH: read TOML files for configurations #335

Merged

Erotemic merged commit 9bcc961 into pyutils:main Jul 9, 2025
36 checks passed

TTsangSC deleted the dup-code-fix branch July 9, 2025 14:10

TTsangSC mentioned this pull request Jul 9, 2025

Restore trace callback when the profiler is disabled #334

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

FIX: aggregating profiling data from duplicate code objects #349

FIX: aggregating profiling data from duplicate code objects #349

Uh oh!

TTsangSC commented Jun 8, 2025

Uh oh!

codecov bot commented Jun 8, 2025 •

edited

Loading

Uh oh!

TTsangSC commented Jun 8, 2025 •

edited

Loading

Uh oh!

Erotemic commented Jun 8, 2025 •

edited

Loading

Uh oh!

Erotemic commented Jul 8, 2025

Uh oh!

TTsangSC commented Jul 8, 2025

Uh oh!

TTsangSC commented Jul 8, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

FIX: aggregating profiling data from duplicate code objects #349

FIX: aggregating profiling data from duplicate code objects #349

Uh oh!

Conversation

TTsangSC commented Jun 8, 2025

Code changes

Test changes

Uh oh!

codecov bot commented Jun 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

TTsangSC commented Jun 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Erotemic commented Jun 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Erotemic commented Jul 8, 2025

Uh oh!

TTsangSC commented Jul 8, 2025

Uh oh!

TTsangSC commented Jul 8, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov bot commented Jun 8, 2025 •

edited

Loading

TTsangSC commented Jun 8, 2025 •

edited

Loading

Erotemic commented Jun 8, 2025 •

edited

Loading