Skip to content

Conversation

@TTsangSC
Copy link
Collaborator

@TTsangSC TTsangSC commented Jun 8, 2025

Closes #348.

Code changes

  • line_profiler/_line_profiler.pyx::LineProfiler.get_stats():
    Refactored to aggregate profiling data for duplicate (but distinct) code objects, which have separate entries in .c_code_map and .code_hash_map but is hashed to the same key by label() in the output LineStats object

Test changes

  • tests/test_line_profiler.py::test_duplicate_code_objects():
    New test that the bug described in the issue is fixed

@codecov
Copy link

codecov bot commented Jun 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.36%. Comparing base (97ecd5c) to head (9bcc961).
Report is 19 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #349   +/-   ##
=======================================
  Coverage   87.36%   87.36%           
=======================================
  Files          18       18           
  Lines        1630     1630           
  Branches      347      347           
=======================================
  Hits         1424     1424           
  Misses        151      151           
  Partials       55       55           

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7ffc0f1...9bcc961. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@TTsangSC
Copy link
Collaborator Author

TTsangSC commented Jun 8, 2025

Hi @Theelx, when working on this I noticed that there were comments that you (or your collaborator @ichichkine ?) left in get_stats() back in #178:

# NOTE: v4.x may produce more than one entry per line. For example:
#   1:  for x in range(10):
#   2:      pass
#  will produce a 1-hit entry on line 1, and 10-hit entries on lines 1 and 2
#  This doesn't affect `print_stats`, because it uses the last entry for a given line (line number is
#  used a dict key so earlier entries are overwritten), but to keep compatibility with other tools,
#  let's only keep the last entry for each line 

However, by the time of #214 they seem to have gone obsolete after switching to data aggregation with the dicts nhits_by_lineno and total_time_by_lineno. Notably, this is probably why we always end up with n + 1 hits on the for ...: line (#215).

The latter behavior is preserved by this PR so I've gone ahead and removed the earlier comment, but I wonder if you or @Erotemic have anything to add regarding which behavior is the preferred one...

@Erotemic
Copy link
Member

Erotemic commented Jun 8, 2025

The question is if we should count for _ in range(N) as N or N + 1 hits where the +1 is basically representing you have a branch entrypoint? I think there is an argument for either.

Consider:

kernprof -lvr  -c "if 1:
    from time import sleep
    def bar(): 
        1 + 2
    def foo():
        sleep(1); sleep(2)
        1 + 2 + 3; 4 + 5 + 6; bar(); bar(); bar()
    foo()
"

Which gives:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                               def bar(): 
     3         3          0.5      0.2    100.0          1 + 2  

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     5                                               def foo():                                   
     6         1    3000126.1    3e+06    100.0          sleep(1); sleep(2)                       
     7         1         27.2     27.2      0.0          1 + 2 + 3; 4 + 5 + 6; bar(); bar(); bar()

We have 3 hits for the lines in bar, because its called 3 times, but only 1 hit on each line in foo, even though are a multiple expressions on the same line. However, it looks like the time for that line is correctly summed across all expressions on that line.

The notion of a "hit" is a bit ill-defined, but in most cases it will correspond to a single expression on a single line.

I think it makes sense to define a "hit" as the number of times you "enter" a line and then "leave" a line, regardless of how many expressions are on it, and for a loop there will always be an extra time that you enter/exit it because the condition fails and you break the loop. E.g. in the following code the range(0) will still get one hit.

kernprof -lvr  -c "if 1:
    def foo(): 
       for i in range(10):
          ...
       for i in range(0):
          ...    
    foo()
"

to get:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                               def foo():            
     3        11          1.7      0.2     45.6         for i in range(10):
     4        10          1.5      0.2     40.6            ...             
     5         1          0.5      0.5     13.8         for i in range(0): 
     6                                                     ...             

Another interesting example is:

kernprof -lvr  -c "if 1:
    def foo(): 
       for i in range(1):
          if i == 0:
              break
       for i in range(1):
           ...
    foo()
"

to get

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                               def foo():           
     3         1          0.8      0.8     30.4         for i in range(1):
     4         1          0.3      0.3     10.6            if i == 0:     
     5         1          0.6      0.6     20.9                break      
     6         2          0.8      0.4     29.1         for i in range(1):
     7         1          0.2      0.2      8.9             ...           

where the second loop gets 2 hits, but the first only gets 1, because the first never comes back to the loop to realize the iterable is empty as the body does the job of breaking out of the loop. So I think the +1 behavior is what we want.

@Erotemic
Copy link
Member

Erotemic commented Jul 8, 2025

Is it the case that this PR needs merging? Or is the existing behavior effectively correct?

@TTsangSC
Copy link
Collaborator Author

TTsangSC commented Jul 8, 2025

The specific behavior (# of line hits in a loop) that we discussed above is correct, but that was more like something that I came across when writing this and not what the PR is about.

The PR itself is more about some data being collected (in .co_code_map) but not showing up in .get_stats(), because they are hashed to the same LineStats.timings key in one way or another. One example is in the linked issue #348, and another that I can think of off the top of my head is for function decorators:

Code example (click to show)
from functools import wraps


def decorator(func):
    @wraps(func)
    def wrapper(*a, **k):
        print('Calling {0.__module__}.{0.__qualname__}({1})'
              .format(func,
                      ', '.join([repr(aa) for aa in a]
                      + [f'{kk}={v!r}' for kk, v in k.items()])))
        try:
            return func(*a, **k)
        finally:
            print(f'Call ended for {func.__module__}.{func.__qualname__}()')

    return wrapper


@profile
@decorator
def sum_of_ints(n):
    x = 0
    for n in range(1, n + 1):
        x += n
    return x


@profile
@decorator
def sum_of_squares(n):
    x = 0
    for n in range(1, n + 1):
        x += n * n
    return x


assert sum_of_ints(7) == 28
assert sum_of_squares(8) == 204
assert sum_of_ints(100) == 5050

print('\n.c_code_map:')
line_12_data = {h: {l: item for l, item in data.items() if l == 12}
        for h, data in profile.c_code_map.items()}
line_12_data = {key: value for key, value in line_12_data.items() if value}
print(line_12_data)

print('\n.get_stats():')
line_12_stats = {key: [(l, *others) for l, *others in values if l == 12]
                 for key, values in profile.get_stats().timings.items()}
line_12_stats = {key: value for key, value in line_12_stats.items() if value}
print(line_12_stats)
Output (click to show)
(py3.13)  $ kernprof -lv test.py
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/line_profiler.py:245: UserWarning: Adding a function with a __wrapped__ attribute. You may want to profile
 the wrapped function by adding sum_of_ints.__wrapped__ instead.
  self.add_function(impl)
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/line_profiler.py:245: UserWarning: Adding a function with a __wrapped__ attribute. You may want to profile
 the wrapped function by adding sum_of_squares.__wrapped__ instead.
  self.add_function(impl)
Calling __main__.sum_of_ints(7)
Call ended for __main__.sum_of_ints()
Calling __main__.sum_of_squares(8)
Call ended for __main__.sum_of_squares()
Calling __main__.sum_of_ints(100)
Call ended for __main__.sum_of_ints()

.c_code_map:
{-7714615230386940376: {12: {'code': -7714615230386940376, 'lineno': 12, 'total_time': 3000, 'nhits': 1}}, -2261749638165304748: {12: {'code': -2261749638165304748, 'lineno': 12,
 'total_time': 23000, 'nhits': 2}}}

.get_stats():
{('test.py', 5, 'decorator.<locals>.wrapper'): [(12, 1, 3000)]}
Wrote profile results to 'test.py.lprof'
Timer unit: 1e-06 s

Total time: 8e-06 s
File: test.py
Function: decorator.<locals>.wrapper at line 5

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     5                                               @wraps(func)
     6                                               def wrapper(*a, **k):
     7         2          2.0      1.0     25.0          print('Calling {0.__module__}.{0.__qualname__}({1})'
     8         2          1.0      0.5     12.5                .format(func,
     9         2          0.0      0.0      0.0                        ', '.join([repr(aa) for aa in a]
    10         1          1.0      1.0     12.5                        + [f'{kk}={v!r}' for kk, v in k.items()])))
    11         1          0.0      0.0      0.0          try:
    12         1          3.0      3.0     37.5              return func(*a, **k)
    13                                                   finally:
    14         1          1.0      1.0     12.5              print(f'Call ended for {func.__module__}.{func.__qualname__}()')

The decorated functions are run thrice in total (as shown in the .c_code_map output), but only 1 or 2 calls are reported in the profiling results (as shown in the .get_stats() output and the profiling output) because both wrappers are hashed to the same key.

This wrong behavior (fixed by the PR) is particularly detrimental to pre-imports, because we don't currently perform function unwrapping (maybe we should add a switch for that in #335 (or later, as you've prudently suggested in a comment)). I'd say that this and #351 are even higher priority than #335 because they have to do with correctness.

@TTsangSC
Copy link
Collaborator Author

TTsangSC commented Jul 8, 2025

Will try to rebase this and #351 on main and resolve the conflicts, and get back to you in a hour.

TTsangSC added 3 commits July 8, 2025 05:02
line_profiler/_line_profiler.pyx::LineProfiler.get_stats()
    Updated/simplified implementation to aggregate profiling data
    between code objects with duplicate hashes
@Erotemic Erotemic merged commit 9bcc961 into pyutils:main Jul 9, 2025
36 checks passed
@TTsangSC TTsangSC deleted the dup-code-fix branch July 9, 2025 14:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Timing entries not combined between duplicate code objects

2 participants