-
Notifications
You must be signed in to change notification settings - Fork 133
FIX: aggregating profiling data from duplicate code objects #349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #349 +/- ##
=======================================
Coverage 87.36% 87.36%
=======================================
Files 18 18
Lines 1630 1630
Branches 347 347
=======================================
Hits 1424 1424
Misses 151 151
Partials 55 55 Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
|
Hi @Theelx, when working on this I noticed that there were comments that you (or your collaborator @ichichkine ?) left in However, by the time of #214 they seem to have gone obsolete after switching to data aggregation with the dicts The latter behavior is preserved by this PR so I've gone ahead and removed the earlier comment, but I wonder if you or @Erotemic have anything to add regarding which behavior is the preferred one... |
|
The question is if we should count Consider: kernprof -lvr -c "if 1:
from time import sleep
def bar():
1 + 2
def foo():
sleep(1); sleep(2)
1 + 2 + 3; 4 + 5 + 6; bar(); bar(); bar()
foo()
"Which gives: We have 3 hits for the lines in bar, because its called 3 times, but only 1 hit on each line in foo, even though are a multiple expressions on the same line. However, it looks like the time for that line is correctly summed across all expressions on that line. The notion of a "hit" is a bit ill-defined, but in most cases it will correspond to a single expression on a single line. I think it makes sense to define a "hit" as the number of times you "enter" a line and then "leave" a line, regardless of how many expressions are on it, and for a loop there will always be an extra time that you enter/exit it because the condition fails and you break the loop. E.g. in the following code the range(0) will still get one hit. kernprof -lvr -c "if 1:
def foo():
for i in range(10):
...
for i in range(0):
...
foo()
"to get: Another interesting example is: kernprof -lvr -c "if 1:
def foo():
for i in range(1):
if i == 0:
break
for i in range(1):
...
foo()
"to get where the second loop gets 2 hits, but the first only gets 1, because the first never comes back to the loop to realize the iterable is empty as the body does the job of breaking out of the loop. So I think the +1 behavior is what we want. |
|
Is it the case that this PR needs merging? Or is the existing behavior effectively correct? |
|
The specific behavior (# of line hits in a loop) that we discussed above is correct, but that was more like something that I came across when writing this and not what the PR is about. The PR itself is more about some data being collected (in Code example (click to show)from functools import wraps
def decorator(func):
@wraps(func)
def wrapper(*a, **k):
print('Calling {0.__module__}.{0.__qualname__}({1})'
.format(func,
', '.join([repr(aa) for aa in a]
+ [f'{kk}={v!r}' for kk, v in k.items()])))
try:
return func(*a, **k)
finally:
print(f'Call ended for {func.__module__}.{func.__qualname__}()')
return wrapper
@profile
@decorator
def sum_of_ints(n):
x = 0
for n in range(1, n + 1):
x += n
return x
@profile
@decorator
def sum_of_squares(n):
x = 0
for n in range(1, n + 1):
x += n * n
return x
assert sum_of_ints(7) == 28
assert sum_of_squares(8) == 204
assert sum_of_ints(100) == 5050
print('\n.c_code_map:')
line_12_data = {h: {l: item for l, item in data.items() if l == 12}
for h, data in profile.c_code_map.items()}
line_12_data = {key: value for key, value in line_12_data.items() if value}
print(line_12_data)
print('\n.get_stats():')
line_12_stats = {key: [(l, *others) for l, *others in values if l == 12]
for key, values in profile.get_stats().timings.items()}
line_12_stats = {key: value for key, value in line_12_stats.items() if value}
print(line_12_stats)Output (click to show)(py3.13) $ kernprof -lv test.py
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/line_profiler.py:245: UserWarning: Adding a function with a __wrapped__ attribute. You may want to profile
the wrapped function by adding sum_of_ints.__wrapped__ instead.
self.add_function(impl)
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/line_profiler.py:245: UserWarning: Adding a function with a __wrapped__ attribute. You may want to profile
the wrapped function by adding sum_of_squares.__wrapped__ instead.
self.add_function(impl)
Calling __main__.sum_of_ints(7)
Call ended for __main__.sum_of_ints()
Calling __main__.sum_of_squares(8)
Call ended for __main__.sum_of_squares()
Calling __main__.sum_of_ints(100)
Call ended for __main__.sum_of_ints()
.c_code_map:
{-7714615230386940376: {12: {'code': -7714615230386940376, 'lineno': 12, 'total_time': 3000, 'nhits': 1}}, -2261749638165304748: {12: {'code': -2261749638165304748, 'lineno': 12,
'total_time': 23000, 'nhits': 2}}}
.get_stats():
{('test.py', 5, 'decorator.<locals>.wrapper'): [(12, 1, 3000)]}
Wrote profile results to 'test.py.lprof'
Timer unit: 1e-06 s
Total time: 8e-06 s
File: test.py
Function: decorator.<locals>.wrapper at line 5
Line # Hits Time Per Hit % Time Line Contents
==============================================================
5 @wraps(func)
6 def wrapper(*a, **k):
7 2 2.0 1.0 25.0 print('Calling {0.__module__}.{0.__qualname__}({1})'
8 2 1.0 0.5 12.5 .format(func,
9 2 0.0 0.0 0.0 ', '.join([repr(aa) for aa in a]
10 1 1.0 1.0 12.5 + [f'{kk}={v!r}' for kk, v in k.items()])))
11 1 0.0 0.0 0.0 try:
12 1 3.0 3.0 37.5 return func(*a, **k)
13 finally:
14 1 1.0 1.0 12.5 print(f'Call ended for {func.__module__}.{func.__qualname__}()')
The decorated functions are run thrice in total (as shown in the This wrong behavior (fixed by the PR) is particularly detrimental to pre-imports, because we don't currently perform function unwrapping (maybe we should add a switch for that in #335 (or later, as you've prudently suggested in a comment)). I'd say that this and #351 are even higher priority than #335 because they have to do with correctness. |
|
Will try to rebase this and #351 on |
line_profiler/_line_profiler.pyx::LineProfiler.get_stats()
Updated/simplified implementation to aggregate profiling data
between code objects with duplicate hashes
Closes #348.
Code changes
line_profiler/_line_profiler.pyx::LineProfiler.get_stats():Refactored to aggregate profiling data for duplicate (but distinct) code objects, which have separate entries in
.c_code_mapand.code_hash_mapbut is hashed to the same key bylabel()in the outputLineStatsobjectTest changes
tests/test_line_profiler.py::test_duplicate_code_objects():New test that the bug described in the issue is fixed