do setup only once #89
Conversation
Thank you!

Thank you for your pull request 😃 🤖 This automated message can help you check the rendered files in your submission for clarity. If you have any questions, please feel free to open an issue in {sandpaper}. If you have files that automatically render output (e.g. R Markdown), then you should check for the following:

Rendered Changes

🔍 Inspect the changes: https://github.com/RSE-Sheffield/pando-python/compare/md-outputs..md-outputs-PR-89

The following changes were observed in the rendered markdown documents:

What does this mean? If you have source files that require output and figures to be generated (e.g. R Markdown), then it is important to make sure the generated figures and output are reproducible. This output provides a way for you to inspect the output in a diff-friendly manner so that it's easy to see the changes that occur due to new software versions or randomisation.

⏱️ Updated at 2025-07-09 14:32:50 +0000
Robadob left a comment
Good improvement, I guess they were written this way due to me not understanding/trusting Python's scoping rules.
Might be worth removing the seeding of random; it feels redundant for the purposes of comparison if they're all using the same dataset (and it makes the code a tiny bit smaller).
no longer necessary, now that input data is generated exactly once

Co-authored-by: Robert Chisholm <Robadob@Robadob.org>
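A minimal sketch of what this suggestion amounts to (the function names and sizes here are hypothetical, not the lesson's exact code): once the input data is generated a single time and shared by every search method, seeding the random module in the setup adds nothing to the comparison and can be dropped.

```python
import random

# Before: seeded so that repeated setup calls would produce identical data.
def generate_data_seeded(n=25_000):
    random.seed(12)  # redundant once the setup runs only once
    return [random.random() for _ in range(n)]

# After: no seeding needed, because the single shared dataset already
# guarantees every method searches the same values.
def generate_data(n=25_000):
    return [random.random() for _ in range(n)]

data = generate_data()  # built once, reused by all search implementations
```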
Ah, I missed your questions. Will double-check now.
Thanks for the review; I’ve cleaned up the random seeds as suggested. Before we merge this, do you have any thoughts on the two open questions in the original post?
I know the speedups can differ quite a bit between my laptop and desktop (and sometimes over repeated runs). I've never bothered to investigate why (e.g. whether it's all hardware differences, or whether the Python version has a significant impact), as the hierarchy always remains the same. I'd be most concerned that all times for a particular example are self-consistent (e.g. same hardware), rather than trying to standardise across the site.

Python 3.10.12/Windows on my office desktop:

So it's even slower on my run on this particular machine (no idea where I first ran it). (This took longer to run than I recall; I'd have probably reduced repeats to make it practical for people to mess around with :|)
It might just be worth adding a note in italics: "Different versions of Python may produce slightly different bytecode." Feels worth standardising those code examples to match, though. Could argue against that, as it maybe helps make the point that called functions aren't disassembled.
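For context, a minimal sketch of the kind of disassembly being discussed; `operator_search` here is a stand-in, not the lesson's exact `manualSearch`/`operatorSearch` code.

```python
import dis

def operator_search(needle, haystack):
    # Uses the `in` operator; shown only to illustrate inspecting bytecode.
    return needle in haystack

# The exact opcodes printed depend on the interpreter: e.g. Python 3.10 and
# 3.13 emit different instructions even for unchanged source, which is why a
# note like "Different versions of Python may produce slightly different
# bytecode" is useful next to any disassembly listing.
dis.dis(operator_search)
```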
Full results, everything a bit slower
Okay, that’s all about a factor of 2 longer than in my testing, which is within the typical amount of variation I’d expect. It looks like the order of magnitude difference for …
Sounds good
I’m finally finding time to finish this now, after the usual summer and conference season. As the main repo has since moved to the carpentries-incubator org, I’ll close the PR here and open a new one over there.

Previously, some code examples would run the setup step as part of the timed function, repeating it thousands of times. In particular, for the two examples in `episodes/optimisation-data-structures-algorithms.md` the setup step took about 6 ms (an order of magnitude longer than the faster methods). Running the setup just once shaves about 20 s off the runtime of these examples. It also makes the timing cleaner (because it is no longer necessary to subtract the setup time) and potentially more reliable.
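A minimal sketch of the pattern described above (the function names, data sizes, and repeat counts are illustrative, not the episode's exact code): moving the setup out of the timed statement means its cost is paid once rather than on every repeat, so nothing has to be subtracted afterwards.

```python
import random
import timeit

def generate_data(n=25_000):
    """Setup step: build the dataset that the search will use."""
    return [random.random() for _ in range(n)]

def search_list(needle, haystack):
    return needle in haystack

# Before: setup re-runs on every timed call, so its cost is included
# thousands of times and would have to be subtracted from the result.
inside = timeit.timeit("search_list(0.5, generate_data())",
                       globals=globals(), number=1_000)

# After: setup runs exactly once, outside the timed statement.
data = generate_data()
outside = timeit.timeit("search_list(0.5, data)",
                        globals=globals(), number=1_000)

print(f"setup inside the timing:  {inside:.3f} s")
print(f"setup outside the timing: {outside:.3f} s")
```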
There are two open questions before we merge this:
1. On my machine, the timings are now `search_set`: 0.6 ms, `linear_search_list`: 1500 ms, and `binary_search_list`: 3.36 ms. The latter two are about 1.5× faster than described in the text, which is reasonable; but the first one is an order of magnitude slower. Could you double-check, please? (I’m wondering if subtracting the setup time might have affected the result there …)
2. The disassembly in the episode covers the `manualSearch` and `operatorSearch` examples, so it should probably be updated alongside these changes. However, I noticed that Python 3.13 produces a few different bytecodes, even for the code that remained unchanged. I’m wondering how we should deal with this; especially w/r/t Python Version Upgrade Process/Documentation #75.
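For reference, a minimal sketch of the three approaches those timings refer to (the names follow the episode's, but the bodies are illustrative rather than its exact code): set membership is an average O(1) hash lookup, the `in` operator on a list is an O(n) scan, and binary search on a sorted list is O(log n), which is why the three sit orders of magnitude apart.

```python
import bisect

def search_set(needle, haystack_set):
    # Hash-based membership test: O(1) on average.
    return needle in haystack_set

def linear_search_list(needle, haystack_list):
    # `in` on a list scans element by element: O(n).
    return needle in haystack_list

def binary_search_list(needle, sorted_haystack):
    # Binary search on a pre-sorted list: O(log n).
    i = bisect.bisect_left(sorted_haystack, needle)
    return i < len(sorted_haystack) and sorted_haystack[i] == needle
```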