
Conversation

@JostMigenda
Collaborator

Previously, some code examples would run the setup step as part of the timed function, repeating it thousands of times. In particular, for the two examples in episodes/optimisation-data-structures-algorithms.md, the setup step took about 6 ms (an order of magnitude longer than the faster methods).
Running the setup just once shaves about 20 s off the runtime of these examples. It also makes the timing cleaner (because it is no longer necessary to subtract the setup time) and potentially more reliable.
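For context, the change has roughly this shape (a sketch; the dataset size, needle, and repeat count here are illustrative, not the exact episode code):

```python
import random
import timeit

# Before: the setup ran inside the timed function, so it was repeated on every call
def search_set_with_setup():
    data = set(random.sample(range(10_000_000), 25_000))  # ~6 ms of setup each time
    return 9_999_999 in data

# After: the setup runs exactly once; only the operation of interest is timed
data_set = set(random.sample(range(10_000_000), 25_000))

def search_set():
    return 9_999_999 in data_set

repeats = 1_000
t = timeit.timeit(search_set, number=repeats)
print(f"search_set: {t * 1000:.2f}ms total for {repeats} calls")
```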

There are two open questions before we merge this:

  • For the search example, I’m getting search_set: 0.6 ms, linear_search_list: 1500 ms and binary_search_list: 3.36 ms (see the sketch after this list for the shape of these functions). The latter two are about 1.5× faster than described in the text, which is reasonable; but the first one is an order of magnitude slower. Could you double-check, please? (I’m wondering whether subtracting the setup time might have affected the result there …)
  • The technical appendix on bytecode uses the manualSearch and operatorSearch examples, so it should probably be updated alongside these changes. However, I noticed that Python 3.13 produces a few different bytecodes, even for code that remained unchanged. I’m wondering how we should deal with this, especially w/r/t Python Version Upgrade Process/Documentation #75.
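For reference, the three search functions have roughly this shape (a sketch; N, the choice of needle, and the variable names are assumptions, not the exact episode code):

```python
import random
from bisect import bisect_left

N = 25_000
haystack_list = sorted(random.sample(range(10 * N), N))
haystack_set = set(haystack_list)
needle = haystack_list[-1]  # a value near the end: worst case for the linear search

def search_set():
    return needle in haystack_set          # O(1) on average: a hash lookup

def linear_search_list():
    return needle in haystack_list         # O(n): scans the list element by element

def binary_search_list():
    i = bisect_left(haystack_list, needle) # O(log n): bisect on the sorted list
    return i < len(haystack_list) and haystack_list[i] == needle
```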

@JostMigenda requested a review from Robadob, July 7, 2025 11:07
@github-actions

github-actions bot commented Jul 7, 2025

Thank you!

Thank you for your pull request 😃

🤖 This automated message can help you check the rendered files in your submission for clarity. If you have any questions, please feel free to open an issue in {sandpaper}.

If you have files that automatically render output (e.g. R Markdown), then you should check for the following:

  • 🎯 correct output
  • 🖼️ correct figures
  • ❓ new warnings
  • ‼️ new errors

Rendered Changes

🔍 Inspect the changes: https://github.com/RSE-Sheffield/pando-python/compare/md-outputs..md-outputs-PR-89

The following changes were observed in the rendered markdown documents:

 md5sum.txt                                 |  4 +--
 optimisation-data-structures-algorithms.md | 50 +++++++++++-------------------
 optimisation-using-python.md               | 13 +++-----
 3 files changed, 24 insertions(+), 43 deletions(-)
What does this mean?

If you have source files that require output and figures to be generated (e.g. R Markdown), then it is important to make sure the generated figures and output are reproducible.

This output provides a way for you to inspect the output in a diff-friendly manner so that it's easy to see the changes that occur due to new software versions or randomisation.

⏱️ Updated at 2025-07-09 14:32:50 +0000

github-actions bot pushed a commit that referenced this pull request Jul 7, 2025
Member

@Robadob left a comment


Good improvement; I guess they were written this way due to me not understanding/trusting Python’s scoping rules.

Might be worth removing the seeding of random; it feels redundant for the purposes of comparison if all the examples use the same dataset (and it makes the code a tiny bit smaller).
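i.e. something like this (illustrative; the actual seed value and dataset size in the episode may differ):

```python
import random

# random.seed(42)  # redundant now: every method is timed against the same pre-generated dataset
data = random.sample(range(10_000_000), 25_000)
```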

@Robadob
Member

Robadob commented Jul 9, 2025

Off-topic:

This warning (visible at the bottom of your PR’s diff) seems spurious; the anchor works on the live site. Possibly a bug due to the original link containing an apostrophe.

[screenshot: the spurious anchor warning shown in the PR diff]

Commit: “no longer necessary, now that input data is generated exactly once” (Co-authored-by: Robert Chisholm <Robadob@Robadob.org>)
@Robadob
Member

Robadob commented Jul 9, 2025

Ah, I missed your questions; will double-check now.

github-actions bot pushed a commit that referenced this pull request Jul 9, 2025
@JostMigenda
Collaborator Author

Thanks for the review; I’ve cleaned up the random seeds as suggested.

Before we merge this, do you have any thoughts on the two open questions in the original post?

@Robadob
Member

Robadob commented Jul 9, 2025

> For the search example, I’m getting search_set: 0.6 ms, linear_search_list: 1500 ms and binary_search_list: 3.36 ms. The latter two are about 1.5× faster than described in the text, which is reasonable; but the first one is an order of magnitude slower. Could you double-check, please? (I’m wondering whether subtracting the setup time might have affected the result there …)

I know the speedups can differ quite a bit between my laptop and desktop (and sometimes over repeated runs). I've never bothered to investigate why (e.g. whether it's purely hardware differences, or whether the Python version has a significant impact), as the hierarchy always remains the same. I'd be most concerned that all times for a particular example are self-consistent (i.e. from the same hardware), rather than trying to standardise across the site.

Python 3.10.12/Windows on my office desktop:

search_set: 1.13ms
# Still waiting for the linear search to complete, but it's being very slow; I'll catch it later

So it's even slower on this particular machine (no idea which machine I first ran it on).

(This took longer to run than I recall; I'd probably reduce the repeats to make it practical for people to mess around with :|)

> The technical appendix on bytecode uses the manualSearch and operatorSearch examples, so it should probably be updated alongside these changes. However, I noticed that Python 3.13 produces a few different bytecodes, even for code that remained unchanged. I’m wondering how we should deal with this, especially w/r/t #75.

It might just be worth adding a note in italics: "Different versions of Python may produce slightly different bytecode."

Feels worth standardising those code examples to match, though. One could argue against it, as the difference maybe helps make the point that called functions are not disassembled.
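Something along these lines could regenerate the appendix output (a sketch; the episode's actual manualSearch/operatorSearch definitions may differ):

```python
import dis

def manual_search(needle, haystack):
    # hand-written equivalent of `needle in haystack`
    for item in haystack:
        if item == needle:
            return True
    return False

def operator_search(needle, haystack):
    return needle in haystack

dis.dis(manual_search)    # output differs between e.g. Python 3.10 and 3.13
dis.dis(operator_search)  # the `in` operator compiles to a single CONTAINS_OP
```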

@Robadob
Member

Robadob commented Jul 9, 2025

search_set: 1.13ms
linear_search_list: 2735.54ms
binary_search_list: 7.69ms

Full results; everything is a bit slower.

@JostMigenda
Collaborator Author

Okay, that’s all about a factor of 2 slower than in my testing, which is within the typical amount of variation I’d expect. It looks like the order-of-magnitude difference for search_set is a clear aberration, then, so I should update that.

@Robadob
Member

Robadob commented Jul 9, 2025

Sounds good

@JostMigenda
Collaborator Author

I’m finally finding time to finish this now, after the usual summer and conference season. Since the main repo has since moved to the carpentries-incubator org, I’ll close the PR here and open a new one over there.
