diff --git a/config.yaml b/config.yaml index bdf9c034e..5764fa3c6 100644 --- a/config.yaml +++ b/config.yaml @@ -82,7 +82,8 @@ episodes: - 22-scaling-up-unit-testing.md - 23-continuous-integration-automated-testing.md - 24-diagnosing-issues-improving-robustness.md -- 25-section2-optional-exercises.md +- 25-type-annotation.md +- 26-section2-optional-exercises.md - 30-section3-intro.md - 31-software-requirements.md - 32-software-architecture-design.md diff --git a/episodes/12-virtual-environments.md b/episodes/12-virtual-environments.md index 5e9e7354f..d641a6045 100644 --- a/episodes/12-virtual-environments.md +++ b/episodes/12-virtual-environments.md @@ -135,15 +135,16 @@ There are several commonly used command line tools for managing Python virtual e - `conda`, package and environment management system (also included as part of the Anaconda Python distribution often used by the scientific community) - `poetry`, a modern Python packaging tool which handles virtual environments automatically +- `uv`, a recent and fast package, environment, and project manager. While there are pros and cons for using each of the above, all will do the job of managing Python virtual environments for you and it may be a matter of personal preference which one you go for. -In this course, we will use `venv` to create and manage our virtual environment -(which is the preferred way for Python 3.3+). -The upside is that `venv` virtual environments created from the command line are -also recognised and picked up automatically by the IDEs we will use in this course, -as we will see in the next episode. +In this course, we will use `venv` to create and manage our virtual environment, which +is the tool that is part of the standard Python installation and the recommended manager +for Python 3.5+. If you want to experiment with a fast alternative that can work +as a drop-in replacement for `venv`, have a look at `uv` (see also the +[callout below](#uv-callout)). 
### Managing External Packages @@ -170,6 +171,22 @@ from remote package repositories and install them on your system, and So, you can use `conda` for both tasks instead of using `venv` and `pip`. +:::::::::::::::::::::::::::::::::::::::::::::::::: + + + +::::::::::::::::::::::::::::::::::::::::: callout + +## A recent addition: `uv` + +[`uv`](https://docs.astral.sh/uv/) is becoming increasingly popular in the world of +Python package and project managers. Similarly to `conda`, it combines the functionality +of a package manager (finding and installing Python packages) and a virtual environment +manager (isolating project dependencies), but it can also be used to create and manage +Python projects, update dependencies, publish packages, and much more. Its sub-commands +`uv venv` and `uv pip` can be used as drop-in replacements for `venv` and `pip`, +respectively, with the additional advantage of being much faster! + :::::::::::::::::::::::::::::::::::::::::::::::::: ### Many Tools for the Job @@ -209,7 +226,7 @@ You can test your Python installation from the command line with: ```bash $ python3 --version # on Mac/Linux -$ python --version # on Windows — Windows installation comes with a python.exe file rather than a python3.exe file +$ python --version # on Windows — Windows installation comes with a python.exe file rather than a python3.exe file ``` If you are using Windows and invoking `python` command causes your Git Bash terminal to hang with no error message or output, you may @@ -312,7 +329,7 @@ Here are some references for each of the naming conventions: - [The Python Documentation](https://docs.python.org/3/library/venv.html) indicates that ".venv" is common - ["venv" vs ".venv" discussion](https://discuss.python.org/t/trying-to-come-up-with-a-default-directory-name-for-virtual-environments/3750) - + :::::::::::::::::::::::::::::::::::::::::::::::::: @@ -449,7 +466,7 @@ Version: 1.26.2 Summary: Fundamental package for array computing in Python Home-page: 
https://numpy.org Author: Travis E. Oliphant et al. -Author-email: +Author-email: License: Copyright (c) 2005-2023, NumPy Developers. All rights reserved. ... @@ -639,7 +656,7 @@ In the above command, we tell the command line two things: As we can see, the Python interpreter ran our script, which threw an error - `inflammation-analysis.py: error: the following arguments are required: infiles`. It looks like the script expects a list of input files to process, -so this is expected behaviour since we do not supply any. +so this is expected behaviour since we do not supply any. We should run our code as follows, passing one (or more) data file(s) as input: diff --git a/episodes/13-ides.md b/episodes/13-ides.md index 80403bb8c..ff7d7fd1c 100644 --- a/episodes/13-ides.md +++ b/episodes/13-ides.md @@ -71,22 +71,22 @@ and some can also interact with a version control system. Compared to an IDE, a good dedicated code editor is usually smaller and quicker, but often less feature-rich. -You will have to decide which one is the best for you in your daily work. In this course you have -the choice of using two free and open source IDEs - +You will have to decide which one is the best for you in your daily work. In this course you have +the choice of using two free and open source IDEs - [PyCharm Community Edition from JetBrains](https://www.jetbrains.com/pycharm/) or [Microsoft's Visual Studio Code (VS Code)](https://code.visualstudio.com/). -A popular alternative to consider is free and open source [Spyder IDE](https://www.spyder-ide.org/) - +A popular alternative to consider is free and open source [Spyder IDE](https://www.spyder-ide.org/) - we are not covering it here but it should be possible to switch. ### Starting With a Software Project -::::::::::::::::::::::::::::::::: group-tab +::::::::::::::::::::::::::::::::: group-tab ### PyCharm When you start PyCharm - you may be presented with a dialog box that asks you what you want to do, e.g. 
`Create New Project`, `Open`, or `Check out from Version Control`. If that is the case - select `Open` and find the software project directory -`python-intermediate-inflammation` you cloned earlier. Alternatively, do the same from the +`python-intermediate-inflammation` you cloned earlier. Alternatively, do the same from the `File -> Open...` top menu. This directory is now the current working directory for PyCharm, @@ -116,14 +116,14 @@ before you can do any work. You may take the shortcut and click on one of the offered options above but we want to take you through the whole process of setting up your environment in PyCharm shortly -as this is important conceptually. If you do not see this warning - do not worry, it just means +as this is important conceptually. If you do not see this warning - do not worry, it just means you may have configured this already on previous usages of PyCharm. ### VS Code We will take you through the various steps of using VC Code now - also check out the [VS Code Python Quick Start guide](https://code.visualstudio.com/docs/python/python-quick-start) and [Getting Started with Python in VS Code Tutorial](https://code.visualstudio.com/docs/python/python-tutorial). -When you start VS Code, you may be presented with a "Welcome" page giving you shortcuts to commonly +When you start VS Code, you may be presented with a "Welcome" page giving you shortcuts to commonly used actions - e.g. `Open File...' and 'Open Folder...` (on Windows and Linux; with similarly named `New File...` and `Open...` actions on Mac OS), `Clone Git repository...`, etc. ![](fig/vscode-welcome.png){alt='Welcome screen in VS Code' .image-with-shadow width="1000px" } @@ -134,42 +134,42 @@ If that is the case - select `Open Folder...` and find the software project dire ![](fig/vscode-open-project.png){alt='View of an opened project in VS Code' .image-with-shadow width="1000px" } -You’ll see some icons on the left side, which give you access to key views of VS Code. 
+You’ll see some icons on the left side, which give you access to key views of VS Code. Hovering your mouse over each one will show a tooltip that names that view: - Explorer - file navigator to view existing folders containing project files. - Search - search capability enabling you to search for things in your project (and replace them with other text). -- Source Control - this gives you access to source code control for your project, which includes Git version control functionality. +- Source Control - this gives you access to source code control for your project, which includes Git version control functionality. This feature means you can do things like clone Git repositories (for example, from GitHub), add and commit files to a repository, things like that. - Run and Debug - to run programs you write in a special way with a debugger, which allows you to check the state of your program as it is running, which is very useful and we’ll look into later. - Extensions - which we’ll look into right now, to install extensions to VSCode to extend its functionality in some way. - Testing - testing features for test discovery, test coverage, and running and debugging tests your code. - -VS Code is a lightweight, general-purpose code editor designed to support a wide range of programming -languages and development tasks. -Its core "light" functionality is extended through a rich marketplace of -extensions, allowing users to add language support, debugging tools, linters, formatters, and more. -With extensions, VS Code can seamlessly handle languages (like Python, JavaScript, C++, Java, R, etc.), -data formats (like JSON, YAML, CSV, etc.), and so on, making it a flexible choice for developers working across + +VS Code is a lightweight, general-purpose code editor designed to support a wide range of programming +languages and development tasks. 
+Its core "light" functionality is extended through a rich marketplace of +extensions, allowing users to add language support, debugging tools, linters, formatters, and more. +With extensions, VS Code can seamlessly handle languages (like Python, JavaScript, C++, Java, R, etc.), +data formats (like JSON, YAML, CSV, etc.), and so on, making it a flexible choice for developers working across multiple technologies. -This means that VS Code will not support Python our of the box - it needs to be extended for Python development +This means that VS Code will not support Python out of the box - it needs to be extended for Python development by installing extensions. You would need the following for this course: -- the official [Python extension by Microsoft](https://marketplace.visualstudio.com/items?itemName=ms-python.python) provides essential features -such as syntax highlighting, IntelliSense (code completion), linting, debugging, and unit testing support. +- the official [Python extension by Microsoft](https://marketplace.visualstudio.com/items?itemName=ms-python.python) provides essential features +such as syntax highlighting, IntelliSense (code completion), linting, debugging, and unit testing support. - [Pylance extension](https://marketplace.visualstudio.com/items?itemName=ms-python.vscode-pylance), now integrated in Python extension by Microsoft, -enhances performance and offers advanced type checking and code navigation. +enhances performance and offers advanced type checking and code navigation. - [autoDocstring](https://marketplace.visualstudio.com/items?itemName=njpwerner.autodocstring) for Python docstring generation. Developers can also integrate other extensions such as , -[Black](https://marketplace.visualstudio.com/items?itemName=ms-python.black-formatter) or [autopep8](https://marketplace.visualstudio.com/items?itemName=ms-python.autopep8) -for automatic code formatting, and many more. 
-These extensions transform VS Code into a versatile and efficient Python development environment that +[Black](https://marketplace.visualstudio.com/items?itemName=ms-python.black-formatter) or [autopep8](https://marketplace.visualstudio.com/items?itemName=ms-python.autopep8) +for automatic code formatting, and many more. +These extensions transform VS Code into a versatile and efficient Python development environment that suits everything from quick scripts to complex software projects. -We will install and make use of the **Python by Microsoft** and the **autoDocstring** extensions. -You can do that from the Extensions tab (one of the vertical tabs to the left) by searching the +We will install and make use of the **Python by Microsoft** and the **autoDocstring** extensions. +You can do that from the Extensions tab (one of the vertical tabs to the left) by searching the extensions marketplace and installing the ones you need. ![](fig/vscode-extensions.png){alt='VS Code Extensions marketplace for searching and installing extensions' .image-with-shadow width="1000px" } @@ -194,26 +194,29 @@ from the command line already and some IDEs are clever enough to understand it. ### PyCharm -While PyCharm will recognise the virtual environment you already have, and the Python interpreter contained +While PyCharm may recognise the virtual environment you already have, and the Python interpreter contained in it, it is a good practice to tell your IDE which Python interpreter you want to use for which project. This is because you may have multiple Python versions installed on your system and also because you may not have set a virtual environment from command line so you should do if from the IDE instead. 1. Select either `PyCharm -> Settings` (Mac) or `File -> Settings` (Linux, Windows). -2. In the window that appears, - select `Project: python-intermediate-inflammation -> Python Interpreter` from the left. +2. 
In the window that appears, select `Python -> Interpreter` + (or `Project: python-intermediate-inflammation -> Python Interpreter` in some PyCharm versions) + from the left. You'll see a number of Python packages displayed as a list, and importantly above that, the current Python interpreter that is being used. These may be blank or set to ``, or possibly the default version of Python installed on your system, e.g. `Python 2.7 /usr/bin/python2.7`, which we do not want to use in this instance. -3. Select the cog-like button in the top right, then `Add...` - (or `Add Local...` depending on your PyCharm version). +3. Select `Add interpreter` in the top right, then `Add local interpreter ...` + (depending on your PyCharm version, you might find instead a cog-like button, + with the `Add...` or `Add Local...` sub-options). An `Add Python Interpreter` window will appear. -4. Select `Virtualenv Environment` from the list on the left - and ensure that `Existing environment` checkbox is selected within the popup window. - In the `Interpreter` field point to the Python 3 executable inside +4. In some versions of PyCharm, you will have to select `Virtualenv Environment` from the list + on the left. Ensure that the `Select Existing` (or `Existing environment`) checkbox is selected + within the popup window. If you have a `Type` menu, select `Python`. + In the `Interpreter` (or `Python path`) field point to the Python 3 executable inside your virtual environment's `bin` directory (make sure you navigate to it and select it from the file browser rather than just accept the default offered by PyCharm). @@ -221,7 +224,7 @@ set a virtual environment from command line so you should do if from the IDE ins but we are not using that option as we want to reuse the one we created from the command line in the previous episode. ![](fig/pycharm-configuring-interpreter.png){alt='Configuring Python Interpreter in PyCharm' .image-with-shadow width="800px"} -5. 
Select `Make available to all projects` checkbox +5. If present, select the `Make available to all projects` checkbox so we can also use this environment for other projects if we wish. 6. Select `OK` in the `Add Python Interpreter` window. Back in the `Preferences` window, you should select "Python 3.11 (python-intermediate-inflammation)" @@ -249,9 +252,9 @@ To do so, from the top menu select `Terminal > New Terminal` to open a new comma source ./venv/bin/activate ``` -Technically, this should set the Python interpreter to be the one contained in your virtual environment. -Still, it is a good idea to check and set the Python interpreter manually in VS Code to make sure things -are configured correctly for your project. +Technically, this should set the Python interpreter to be the one contained in your virtual environment. +Still, it is a good idea to check and set the Python interpreter manually in VS Code to make sure things +are configured correctly for your project. You can do that as follows: @@ -287,8 +290,9 @@ an alternative way of doing this and how it propagates to the command line. ### PyCharm 1. Select either `PyCharm -> Settings` (Mac) or `File -> Settings` (Linux, Windows). -2. In the preferences window that appears, - select `Project: python-intermediate-inflammation -> Project Interpreter` from the left. +2. In the preferences window that appears, select `Python -> Interpreter` + (or `Project: python-intermediate-inflammation`, depending on the version of PyCharm), + from the left. 3. Select the `+` icon at the top of the window. In the window that appears, search for the name of the library (`pytest`), select it from the list, @@ -307,7 +311,7 @@ Let us do this as an exercise. 
### VS Code -In VS Code, there is no special graphical user interface to add external dependencies for a project - +In VS Code, there is no special graphical user interface to add external dependencies for a project - this is done from the terminal window as we did before (within the active virtual environment): ```bash @@ -456,7 +460,7 @@ you can have separate configurations for running, debugging and testing your cod ### VS Code -You can configure Visual Studio Code through a number of [settings](https://code.visualstudio.com/docs/configure/settings). +You can configure Visual Studio Code through a number of [settings](https://code.visualstudio.com/docs/configure/settings). Nearly every part of VS Code's editor, user interface, and functional behavior has options you can modify. VS Code provides different scopes for settings: @@ -471,7 +475,7 @@ Depending on your platform, the user settings file is located in: - `$HOME/Library/Application\ Support/Code/User/settings.json` on macOS - `$HOME/.config/Code/User/settings.json` on Linux -The workspace settings file `settings.json` is located under the `.vscode` folder in your project's root folder, and overrides the user settings. +The workspace settings file `settings.json` is located under the `.vscode` folder in your project's root folder, and overrides the user settings. You can access and change user/workspace settings values in a few ways: @@ -481,18 +485,18 @@ You can access and change user/workspace settings values in a few ways: ![](fig/vscode-settings.png){alt='Settings editor in VS Code' .image-with-shadow width="1000px" } -We have already configured Python interpreter via the **Command Palette**. -Other Python commands available through the Python extensions can be accessed through the **Command Palette** in a similar way - +We have already configured Python interpreter via the **Command Palette**. 
+Other Python commands available through the Python extensions can be accessed through the **Command Palette** in a similar way - bring up the **Command Palette** and enter "Python: " to find them. For simple applications or debugging scenarios, you can run and debug a program without specific debugging configurations. -However, for some more complex run or debugging scenarios you need to create a [**launch configuration**](https://code.visualstudio.com/docs/debugtest/debugging-configuration) - to specify the application entry point or set environment variables. +However, for some more complex run or debugging scenarios you need to create a [**launch configuration**](https://code.visualstudio.com/docs/debugtest/debugging-configuration) - to specify the application entry point or set environment variables. Creating a launch configuration file is also beneficial because it allows you to configure and save debugging setup details with your project. VS Code stores such configuration information in a `launch.json` file located in the `.vscode` folder in your workspace (project root folder), or in your user settings or workspace settings. VS Code also supports compound launch configurations for starting multiple configurations at the same time - e.g. for more complex testing and debugging scenarios. -We do not have anything to put in the launch configuration for the time being, so you do not need to create and configure `launch.json` file. +We do not have anything to put in the launch configuration for the time being, so you do not need to create and configure `launch.json` file. However, it may be useful to know where such information is configured, should you need to do so in the future. To create an initial `launch.json` file - you can go to the Run and Debug view, then click `Create a launch.json file` and follow the instructions. 
@@ -501,11 +505,11 @@ To create an initial `launch.json` file - you can go to the Run and Debug view, ::::::::::::::::::::::::::::::::: Now you know how to configure and manipulate your environment in both tools -(command line and IDE), which is a useful parallel to be aware of. -As you may have noticed, using the command line terminal facility integrated into an IDE allows you -to run commands (e.g. to manipulate files, interact with version control, etc.) and execute code -without leaving the development environment, making it easier and faster to work by having all -essential tools in one window. +(command line and IDE), which is a useful parallel to be aware of. +As you may have noticed, using the command line terminal facility integrated into an IDE allows you +to run commands (e.g. to manipulate files, interact with version control, etc.) and execute code +without leaving the development environment, making it easier and faster to work by having all +essential tools in one window. Let us have a look at some other features afforded to us by IDEs. @@ -592,28 +596,28 @@ For a selected piece of code, you can access various code reference information #### Code Search You can search for (and replace) a text string within a project, use different scopes to narrow your search process, -use regular expressions for complex searches, include/exclude certain files from your search, +use regular expressions for complex searches, include/exclude certain files from your search, find usages/references and occurrences. ::::::::::::::::::::::::::::::::: group-tab ### PyCharm -To find a search string in the whole project - from the main menu, select `Edit | Find | Find in Path ...` (or `Edit | Find | Find in Files...` depending on your version of PyCharm). +To find a search string in the whole project - from the main menu, select `Edit | Find | Find in Path ...` (or `Edit | Find | Find in Files...` depending on your version of PyCharm). 
-Type your search string in the search field of the popup. +Type your search string in the search field of the popup. Alternatively, in the editor, highlight the string you want to find and press `CMD-SHIFT-F` (Mac) or `CTRL-SHIFT-F` (Windows). PyCharm places the highlighted string into the search field of the popup. - + ![](fig/pycharm-code-search.png){alt='Code Search Functionality in PyCharm' .image-with-shadow width="800px" } -If you need, specify the additional options in the popup. +If you need, specify the additional options in the popup. PyCharm will list the search strings and all the files that contain them. Check the results in the preview area of the dialog where you can replace the search string or select another string, or press `CMD-SHIFT-F` (Mac) or `CTRL-SHIFT-F` (Windows) again to start a new search. To see the list of occurrences in a separate panel, click the `Open in Find Window` button in the bottom right corner. The find panel will appear at the bottom of the main window; use this panel and its options to group the results, preview them, and work with them further. - + ![](fig/pycharm-find-panel.png){alt='Code Search Functionality in PyCharm' .image-with-shadow width="1000px" } ### VS Code @@ -627,7 +631,7 @@ In the search window that pops up - type in your search string in the search fie ![](fig/vscode-search-window.png){alt='Search for a string functionality in VS Code' .image-with-shadow width="800px" } -The results will show in the search window - you can further filter the results by matching case, +The results will show in the search window - you can further filter the results by matching case, matching the whole word or use regular expressions for more advanced filtering. 
![](fig/vscode-search-results.png){alt='Search results window in VS Code with further filtering functionalities' .image-with-shadow width="1000px" } @@ -665,7 +669,7 @@ You can get a full [documentation on PyCharm's built-in version control support] ### VS Code -VS Code has integrated source control management (SCM) and includes Git support out-of-the-box. +VS Code has integrated source control management (SCM) and includes Git support out-of-the-box. Many other source control tools are available through [extensions](https://marketplace.visualstudio.com/search?target=VSCode&category=SCM%20Providers&sortBy=Installs). Our project was already under Git version control and VS Code recognised it. @@ -693,9 +697,9 @@ This functionality in VS Code is available from the Source Control view (from th We have configured our environment and explored some of the most commonly used IDE features and are now ready to run our Python script from the IDE. Running code using the graphical interface of an IDE provides a simple, user-friendly way to execute programs with just a click, reducing the need type commands manually in the command line terminal. -On the other hand, running code from a terminal window in an IDE offers the flexibility and control of the command line — both approaches complement each other by supporting different user preferences and tasks within the same unified environment. +On the other hand, running code from a terminal window in an IDE offers the flexibility and control of the command line — both approaches complement each other by supporting different user preferences and tasks within the same unified environment. -In this lesson, we prioritise using the command line and typing commands whenever possible, as these skills are easily transferable across different IDEs (with a note that you should feel free to use other equivalent ways for doing things that suit you more). 
+In this lesson, we prioritise using the command line and typing commands whenever possible, as these skills are easily transferable across different IDEs (with a note that you should feel free to use other equivalent ways for doing things that suit you more). However, for tasks like debugging - where the graphical interface offers significant advantages — we will make use of the IDE’s built-in visual tools. ::::::::::::::::::::::::::::::::: group-tab @@ -729,9 +733,9 @@ inflammation-analysis.py: error: the following arguments are required: infiles Process finished with exit code 2 ``` -This is the same error we got when running the script from the command line! Essentially what happened was the IDE opened a command line terminal within its interface -and executed the Python command to run the script for us (`python3 inflammation-analysis.py`) - saving us some typing. -You can carry on to run the Python script in whatever way you find more convenient - some developers prefer to type the commands in a terminal manually as that gives +This is the same error we got when running the script from the command line! Essentially what happened was the IDE opened a command line terminal within its interface +and executed the Python command to run the script for us (`python3 inflammation-analysis.py`) - saving us some typing. +You can carry on to run the Python script in whatever way you find more convenient - some developers prefer to type the commands in a terminal manually as that gives the feel of having more control over what is happening and what commands are being executed. We will get back to the above error shortly - for now, the good thing is that we managed to set up our project for development both from the command line and IDE and are getting the same outputs. 
diff --git a/episodes/14-collaboration-using-git.md b/episodes/14-collaboration-using-git.md index 1e6df4dbd..8a6c0c21b 100644 --- a/episodes/14-collaboration-using-git.md +++ b/episodes/14-collaboration-using-git.md @@ -113,9 +113,9 @@ sequenceDiagram Staging Area->>+Local Repository: git commit Local Repository->>+Remote Repository: git push Remote Repository->>+Local Repository: git fetch - Local Repository->>+Working Tree:git checkout + Local Repository->>+Working Tree:git restore Local Repository->>+Working Tree:git merge - Remote Repository->>+Working Tree: git pull (shortcut for git fetch followed by git checkout/merge) + Remote Repository->>+Working Tree: git pull (shortcut for git fetch followed by git merge) --> inflammation/views.py:4:17 + | +3 | from matplotlib import pyplot as plt +4 | import numpy as np + | ^^ + | +help: Remove unused import: `numpy` + +Found 1 error. +[*] 1 fixable with the `--fix` option. ``` Your own outputs of the above commands may vary depending on how you have implemented and fixed the code in previous exercises and the coding style you have used. -The five digit codes, such as `C0303`, are unique identifiers for warnings, -with the first character indicating the type of warning. -There are five different types of warnings that Pylint looks for, -and you can get a summary of them by doing: +The alphanumeric codes, such as `F401`, are unique identifiers for lint rules. +Ruff implements rules derived from other tools and conventions - the leading +letter(s) of the code indicate the tool or convention the rule is derived from. +To learn more about a lint rule, e.g. `F401`, you can run: + +```bash +$ ruff rule F401 +``` +Ruff will tell you that `F401`, like all other `F` rules, is derived from the +[Pyflakes](https://pypi.org/project/pyflakes/) Python linter, and will give you +examples, explanations and some reasoning on why the rule exists. 
+The full list of rules that Ruff supports is available +[as part of the Ruff documentation](https://docs.astral.sh/ruff/rules/). + +Note that by default Ruff does not check for all rules, but it enables +only a subset that is considered a reasonable choice to identify common errors. +You can enable a specific set of rules using the `--select` option. For instance, +try to include the following set of rules, which are derived from some of the most +popular tools, such as [pycodestyle](https://pypi.org/project/pycodestyle/) +(`E` rules) and [isort](https://pypi.org/project/isort/) (`I` rules): ```bash -$ pylint --long-help +$ ruff check --select E,F,I inflammation ``` -Near the end you'll see: +Ruff will identify more problems in the codebase: ```output - Output: - Using the default text output, the message format is : - MESSAGE_TYPE: LINE_NUM:[OBJECT:] MESSAGE - There are 5 kind of message types : - * (C) convention, for programming standard violation - * (R) refactor, for bad code smell - * (W) warning, for python specific problems - * (E) error, for probable bugs in the code - * (F) fatal, if an error occurred which prevented pylint from doing - further processing. +I001 [*] Import block is un-sorted or un-formatted + --> inflammation/views.py:3:1 + | +1 | """Module containing code for plotting inflammation data.""" +2 | +3 | / from matplotlib import pyplot as plt +4 | | import numpy as np + | |__________________^ + | +help: Organize imports + +F401 [*] `numpy` imported but unused + --> inflammation/views.py:4:17 + | +3 | from matplotlib import pyplot as plt +4 | import numpy as np + | ^^ + | +help: Remove unused import: `numpy` + +Found 2 errors. +[*] 2 fixable with the `--fix` option. ``` -So for an example of a Pylint Python-specific `warning`, -see the "W0611: Unused numpy imported as np (unused-import)" warning. 
- -It is important to note that while tools such as Pylint are great at giving you +It is important to note that while tools such as Ruff are great at giving you a starting point to consider how to improve your code, they will not find everything that may be wrong with it. -::::::::::::::::::::::::::::::::::::::::: callout +::::::::::::::::::::::::::::::::::::::: challenge -## How Does Pylint Calculate the Score? +## Exercise: Add Ruff configurations to the `pyproject.toml` file -The Python formula used is -(with the variables representing numbers of each type of infraction -and `statement` indicating the total number of statements): +You can define the Ruff configuration for a project by adding a section to the +`pyproject.toml` file. For instance, you can define the set of rules to be checked +for your codebase. Following [the Ruff documentation](https://docs.astral.sh/ruff/linter/), +add a section to the `pyproject.toml` to enable the `E`, `W`, `F`, `UP`, `A`, `B`, `SIM`, +and `I` rules for the project. Verify that the configuration is respected when running +`ruff` (without the `--select` option): ```bash -10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10) +$ ruff check inflammation +``` + +:::::::::::::::::::::::::::::::: solution + +Add the following section to the `pyproject.toml`: + +```toml +[tool.ruff.lint] +select = ["E", "W", "F", "UP", "A", "B", "SIM", "I"] ``` -For example, with a total of 31 statements of models.py and views.py, -with a count of the errors shown above, we get a score of 8.00. -Note whilst there is a maximum score of 10, given the formula, -there is no minimum score - it is quite possible to get a negative score! +Running `ruff check inflammation` should indeed show problems with some of +the `W` and `I` rules, which are not enabled with the default Ruff settings. 
+:::::::::::::::::::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::::::::::::::::::: @@ -165,13 +212,15 @@ there is no minimum score - it is quite possible to get a negative score! ## Exercise: Further Improve Code Style of Our Project -Select and fix a few of the issues with our code that Pylint detected. -Make sure you do not break the rest of the code in the process and that the code still runs. -After making any changes, run Pylint again to verify you have resolved these issues. +Select and fix a few of the issues with our code that Ruff detected. +You can try using the Ruff's `--fix` command-line option to automatically fix +(some of) the issues. If you manually edit the code, make sure you do not break +the rest of the code in the process and that the code still runs. +After making any changes, run Ruff again to verify you have resolved these issues. :::::::::::::::::::::::::::::::::::::::::::::::::: -Make sure you commit and push `requirements.txt` +Make sure you commit and push `requirements.txt`, `pyproject.toml`, and any file with further code style improvements you did on to `style-fixes` branch and then merge all these changes into your development branch. @@ -184,8 +233,8 @@ with GitHub Actions - we will come back to automated linting in the episode on ["Diagnosing Issues and Improving Robustness"](24-diagnosing-issues-improving-robustness.md). ```bash -$ git add requirements.txt -$ git commit -m "Added Pylint library" +$ git add requirements.txt pyproject.toml +$ git commit -m "Added Ruff library" $ git push origin style-fixes $ git switch develop $ git merge style-fixes @@ -197,17 +246,16 @@ $ git push origin develop ## Optional Exercise: Improve Code Style of Your Other Python Projects If you have a Python project you are working on or you worked on in the past, -run it past Pylint to see what issues with your code are detected, if any. - +run it past Ruff to see what issues with your code are detected, if any. 
:::::::::::::::::::::::::::::::::::::::::::::::::: ::: challenge -## Optional Exercise: More on Pylint +## Optional Exercise: More on Ruff Checkout [this optional exercise](17-section1-optional-exercises.md) -to learn more about `pylint`. +to learn more about `ruff`. ::: diff --git a/episodes/17-section1-optional-exercises.md b/episodes/17-section1-optional-exercises.md index 428c19c6c..4ab36ca7e 100644 --- a/episodes/17-section1-optional-exercises.md +++ b/episodes/17-section1-optional-exercises.md @@ -48,7 +48,7 @@ Some suggestions to try: - [Sublime Text](https://www.sublimetext.com/) - [RStudio](https://posit.co/download/rstudio-desktop/) -The IDEs listed above are advanced source code editors capable of functioning as IDEs by manually installing plugins +The IDEs listed above are advanced source code editors capable of functioning as IDEs by manually installing plugins and add-ons for these tools to obtain more advanced features - such as support for a specific programming language or unit testing. What do you prefer, a lot of tooling out of the box or a lightweight editor with optional extensions? @@ -91,27 +91,38 @@ An open-source alternative is [mini-forge](https://github.com/conda-forge/minifo ::::::::::::::::::::::::::::::::::::::: challenge -## Exercise: Customise `pylint` +## Exercise: Customise `ruff` -Tell `pylint` to accept the maximum line length of 100 characters instead of the default 80. +Tell Ruff to accept the maximum line length of 100 characters instead of the default 88. -Hint: find out different ways in which you can configure `pylint` (e.g. via `pylint` command line interface or its configuration file). +Find out different ways in which you can configure `ruff` (e.g. via `ruff` command line interface or the project configuration file (`pyproject.toml`)). + +Hint: Note that the maximum line length will only be enforced if a rule on long-line violations (such as `E501`) is enabled.
::::::::::::::: solution ## Solution -### By passing an argument to `pylint` in the command line +### By passing an argument to Ruff in the command line -Specify the max line length as an argument: `pylint --max-line-length=100` +Specify the max line length as an argument, including all `E` rules: `ruff check --select E --line-length=100` ### Using a configuration file -You can create a file `.pylintrc` in the root of your project folder to overwrite `pylint` settings: +You can add a section to the `pyproject.toml` in the root of your project folder to overwrite Ruff settings: ``` -[FORMAT] -max-line-length=100 +... +[tool.ruff.lint] +select = [ + ... + "E", + ... +] + +[tool.ruff] +line-length = 100 +... ``` ::::::::::::::::::::::::: diff --git a/episodes/22-scaling-up-unit-testing.md b/episodes/22-scaling-up-unit-testing.md index 44e76ba85..26713562d 100644 --- a/episodes/22-scaling-up-unit-testing.md +++ b/episodes/22-scaling-up-unit-testing.md @@ -354,7 +354,7 @@ and allows others to verify against correct behaviour. ## Optional exercises Checkout -[these optional exercises](25-section2-optional-exercises.md) +[these optional exercises](26-section2-optional-exercises.md) to learn more about code coverage. diff --git a/episodes/23-continuous-integration-automated-testing.md b/episodes/23-continuous-integration-automated-testing.md index 19a84d04e..deb585dfb 100644 --- a/episodes/23-continuous-integration-automated-testing.md +++ b/episodes/23-continuous-integration-automated-testing.md @@ -389,7 +389,7 @@ jobs: python3 -m pytest --cov=inflammation.models tests/test_models.py ``` -The `{{ }}` are used +The double-brackets `{` are used as a means to reference configuration values from the matrix. 
This way, every possible permutation of Python versions 3.10 through 3.12 with the latest versions of Ubuntu, Mac OS and Windows operating systems diff --git a/episodes/24-diagnosing-issues-improving-robustness.md b/episodes/24-diagnosing-issues-improving-robustness.md index 659765aab..479d6dc2b 100644 --- a/episodes/24-diagnosing-issues-improving-robustness.md +++ b/episodes/24-diagnosing-issues-improving-robustness.md @@ -189,7 +189,7 @@ Firstly, to make it easier to track what's going on, we can set up our IDE to ru ### Configuring the Test Framework -::: group-tab +::: group-tab ### PyCharm @@ -207,20 +207,20 @@ You can do this by: ### VS Code If you have not done so already, you will first need to configure the Pytest framework in VS Code. -Open the Test Explorer view (click on the test beaker icon on the VS Code Activity Bar on the left hand side). +Open the Test Explorer view (click on the test beaker icon on the VS Code Activity Bar on the left hand side). -You should see a `Configure Python Tests` button if a test framework is not enabled. +You should see a `Configure Python Tests` button if a test framework is not enabled. Clicking on it prompts you to select a test framework and a folder containing your tests (which in this project, is the `tests` subfolder). ![](fig/vscode-test-framework.png){alt='Setting up test framework in VS Code' .image-with-shadow width="1000px"} ![](fig/vscode-configure-pytest.png){alt='Setting up test framework in VS Code' .image-with-shadow width="1000px"} -Tests can be configured anytime by using the `Python: Configure Tests` command from the Command Palette -or by setting `python.testing.pytestEnabled` in the Settings editor or `settings.json` file (described in the VS Code Settings in the [episode on IDEs](/13-ides.md)). 
+Tests can be configured anytime by using the `Python: Configure Tests` command from the Command Palette +or by setting `python.testing.pytestEnabled` in the Settings editor or `settings.json` file (described in the VS Code Settings in the [episode on IDEs](/13-ides.md)). Each testing framework also has further specific configuration settings as described in the [Test configuration settings of the VS Code documentation for Python](https://code.visualstudio.com/docs/python/testing#_test-configuration-settings). -::: +::: ### Running the Tests @@ -360,7 +360,7 @@ Recall that the input `data` array we are using for the function is So the maximum inflammation for each patient should be `[3, 6, 9]`, whereas the debugger shows `[7, 8, 9]`. You can see that the latter corresponds exactly to the last column of `data`, and we can immediately conclude that we took the maximum along the wrong axis of `data`. -Now we have our answer, stop the debugging process. +Now we have our answer, stop the debugging process. ::: group-tab @@ -642,7 +642,7 @@ from inflammation.models import patient_normalise ]) def test_patient_normalise(test, expected, expect_raises): """Test normalisation works for arrays of one and positive integers.""" - + if expect_raises is not None: with pytest.raises(expect_raises): patient_normalise(np.array(test)) @@ -835,20 +835,28 @@ This approach is useful when explicitly checking the precondition is too costly. ## Improving Robustness with Automated Code Style Checks -Let us re-run Pylint over our project after having added some more code to it. +Let us re-run Ruff over our project after having added some more code to it. From the project root do: ```bash -$ pylint inflammation +$ ruff check inflammation ``` -You may see something like the following in Pylint's output: +You may see something like the following in Ruff's output: ```bash -************* Module inflammation.models -... 
-inflammation/models.py:60:4: W0622: Redefining built-in 'max' (redefined-builtin) -... +A001 Variable `max` is shadowing a Python builtin + --> inflammation/models.py:60:5 + | +58 | if np.any(data < 0): +59 | raise ValueError('inflammation values should be non-negative') +60 | max = np.nanmax(data, axis=1) + | ^^^ +61 | with np.errstate(invalid='ignore', divide='ignore'): +62 | normalised = data / max[:, np.newaxis] + | + +Found 1 error. ``` The above output indicates that by using the local variable called `max` @@ -869,78 +877,61 @@ push them to GitHub using our usual feature branch workflow. :::::::::::::::::::::::::::::::::::::::::::::::::: It may be hard to remember to run linter tools every now and then. -Luckily, we can now add this Pylint execution to our continuous integration builds +Luckily, we can now add this Ruff execution to our continuous integration builds as one of the extra tasks. -To add Pylint to our CI workflow, +To add Ruff to our CI workflow, we can add the following step to our `steps` in `.github/workflows/main.yml`: ```bash ... - - name: Check style with Pylint + - name: Check style with Ruff run: | - python3 -m pylint --fail-under=0 --reports=y inflammation + python3 -m ruff check --exit-zero --output-format=github inflammation ... ``` -Note we need to add `--fail-under=0` otherwise -the builds will fail if we do not get a 'perfect' score of 10! -This seems unlikely, so let us be more pessimistic. -We have also added `--reports=y` which will give us a more detailed report of the code analysis. +Note we need to add `--exit-zero` otherwise +the builds will fail if our codebase breaks any of the rules! +We have also added `--output-format=github` to format the output so that it can +more easily be visualized as part of the GitHub action logs.
Then we can just add this to our repo and trigger a build: ```bash $ git add .github/workflows/main.yml -$ git commit -m "Add Pylint run to build" +$ git commit -m "Add Ruff run to build" $ git push origin test-suite ``` Then once complete, under the build(s) reports you should see -an entry with the output from Pylint as before, -but with an extended breakdown of the infractions by category -as well as other metrics for the code, -such as the number and line percentages of code, docstrings, comments, and empty lines. - -So we specified a score of 0 as a minimum which is very low. -If we decide as a team on a suitable minimum score for our codebase, -we can specify this instead. -There are also ways to specify specific style rules that shouldn't be broken -which will cause Pylint to fail, -which could be even more useful if we want to mandate a consistent style. - -We can specify overrides to Pylint's rules in a file called `.pylintrc` -which Pylint can helpfully generate for us. -In our repository root directory: - -```bash -$ pylint --generate-rcfile > .pylintrc -``` - -Looking at this file, you'll see it is already pre-populated. -No behaviour is currently changed from the default by generating this file, -but we can amend it to suit our team's coding style. +an entry with the output from Ruff as before, +but with a more concise format. + +We specified `--exit-zero` so that the builds will not fail even if some of +the linting rules are not followed. +If we decide as a team on a set of rules to be followed for our codebase, +we can remove this option. +With the agreed upon style rules that shouldn't be broken specified in the `pyproject.toml` file, +which will cause Ruff to fail, we can use the action to mandate a consistent style. + +We can also specify overrides to Ruff's rules in the `pyproject.toml` file. +For instance, we can ignore particular rules to suit our team's coding style. 
For example, a typical rule to customise - favoured by many projects - is the one involving line length. -You'll see it is set to 100, so let us set that to a more reasonable 120. -While we are at it, let us also set our `fail-under` in this file: +The default maximum line length as indicated by the +[Python PEP-8 coding conventions](https://peps.python.org/pep-0008/#maximum-line-length) +is 79 characters, so let us set that to a more reasonable 120: ```bash ... -# Specify a score threshold to be exceeded before program exits with error. -fail-under=0 -... -# Maximum number of characters on a single line. -max-line-length=120 +[tool.ruff] +line-length = 120 ... ``` -do not forget to remove the `--fail-under` argument to Pytest -in our GitHub Actions configuration file too, -since we do not need it anymore. - -Now when we run Pylint we will not be penalised for having a reasonable line length. -For some further hints and tips on how to approach using Pylint for a project, -see [this article](https://pythonspeed.com/articles/pylint/). +Now when we run Ruff we will not be penalised for having a reasonable line length. +For some further customization of Ruff for your project, see the +[Ruff Linter documentation](https://docs.astral.sh/ruff/linter/). ## Merging to `develop` Branch diff --git a/episodes/25-type-annotation.md b/episodes/25-type-annotation.md new file mode 100644 index 000000000..92fa7754d --- /dev/null +++ b/episodes/25-type-annotation.md @@ -0,0 +1,273 @@ +--- +title: 2.5 Type annotations +teaching: 15 +exercises: 45 +--- + +::::::::::::::::::::::::::::::::::::::: objectives + +- Understand the advantages of type annotations +- List the most important type checkers +- Apply type annotations to simple functions +- Read parametric types like `list[int]` or `set[str]` +- Understand the use of type annotations in libraries like `pydantic`, `cattrs` or `msgspec`. 
+ +:::::::::::::::::::::::::::::::::::::::::::::::::: + +:::::::::::::::::::::::::::::::::::::::: questions + +- What are type annotations? +- Why are types needed? Our scripts run fine without them! + +:::::::::::::::::::::::::::::::::::::::::::::::::: + +## Why? + +- Check your code for correctness before you run it. +- Force you to handle edge cases. +- Have documentation that is always correct. +- Better autocompletion. +- Automatic serialization. + +## Tools + +There are many type checkers available: + +MyPy +: The OG, developed by none other than Guido van Rossum himself. + +PyRight +: Often more feature rich than MyPy, developed by Microsoft as part of the Python language server. + +Ty +: By Astral. Not yet feature complete, however Astral brought us `ruff` and `uv`, both stellar tools. So this could become the type checker of the future. + +::: info +Tip: try several of these on some of the examples below. Which type checker has the nicest error messages? Did it find all of the bugs? +::: + +::: challenge +### Exercise: type errors + +Can you spot the mistake in the following code? + +```python +def logistic_map(r, x): + return r * x * (1 - x) + +logistic_map("hello", 2.4) +``` + +What does the error message say went wrong? Do you think this is a good message? + +:::: solution +The error says we can't multiply sequences with a non-int of type float and points to the sub-expression `r * x`. However, the mistake is made at the call point by entering a `str` argument in the first place. +:::: + +Change the function signature to: + +```python +def logistic_map(r: float, x: float) -> float: + return r * x * (1 - x) +``` + +Do you see any effect on the erroneous call in your editor? +::: + +### Abstract types + +We don't always care about the precise type of an object.
For instance, we may just want to write a for loop over any iterable, and sometimes we want to express that `Any` object will do: + +```python +from typing import Any +from collections.abc import Iterable + +def print_numbered_list(items: Iterable[Any]): + for i, v in enumerate(items): + print(i, v) +``` + +There are many abstract types available in `collections.abc`. + +### Completion + +Write a function that changes all commas to semi-colons. Start by entering the following: + +```python +def semicolonize(s: str) -> str: + return s +``` + +Type a `.` after the `s`. Can you see the completion? + +## Data classes + +::: info +### Data before classes +In many languages structures or records are considered more primitive than classes, not so in Python. We will learn more about classes and their place in software design in part 3. In this section we'll only consider data classes as a means of grouping data. +::: + +Type annotations go really well together with data classes, a means of combining elements into a larger data structure. Python supports creating classes using type annotation like so: + +```python +from dataclasses import dataclass + +@dataclass +class Address: + street: str + number: int + suffix: str | None = None + +address = Address("Science Park", 402, "Matrix THREE") + +print(f"{address.street} {address.number}") +``` + +Now you don't need to define an `__init__` method. There are nice packages that use this technique to allow automatic serialisation and deserialisation. Check out the [`msgspec` package](https://jcristharif.com/msgspec/index.html). + +::: challenge +### Autocompletion + +Write a function that prints an address. How is your IDE behaving with and without type annotation? + +```python +def print_address(a: Address): + ... +``` + +:::: solution +When you use type-annotation, you'll have better auto-completion. +:::: +::: + +::: challenge +### Serialization + +Install `msgspec` and try writing and reading back an `Address` object to JSON.
Can you think of the advantages of using this approach over Python native `json.dump`? + +:::: solution +- less code +- automatic validation +- user friendly error reporting +- high performance +:::: +::: + +## Optional: Generics and protocols + +How would we type a function that returns the first element in a list? Suppose that we know that the list contains integers. Then: + +```python +def first(lst: list[int]) -> int: + ... +``` + +But we like to be more generic than that: hence generic types. + +```python +def first[T](lst: list[T]) -> T: + return lst[0] + +first(["a", "b", "c"]) +``` + +::: challenge +Try running `first([])`, does the type checker complain? Write a version of `first` that returns `None` on an empty list. What should the type signature be? + +:::: solution +Use `Optional[T]` or `T | None`. + +```python +def first[T](lst: list[T]) -> T | None: + if not lst: + return None + return lst[0] +``` +:::: +::: + + +## Optional: Complete the `binary_search` example + +We still haven't typed our `binary_search` algorithm completely. We'd like to express the fact that we can only search a list for values of the same type that it contains! We can introduce a type-variable as follows: + +```python +def binary_search[T](lst: list[T], value: T) -> int | None: + ... +``` + +This reads as: `binary_search` introduces an unknown type `T`, such that we expect `list[T]` and `T` to be the types of the arguments to this function. + +::: challenge +Change your `binary_search` function with the above type definition. What does `mypy` say? + +:::: solution +We haven't taught the type checker that our type should be able to handle comparison operations. When a type defines comparison like that, we say that the type is **ordered**. +:::: +::: + +There is no built-in type constraint for ordered types, we'll have to define our own. + +```python +from typing import Protocol, Self + +class Ord(Protocol): + def __lt__(self: Self, other: Self) -> bool: + ... 
+``` + +The full type definition of `binary_search`: + +```python +def binary_search[T: Ord](lst: list[T], value: T) -> int | None: + low: int = 0 + high: int = len(lst)-1 + while low <= high: + mid: int = (low+high) // 2 + if lst[mid] > value: + high = mid-1 + elif lst[mid] < value: + low = mid+1 + else: + return mid + return None +``` + +::: challenge +Try to call `binary_search` in ways that still make the type checker fail. Can you think of properties that we can't express in the type system? + +:::: solution +It is surprisingly hard to find a type in Python that doesn't support the `<` operator. Sometimes this operator doesn't quite capture the meaning of orderedness. In the case of a `set`, the `<` operator checks that one is a subset of the other (a partial but not total order). Types that don't have comparison: `dict`, `complex`. + +Even when all the types are satisfied, there's no way that the type system can check that our input list is actually sorted. We'd have to subtype `list` and ensure that on each mutation the list remains sorted; not impossible, but at this point most of us should agree that we're taking this silly example a bit too far. +:::: +::: + +## For the curious: Algebraic data types + +Now that we know about type unions and type products (tuples, named tuples, or data classes), we have all the ingredients to write [algebraic data types](https://en.wikipedia.org/wiki/Algebraic_data_type).
For instance, we can define a linked list: + +```python +type List[T] = tuple[T, List[T]] | None + +def make_list[T](*args: T) -> List[T]: + match args: + case (first, *rest): + return (first, make_list(*rest)) + case _: + return None + +def list_to_str[T](lst: List[T]): + match lst: + case None: + return "None" + case (a, rest): + return str(a) + " : " + list_to_str(rest) + +l: List[int] = make_list(1, 2, 3) +print(l) +print(list_to_str(l)) +``` + +The linked list may seem a bit silly, but we can also define tree structures and use `match/case` to traverse a tree. Data structures can become highly complex, but the type system helps us write correct code here. + diff --git a/episodes/25-section2-optional-exercises.md b/episodes/26-section2-optional-exercises.md similarity index 100% rename from episodes/25-section2-optional-exercises.md rename to episodes/26-section2-optional-exercises.md diff --git a/episodes/41-code-review.md b/episodes/41-code-review.md index 755e403f0..121e48068 100644 --- a/episodes/41-code-review.md +++ b/episodes/41-code-review.md @@ -567,6 +567,27 @@ This tells the author you are happy for them to merge the pull request. ![](fig/github-merge-pull-request.png){alt='Merging a pull request in GitHub' .image-with-shadow width="800px"} 2. Delete the merged branch to reduce the clutter in the repository. +::: callout + +### Merge via command line + +The `git merge` command provides a way to directly merge branches on your local machine. In general, when adding changes to the major branches like `develop` or `main`, this is not a good practice since it bypasses the code review process. It is a common practice for an open-source project to protect its most important branches, meaning pushing directly to one of these branches will fail. Therefore, even if you can merge to `develop` or `main` locally, you will not be able to push the changes to the remote repository.
+ +On the other hand, the `git merge` command can be very useful for keeping your feature branch up to date with the major branches. For example, if you are working on a branch `my-feature` and the `develop` branch has received new commits (e.g. someone else has merged their pull request), you can do the following to include changes from `develop` into your feature branch: + +```bash +# First update your local develop branch +$ git checkout develop +$ git pull origin develop +# Then switch back to your feature branch and merge the latest develop into it +$ git checkout my-feature +$ git merge develop +``` + +In this way, you can keep your feature branch up to date. You may need to resolve conflicts that arise during this process. + +::: + ## Writing Easy-To-Review Code There are a few things you can do to make it diff --git a/episodes/42-software-reuse.md b/episodes/42-software-reuse.md index ca2b67f8f..7919f2c2e 100644 --- a/episodes/42-software-reuse.md +++ b/episodes/42-software-reuse.md @@ -1,5 +1,5 @@ --- -title: 4.2 Preparing Software for Reuse and Release +title: 4.2 Preparing Software for Reuse start: no teaching: 35 exercises: 15 @@ -14,14 +14,16 @@ exercises: 15 - Understand other documentation components and where they are useful - Describe the basic types of open source software licence - Explain the importance of conforming to data policy and regulation -- Prioritise and work on improvements for release as a team :::::::::::::::::::::::::::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::::::::: questions - What can we do to make our programs reusable by others? -- How should we document and license our code? +- How should we document our code? +- How to make our code citable? +- How to centralize the configuration of our Python project? +- How to add a proper licence to our code?
:::::::::::::::::::::::::::::::::::::::::::::::::: @@ -68,6 +70,23 @@ If others are unable to verify that a piece of software follows published algori how can they be certain it is producing correct results? Where 'others', of course, can include a future version of ourselves. +::::::::::::::::::::::::::::::::::::::::: callout + +**Work on a branch** + +In the previous episode, we updated the `develop` branch with a new feature. +In this episode we will continue working on the reusability of our code. +To do this, we will create a branch called `improve-reusability` from the `develop` branch: + +```bash +$ git switch develop +$ git switch -c improve-reusability +``` + +At the end of this episode, we will merge this branch back into `develop`. + +::::::::::::::::::::::::::::::::::::::::: + ## Documenting Code to Improve Reusability Reproducibility is a cornerstone of science, @@ -252,9 +271,181 @@ which may be held within other Markdown files within the repository or elsewhere We will finish these off later. See [Matias Singer's curated list of awesome READMEs](https://github.com/matiassingers/awesome-readme) for inspiration. -### Other Documentation +### Generating and deploying documentation using MKDocs + +[MKDocs](https://www.mkdocs.org/) generates project documentation as a static website from Markdown files. The website can then be hosted on GitHub Pages or other static site hosting services, providing a user-friendly interface for accessing the documentation. + +We can install MKDocs package using `pip`. Here we also install a plugin `mkdocstrings`, which will be used later. +We advise you to do this within a virtual environment you created before: + +```bash +python3 -m pip install mkdocs mkdocstrings[python] +``` + +By default, `mkdocstrings` does not provide support for a specific language. Therefore, we specify `[python]` to install extra dependencies of `mkdocstrings` for Python language support. 
+ +After installation, you can initialize a new MKDocs project in our Python project: + +```bash +python3 -m mkdocs new . +``` + +This will create two files in your project: `mkdocs.yml` and `docs/index.md`. The first file `mkdocs.yml` is the configuration file for your documentation site. It serves as the central configuration hub for your MKDocs documentation. It tells MKDocs how to structure your documentation site, which plugins and themes to use, +how to organize navigation, etc. + +`docs/index.md` is the main page of your documentation. It is usually the landing page of your documentation site. -There are many different types of other documentation you should also consider +Let's first look at the `mkdocs.yml` file. It is almost empty now. We can edit it with the following basic configurations: + +```yaml +site_name: Inflam + +nav: + - Overview: index.md + +plugins: + - search + - mkdocstrings +``` + +Here we give a name to our documentation site, `Inflam`. We set up the navigation menu with one item `Overview` that links to `index.md`. We also enable two plugins, `search` to provide search functionality in the documentation site, and `mkdocstrings` to automatically generate API reference documentation from Python docstrings, which we will see later. + +We can try to render the documentation site locally and see how it looks: + +```bash +python3 -m mkdocs serve +``` + +This will start to build a local static documentation site and serve it at a local web server. +By default, it will be available at `http://127.0.0.1:8000/`, which will also show in the terminal output. +You can open this URL in your web browser to view the documentation site. + +The documentation site now consists of some default content about MKDocs. It is rendered from the `docs/index.md` file. Let's edit this file to add some relevant content about our project. For simplicity, we can borrow the content from our `README.md` file.
+ +::::::::::::::::::::::::::::::::: challenge + +### Exercise: Update Documentation Content + +Modify `docs/index.md` with the same content as your `README.md` file. +Render the documentation site locally again with `mkdocs serve`. +Check how it looks in your web browser. + +::::::::::::::::::::::::::::::::: + +You can also add more pages to your documentation site by creating more Markdown files in the `docs/` directory, and update the `nav` section in `mkdocs.yml` to include these new pages. For example, we can create a new page for API (Application Programming Interface) reference documentation. + +An API reference documents the functions, classes, and methods provided by your software, along with their parameters, return values, and usage examples. This is particularly useful for understanding how to interact with your code programmatically. With `mkdocs` and `mkdocstrings` plugin, we can automatically generate API reference documentation from the docstrings in our Python code. + +Let's first create `docs/api.md` with the following content: + +```markdown +# API Reference + +:::inflammation.models +``` + +Apart from the title, there is only one line `:::inflammation.models` in this file. This is a special syntax provided by the `mkdocstrings` plugin to indicate that we want to generate API documentation for the `inflammation.models` module. The plugin will parse the docstrings in this module and generate the corresponding documentation. + +Now we can call `mkdocs serve` again to render the documentation site locally and check how the API reference page looks. + +Now we can see that all the functions defined in the `inflammation.models` module are automatically documented with their docstrings. + +One can make the rendered API documentation more informative by improving the docstrings in the code. For example, we can improve the docstring of the `load_csv` function by following the `numpy` style docstring format.
Let's update the docstring of `load_csv` as below: + +```python +def load_csv(filename: str) -> np.ndarray: + """Load a NumPy array from a CSV file + + Parameters + ---------- + filename : str + path to the CSV file + + Returns + ------- + np.ndarray + 2D array of inflammation data + """ +``` + +And also configure `mkdocs.yml` to use the `numpy` style docstring format for the `mkdocstrings` plugin: + +```yaml +site_name: Inflam + +nav: + - Overview: index.md + - API Reference: api.md + +plugins: + - search + - mkdocstrings: + handlers: + python: + options: + docstring_style: numpy +``` + +If we render the documentation site locally again with `mkdocs serve`, the input parameters and return values of the `load_csv` function are now nicely formatted in a table. + + +Once you are happy with the documentation site, you can deploy it to GitHub Pages so that others can access it online. First, let's commit the changes we made to the repository: + +```bash +git add inflammation/models.py mkdocs.yml docs/ +git commit -m "Add documentation with MkDocs" +``` + +To deploy the documentation to GitHub Pages, you can use the following command: +```bash +mkdocs gh-deploy +``` + +This command assumes you have push access to the GitHub repository of the current project. It will automatically create a new branch called `gh-pages` in your repository, which will contain the static files of your documentation site, and push this branch to GitHub. + +![](./fig/github-gh-page-settings.png){alt='GitHub Pages settings details' .image-with-shadow width="800px"} + +Now go to "Settings -> Pages" in your GitHub repository, where you should see the link to your documentation site, which should look like: `https://.github.io/python-intermediate-inflammation/`. You can add this link to your GitHub repository landing page description.
+ + +::::::::::::::::::::::::::::::::::::::::: callout + +**Deploy documentation with GitHub Actions** + +It is also possible to automate the deployment of the documentation site using GitHub Actions. + +Below is an example of a GitHub Actions workflow file that deploys the documentation site whenever there is a push to any branch in the repository. Note that a better practice is to only deploy the documentation when there is an update to the `main` branch, or when there is a new release. + +```yaml +name: Deploy docs + +on: [push] + +jobs: + build: + name: Deploy docs + runs-on: ubuntu-latest + steps: + - name: Checkout main + uses: actions/checkout@v4 + with: + fetch-depth: 0 + - name: Set up Python 3.9 + uses: actions/setup-python@v5 + with: + python-version: "3.9" + - name: Install dependencies + run: | + python -m pip install .[docs] + - name: Deploy docs + run: mkdocs gh-deploy --force +``` + +::::::::::::::::::::::::::::::::::::::::: + +### Thinking about the audience for your documentation + +Besides the API documentation we added with MkDocs, there are many different types of documentation you should consider writing and making available that's beyond the scope of this course. The key is to consider which audiences you need to write for, e.g. end users, developers, maintainers, etc., @@ -264,16 +455,91 @@ There is a Software Sustainability Institute that helpfully covers the kinds of documentation to consider and other effective ways to convey the same information. -One that you should always consider is **technical documentation**. -This typically aims to help other developers understand your code -sufficiently well to make their own changes to it, -including external developers, other members in your team and a future version of yourself too.
-This may include documentation that covers the software's architecture, -including its different components and how they fit together, -API (Application Programming Interface) documentation -that describes the interface points designed into your software for other developers to use, -e.g. for a software library, -or technical tutorials/'HOW TOs' to accomplish developer-oriented tasks. + +## Configuring your code with `pyproject.toml` + +[`pyproject.toml`](https://packaging.python.org/en/latest/guides/writing-pyproject-toml/) is a standardized configuration file, written in TOML format, used in Python projects to declare build system requirements, metadata, and tool configuration. It serves as a central place to manage various aspects of a Python project, making it easier to build, package, and distribute the project. + +We can take a look at the current state of the `pyproject.toml` file in our project: + +```toml +[build-system] +requires = ["setuptools"] +build-backend = "setuptools.build_meta" + +[project] +name = "python-intermediate-inflammation" +version = "0.0.0" +requires-python = ">=3.9" + +[tool.setuptools] +packages = ["inflammation"] +``` + +It defines three main sections of a Python project as three tables: + +- The `[build-system]` table allows you to declare which build backend you use and which other dependencies are needed to build your project. + +- The `[project]` table, which specifies your project’s basic metadata, such as the project name, author name(s), dependencies, and more. + +- The `[tool]` table has tool-specific subtables, e.g., `[tool.setuptools]`, the content of which is defined by each tool, allowing you to configure various aspects of the tool's behavior. + +We can improve the `pyproject.toml` file by adding some metadata to our project. 
Let's update the `[project]` table as below: + +```toml +[project] +name = "python-intermediate-inflammation" +version = "0.0.0" +requires-python = ">=3.9" +description = "A Python data management system that manages trial data used in clinical inflammation studies." +readme = "README.md" +``` + +Here we added a short description of our project and specified the README file. In practice, `pyproject.toml` can contain many other metadata fields as well as configuration for various tools. The [pyproject.toml documentation](https://packaging.python.org/en/latest/guides/writing-pyproject-toml/) provides more details. One advantage of using `pyproject.toml` is that it integrates with modern Python packaging tools like [`uv`](https://docs.astral.sh/uv/), which we will see in the next section about releasing our Python project. + +Do not forget to commit the changes we made to the `pyproject.toml` file. + +```bash +git add pyproject.toml +git commit -m "Add project metadata to pyproject.toml" +``` + +## Make your code citable by adding a CITATION File + +It is easy to correctly cite a paper: all the necessary information (metadata) can be found on the title page or the article website. + +Software and datasets have no title page, so the relevant information is often less obvious. To get credit for your work, you should provide citation information for your software. + +A good way to add citation information is by including a [CITATION.cff](https://citation-file-format.github.io/) file (Citation File Format) in the root of your repository. This plain text file, written in YAML format, contains all the necessary citation details in a structured manner. + + +![](./fig/github-citation-file-rendered.png){alt='CITATION.cff rendered on GitHub' .image-with-shadow width="600px"} + +Platforms like GitHub, Zenodo, and Zotero reuse the citation metadata you provide.
GitHub, for example, automatically renders the file on the repository landing page and provides a BibTeX snippet which users can simply copy! + +### Minimal example for a CITATION.cff file + +```yaml +authors: + - family-names: Doe + given-names: John +cff-version: 1.2.0 +message: "If you use this software, please cite it using the metadata from this file." +title: "Inflam" +``` +We can also include other important information about the software, such as version, release date, DOI, license, and keywords. + +#### How to create a CITATION.cff file? + +You can use the [cffinit](https://citation-file-format.github.io/cff-initializer-javascript/) tool to create a citation file. + +:::challenge +### Exercise: Create a CITATION.cff using cffinit +1. Follow [these steps to create a CITATION file with cffinit](https://book.the-turing-way.org/communication/citable/citable-cffinit). +1. Rename the created file to `CITATION.cff` and add it to the root folder of your repository. +1. Push your changes to your feature branch and check your repository in GitHub. What has happened? +::: + ## Choosing an Open Source Licence @@ -329,181 +595,20 @@ If you want more information, or help choosing a licence, the [Choose An Open-Source Licence](https://choosealicense.com/) or [tl;dr Legal](https://tldrlegal.com/) sites can help. -::::::::::::::::::::::::::::::::::::::: challenge - -## Exercise: Preparing for Release - -In a (hopefully) highly unlikely and thoroughly unrecommended scenario, -your project leader has informed you of the need to release your software -within the next half hour, -so it can be assessed for use by another team. -You'll need to consider finishing the README, -choosing a licence, -and fixing any remaining problems you are aware of in your codebase. -Ensure you prioritise and work on the most pressing issues first!
- -:::::::::::::::::::::::::::::::::::::::::::::::::: - -## Merging into `main` - -Once you have done these updates, -commit your changes, -and if you are doing this work on a feature branch also ensure you merge it into `develop`, -e.g.: -```bash -$ git switch develop -$ git merge my-feature-branch -``` +:::challenge +### Exercise: Add a Licence to Your Code +Select a licence for your code using the tool above. +Replace the contents of the `LICENSE.md` file in your repository with the text of the licence you have chosen. +Push your changes to your feature branch and check your repository in GitHub. What has happened? +::: -Finally, once we have fully tested our software -and are confident it works as expected on `develop`, -we can merge our `develop` branch into `main`: - -```bash -$ git switch main -$ git merge develop -$ git push origin main -``` - -The software on your `main` branch is now ready for release. - -## Tagging a Release in GitHub - -There are many ways in which Git and GitHub can help us make a software release from our code. -One of these is via **tagging**, -where we attach a human-readable label to a specific commit. -Let us see what tags we currently have in our repository: - -```bash -$ git tag -``` - -Since we have not tagged any commits yet, there is unsurprisingly no output. -We can create a new tag on the last commit in our `main` branch by doing: - -```bash -$ git tag -a v1.0.0 -m "Version 1.0.0" -``` - -So we can now do: - -```bash -$ git tag -``` - -```output -v.1.0.0 -``` - -And also, for more information: - -```bash -$ git show v1.0.0 -``` - -You should see something like this: - -```output -tag v1.0.0 -Tagger: -Date: Fri Dec 10 10:22:36 2021 +0000 - -Version 1.0.0 - -commit 2df4bfcbfc1429c12f92cecba751fb2d7c1a4e28 (HEAD -> main, tag: v1.0.0, origin/main, origin/develop, origin/HEAD, develop) -Author: -Date: Fri Dec 10 10:21:24 2021 +0000 - - Finalising README. 
- -diff --git a/README.md b/README.md -index 4818abb..5b8e7fd 100644 ---- a/README.md -+++ b/README.md -@@ -22,4 +22,33 @@ Flimflam requires the following Python packages: - The following optional packages are required to run Flimflam's unit tests: - - - [pytest](https://docs.pytest.org/en/stable/) - Flimflam's unit tests are written using pytest --- [pytest-cov](https://pypi.org/project/pytest-cov/) - Adds test coverage stats to unit testing -\ No newline at end of file -+- [pytest-cov](https://pypi.org/project/pytest-cov/) - Adds test coverage stats to unit testing -+ -+## Installation -+- Clone the repo ``git clone repo`` -+- Check everything runs by running ``python -m pytest`` in the root directory -+- Hurray -+ -+## Contributing -+- Create an issue [here](https://github.com/Onoddil/python-intermediate-inflammation/issues) -+ - What works, what does not? You tell me -+- Randomly edit some code and see if it improves things, then submit a [pull request](https://github.com/Onoddil/python-intermediate-inflammation/pulls) -+- Just yell at me while I edit the code, pair programmer style! -+ -+## Getting Help -+- Nice try -+ -+## Credits -+- Directed by Michael Bay -+ -+## Citation -+Please cite [J. F. W. Herschel, 1829, MmRAS, 3, 177](https://ui.adsabs.harvard.edu/abs/1829MmRAS...3..177H/abstract) if you used this work in your day-to-day life. -+Please cite [C. Herschel, 1787, RSPT, 77, 1](https://ui.adsabs.harvard.edu/abs/1787RSPT...77....1H/abstract) if you actually use this for scientific work. -+ -+## License -+This source code is protected under international copyright law. All rights -+reserved and protected by the copyright holders. -+This file is confidential and only available to authorized individuals with the -+permission of the copyright holders. If you encounter this file and do not have -+permission, please contact the copyright holders and delete this file. 
-\ No newline at end of file -``` - -So now we have added a tag, we need this reflected in our Github repository. -You can push this tag to your remote by doing: - -```bash -$ git push origin v1.0.0 -``` - -::::::::::::::::::::::::::::::::::::::::: callout - -## What is a Version Number Anyway? - -Software version numbers are everywhere, -and there are many different ways to do it. -A popular one to consider is [**Semantic Versioning**](https://semver.org/), -where a given version number uses the format MAJOR.MINOR.PATCH. -You increment the: - -- MAJOR version when you make incompatible API changes -- MINOR version when you add functionality in a backwards compatible manner -- PATCH version when you make backwards compatible bug fixes - -You can also add a hyphen followed by characters to denote a pre-release version, -e.g. 1.0.0-alpha1 (first alpha release) or 1.2.3-beta4 (fourth beta release) - - -:::::::::::::::::::::::::::::::::::::::::::::::::: - -We can now use the more memorable tag to refer to this specific commit. -Plus, once we have pushed this back up to GitHub, -it appears as a specific release within our code repository -which can be downloaded in compressed `.zip` or `.tar.gz` formats. -Note that these downloads just contain the state of the repository at that commit, -and not its entire history. - -Using features like tagging allows us to highlight commits that are particularly important, -which is very useful for *reproducibility* purposes. -We can (and should) refer to specific commits for software in -academic papers that make use of results from software, -but tagging with a specific version number makes that just a little bit easier for humans. ## Conforming to Data Policy and Regulation We may also wish to make data available to either be used with the software or as generated results. -This may be via GitHub or some other means. +This may be via some means other than GitHub, such as Zenodo, Figshare, or an institutional repository. 
An important aspect to remember with sharing data on such systems is that they may reside in other countries, and we must be careful depending on the nature of the data. @@ -517,11 +622,19 @@ and even international policies and laws. Within Europe, for example, there is the need to conform to things like [GDPR][gdpr]. it is a very good idea to make yourself aware of these aspects. +## Merge your changes to the `main` branch + +After completing all the changes to improve the reusability of your code, you can first merge your feature branch to the `develop` branch. Then merge the `develop` branch to the `main` branch. +In the next section, we will look at how to release your Python project from the `main` branch. :::::::::::::::::::::::::::::::::::::::: keypoints -- The reuse battle is won before it is fought. Select and use good practices consistently throughout development and not just at the end. +- Add a README file for general documentation about your software +- Use MkDocs to generate and deploy a documentation site +- Use `pyproject.toml` to centralize configuration of your Python project +- Add a `CITATION.cff` file to make your code citable +- Add a `LICENSE` file to specify the open source licence of your code :::::::::::::::::::::::::::::::::::::::::::::::::: diff --git a/episodes/43-software-release.md b/episodes/43-software-release.md index 64a8c4901..199a04c1e 100644 --- a/episodes/43-software-release.md +++ b/episodes/43-software-release.md @@ -1,5 +1,5 @@ --- -title: 4.3 Packaging Code for Release and Distribution +title: 4.3 Software Packaging and Release teaching: 0 exercises: 20 --- @@ -7,7 +7,7 @@ exercises: 20 ::::::::::::::::::::::::::::::::::::::: objectives - Describe the steps necessary for sharing Python code as installable packages. -- Use Poetry to prepare an installable package. +- Use `uv` to prepare an installable package. - Explain the differences between runtime and development dependencies.
:::::::::::::::::::::::::::::::::::::::::::::::::: @@ -19,209 +19,163 @@ exercises: 20 :::::::::::::::::::::::::::::::::::::::::::::::::: -## Why Package our Software? +We have now got our software ready to release. All the changes were made in the `main` branch. +The last step is to give it a version number and find a proper way to distribute it. +We will look at how to make a release on GitHub, +then we will look at how to package our code so that others can easily install and use it. -We have now got our software ready to release - -the last step is to package it up so that it can be distributed. +## Update the Version Number in `pyproject.toml` -For very small pieces of software, -for example a single source file, -it may be appropriate to distribute to non-technical end-users as source code, -but in most cases we want to bundle our application or library into a package. -A package is typically a single file which contains within it our software -and some metadata which allows it to be installed and used more simply - -e.g. a list of dependencies. -By distributing our code as a package, -we reduce the complexity of fetching, installing and integrating it for the end-users. - -In this session we will introduce -one widely used method for building an installable package from our code. -There are range of methods in common use, -so it is likely you will also encounter projects which take different approaches. - -There is some confusing terminology in this episode around the use of the term "package". -This term is used to refer to both: - -- A directory containing Python files / modules and an `__init__.py` - a "module package" -- A way of structuring / bundling a project for easier distribution and installation - - a "distributable package" +We start on the `main` branch. 
-## Packaging our Software with Poetry - -### Installing Poetry +```bash +$ git switch main +``` -Because we have recommended GitBash if you are using Windows, -we are going to install Poetry using a different method to the officially recommended one. -If you are on MacOS or Linux, -are comfortable with installing software at the command line -and want to use Poetry to manage multiple projects, -you may instead prefer to follow the official -[Poetry installation instructions](https://python-poetry.org/docs/#installation). +First we need to update the version number in our code. +This is typically done in the `pyproject.toml` for Python projects. +We currently have the version set to `0.0.0`, so let us update it to `0.1.0`: -We can install Poetry much like any other Python distributable package, using `pip`: -```bash -$ source venv/bin/activate -$ python3 -m pip install poetry +```toml +[project] +... +version = "0.1.0" +... ``` -To test, we can ask where Poetry is installed: +Then we need to commit this change to the `main` branch: ```bash -$ which poetry +$ git add pyproject.toml +$ git commit -m "Update version number to 0.1.0" ``` -```output -/home/alex/python-intermediate-inflammation/venv/bin/poetry -``` +Note that this is usually done on a feature branch and then merged into `main` with a pull request. -If you do not get similar output, -make sure you have got the correct virtual environment activated. -Poetry can also handle virtual environments for us, -so in order to behave similarly to how we used them previously, -let us change the Poetry config to put them in the same directory as our project: +::::::::::::::::::::::::::::::::::::::::: callout -```bash -$ poetry config virtualenvs.in-project true -``` +## What is a Version Number Anyway? -### Setting up our Poetry Config +Software version numbers are everywhere, +and there are many different ways to do it. 
+A popular one to consider is [**Semantic Versioning**](https://semver.org/), +where a given version number uses the format MAJOR.MINOR.PATCH. +You increment the: -Poetry uses a **pyproject.toml** file to describe -the build system and requirements of the distributable package. -This file format was introduced to solve problems with bootstrapping packages -(the processing we do to prepare to process something) -using the older convention with **setup.py** files and to support a wider range of build tools. -It is described in -[PEP 518 (Specifying Minimum Build System Requirements for Python Projects)](https://www.python.org/dev/peps/pep-0518/). +- MAJOR version when you make incompatible API changes +- MINOR version when you add functionality in a backwards compatible manner +- PATCH version when you make backwards compatible bug fixes -Make sure you are in the root directory of your software project -and have activated your virtual environment, -then we are ready to begin. +You can also add a hyphen followed by characters to denote a pre-release version, +e.g. 1.0.0-alpha1 (first alpha release) or 1.2.3-beta4 (fourth beta release). -To create a `pyproject.toml` file for our code, we can use `poetry init`. -This will guide us through the most important settings - -for each prompt, we either enter our data or accept the default. -*Displayed below are the questions you should see -with the recommended responses to each question so try to follow these, -although use your own contact details!* +:::::::::::::::::::::::::::::::::::::::::::::::::: -**NB: When you get to the questions about defining our dependencies, -answer no, so we can do this separately later.** +## Tagging a Release in GitHub -::::::::::::::::::::::::::::::::::::::::: callout +There are many ways in which Git and GitHub can help us make a software release from our code. +For example, we can use the GitHub website to create a new release.
+In this episode, we will look at how to do this using Git **tagging** at the command line. -## It's Not Interactive? +Let us see what tags we currently have in our repository: -If you're using Git Bash for Windows, depending on your configuration, you may find after typing this command that you don't have an interactive set of questions displayed. -Instead, you may find the `pyproject.toml` file is simply generated with a set of default values. +```bash +$ git tag +``` -If this happens, you can edit the `pyproject.toml` file and change the values in this file, similarly to how we have in the output below. +Since we have not tagged any commits yet, there is unsurprisingly no output. +We can create a new tag on the last commit in our `main` branch by doing: + +```bash +$ git tag -a 0.1.0 -m "Version 0.1.0" +``` -::::::::::::::::::::::::::::::::::::::::: +Now we can check the tags again: ```bash -$ poetry init +$ git tag ``` +A tag should now be listed: + ```output -This command will guide you through creating your pyproject.toml config. +0.1.0 +``` -Package name [example]: inflammation -Version [0.1.0]: 1.0.0 -Description []: Analyse patient inflammation data -Author [None, n to skip]: James Graham -License []: MIT -Compatible Python versions [>=3.13]: >=3.10 +And also, for more information: -Would you like to define your main dependencies interactively? (yes/no) [yes] no -Would you like to define your development dependencies interactively? (yes/no) [yes] no -Generated file +```bash +$ git show 0.1.0 +``` -[project] -name = "inflammation" -version = "1.0.0" -description = "Analyse patient inflammation data" -authors = [ - {name = "James Graham",email = "J.Graham@software.ac.uk"} -] -license = {text = "MIT"} -readme = "README.md" -requires-python = ">=3.10" -dependencies = [ -] +Now that we have added a tag, we need this reflected in our GitHub repository.
+You can push this tag to your remote by doing: +```bash +$ git push origin 0.1.0 +``` -[build-system] -requires = ["poetry-core>=2.0.0,<3.0.0"] -build-backend = "poetry.core.masonry.api" +We can now use the more memorable tag to refer to this specific commit. +Plus, once we have pushed this back up to GitHub, +it appears as a specific release within our code repository +which can be downloaded in compressed `.zip` or `.tar.gz` formats. +Note that these downloads just contain the state of the repository at that commit, +and not its entire history. +Using tagging allows us to highlight commits that are particularly important, +which is very useful for *reproducibility* purposes. +We can (and should) refer to specific commits for software in +academic papers that make use of results from software, +but tagging with a specific version number makes that just a little bit easier for humans. -Do you confirm generation? (yes/no) [yes] yes -``` -Note that we have called our package "inflammation" in the setup above, -instead of "inflammation-analysis". -This is because Poetry will automatically find our code -if the name of the distributable package matches the name of our module package. -If we wanted our distributable package to have a different name, -for example "inflammation-analysis", -we could do this by explicitly listing the module packages to bundle - -see [the Poetry docs on packages](https://python-poetry.org/docs/pyproject/#packages) -for how to do this. +## Packaging our Software with uv -### Project Dependencies +For very small pieces of software, +for example a single source file, +it may be appropriate to distribute to non-technical end-users as source code, +but in most cases we want to bundle our application or library into a package. +A package is typically a single file which contains within it our software +and some metadata which allows it to be installed and used more simply - +e.g. a list of dependencies. 
+By distributing our code as a package, +we reduce the complexity of fetching, installing and integrating it for the end-users. -Previously, we looked at using a `requirements.txt` file to define the dependencies of our software. -Here, Poetry takes inspiration from package managers in other languages, -particularly NPM (Node Package Manager), -often used for JavaScript development. +:::::::::::::::::::::::::::::::::::::::::: callout -Tools like Poetry and NPM understand that there are two different types of dependency: -runtime dependencies and development dependencies. -Runtime dependencies are those dependencies that -need to be installed for our code to run, like NumPy. -Development dependencies are dependencies which -are an essential part of your development process for a project, -but are not required to run it. -Common examples of developments dependencies are linters and test frameworks, -like `pylint` or `pytest`. - -When we add a dependency using Poetry, -Poetry will add it to the list of dependencies in the `pyproject.toml` file, -add a reference to it in a new `poetry.lock` file, -and automatically install the package into our virtual environment. -If we do not yet have a virtual environment activated, -Poetry will create it for us - using the name `.venv`, -so it appears hidden unless we do `ls -a`. -Because we have already activated a virtual environment, Poetry will use ours instead. -The `pyproject.toml` file has two separate lists, -allowing us to distinguish between runtime and development dependencies. +## Further reading: Python Packaging User Guide + +You can refer to the [Python Packaging User Guide](https://packaging.python.org/) +for documentation on best practices and tools for packaging Python projects. 
+ +At the end of this episode, there is an optional exercise +where you can try more good practices for packaging your Python project + +:::::::::::::::::::::::::::::::::::::::::: + +For packaging our code, we will introduce `uv`, +an extremely fast Python package and project manager, written in Rust. + + +### Installing uv + +On MacOS or Linux, you can install `uv` using the following command: ```bash -$ poetry add matplotlib numpy -$ poetry add --group dev pylint -$ poetry install +curl -LsSf https://astral.sh/uv/install.sh | sh ``` -These two sets of dependencies will be used in different circumstances. -When we build our package and upload it to a package repository, -Poetry will only include references to our runtime dependencies. -This is because someone installing our software through a tool like `pip` is only using it, -but probably does not intend to contribute to the development of our software -and does not require development dependencies. +On Windows, you can install `uv` by: -In contrast, if someone downloads our code from GitHub, -together with our `pyproject.toml`, -and installs the project that way, -they will get both our runtime and development dependencies. -If someone is downloading our source code, -that suggests that they intend to contribute to the development, -so they will need all of our development tools. +```powershell +powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" +``` -Have a look at the `pyproject.toml` file again to see what's changed. +You can refer to the [uv installation documentation](https://docs.astral.sh/uv/getting-started/installation/) +for more details on installation. ### Packaging Our Code @@ -229,76 +183,139 @@ The final preparation we need to do is to make sure that our code is organised in the recommended structure. This is the Python module structure - a directory containing an `__init__.py` and our Python source code files. 
-Make sure that the name of this Python package -(`inflammation` - unless you have renamed it) -matches the name of your distributable package in `pyproject.toml` -unless you have chosen to explicitly list the module packages. +Make sure that the name of this directory is `inflammation`, which +matches the following section in `pyproject.toml`: -By convention distributable package names use hyphens, -whereas module package names use underscores. -While we could choose to use underscores in a distributable package name, -we cannot use hyphens in a module package name, -as Python will interpret them as a minus sign in our code when we try to import them. +```toml +[tool.setuptools] +packages = ["inflammation"] +``` -Once we have got our `pyproject.toml` configuration done and our project is in the right structure, +Then "inflammation" will be the name of the package when it is installed. +Of course you can choose different names for your package, +but you must ensure that the `pyproject.toml` file is updated accordingly. + +Once we have made sure our project is in the right structure, we can go ahead and build a distributable version of our software: ```bash -$ poetry build +$ uv build +``` + +This should produce two files in the `dist/` directory: + +```output +dist/python_intermediate_inflammation-0.1.0.tar.gz +dist/python_intermediate_inflammation-0.1.0-py3-none-any.whl ``` -This should produce two files for us in the `dist` directory. The one we care most about is the `.whl` or **wheel** file. This is the file that `pip` uses to distribute and install Python packages, so this is the file we would need to share with other people who want to install our software. +By convention distributable package names use hyphens, +whereas module package names use underscores. 
+While we could choose to use underscores in a distributable package name, +we cannot use hyphens in a module package name, +as Python will interpret them as a minus sign in our code when we try to import them. + Now if we gave this wheel file to someone else, -they could install it using `pip` - -you do not need to run this command yourself, -you have already installed it using `poetry install` above. +they could install it using `pip` (you do not need to run this command yourself) ```bash -$ python3 -m pip install dist/inflammation*.whl +$ pip install python_intermediate_inflammation-0.1.0-py3-none-any.whl ``` -The star in the line above is a **wildcard**, -that means Bash should use any filenames that match that pattern, -with any number of characters in place for the star. -We could also rely on Bash's autocomplete functionality and type `dist/inflammation`, -then hit the Tab key if we have only got one version built. +And then they could import our package in their own Python environment: + +```python +import inflammation +``` After we have been working on our code for a while and want to publish an update, we just need to update the version number in the `pyproject.toml` file (using [SemVer](https://semver.org/) perhaps), -then use Poetry to build and publish the new version. -If we do not increment the version number, -people might end up using this version, -even though they thought they were using the previous one. -Any re-publishing of the package, no matter how small the changes, -needs to come with a new version number. -The advantage of [SemVer](https://semver.org/) is that the change in the version number -indicates the degree of change in the code and thus the degree of risk of breakage when we update. +then use `uv` to build and publish the new version. + +`uv` can help easily increment the version number following Semantic Versioning conventions. 
+For example, to increment the minor version number, we can do: ```bash -$ poetry build +$ uv version --bump minor ``` -In addition to the commands we have already seen, -Poetry contains a few more that can be useful for our development process. -For the full list see the [Poetry CLI documentation](https://python-poetry.org/docs/cli/). +Then the version number in `pyproject.toml` will be updated from `0.1.0` to `0.2.0`. + +For more information about versioning with `uv`, see the [uv versioning documentation](https://docs.astral.sh/uv/guides/package/#updating-your-version). + +### Project Dependencies + +Tools like `uv` understand that there are two different types of dependency: +runtime dependencies and development dependencies. +Runtime dependencies are those dependencies that +need to be installed for our code to run, like `numpy`. +Development dependencies are dependencies which +are an essential part of your development process for a project, +but are not required to run it, like `mkdocs`, which we used to build our documentation. + +When we add a dependency using `uv`, +`uv` will add it to the list of dependencies in the `pyproject.toml` file, +and automatically install the package into our virtual environment, even if the +virtual environment is not currently activated. + +For example, one can add `numpy` and `matplotlib` as runtime dependencies by doing: + +```bash +$ uv add numpy matplotlib +``` + +This will add `numpy` and `matplotlib` to the list of runtime dependencies in `pyproject.toml`: + +```toml +[project] +name = "python-intermediate-inflammation" +version = "0.2.0" +requires-python = ">=3.9" +dependencies = [ + "matplotlib>=3.9.4", + "numpy>=2.0.2", +] +``` + +This is an alternative way of specifying dependencies to the `requirements.txt` file we created before.
+The advantage of specifying dependencies in `pyproject.toml` is that it centralizes this information in one place, +and we can also make a distinction between runtime and development dependencies. + +To add `mkdocs` as a development dependency, the `--group` option can be used: + +```bash +$ uv add --group dev mkdocs +``` + +This will add a new section to the `pyproject.toml` file for development dependencies: + +```toml +[dependency-groups] +dev = [ + "mkdocs>=1.6.1", +] +``` + +By default, when someone installs our package using `pip`, +only the runtime dependencies will be installed, as development dependencies are not needed to run the code. + +To install the development dependencies, one needs to clone our repository +from GitHub and then request the `dev` dependency group when installing (a dependency group is not a package extra, so `pip install .[dev]` would not pick it up): + +```bash +$ uv sync --group dev +``` -The final step is to publish our package to a package repository. -A package repository could be either public or private - -while you may at times be working on public projects, -it is likely the majority of your work will be published internally -using a private repository such as JFrog Artifactory. -Every repository may be configured slightly differently, -so we will leave that to you to investigate. +This behavior can be customized in the `pyproject.toml` file. Check the +[uv documentation on dependencies](https://docs.astral.sh/uv/concepts/projects/dependencies/) for details. ## What if We Need More Control? -Sometimes we need more control over the process of -building our distributable package than Poetry allows. There many ways to distribute Python code in packages, with some degree of flux in terms of which methods are most popular. For a more comprehensive overview of Python packaging you can see the @@ -331,7 +348,7 @@ to improve the information your package. :::::::::::::::::::::::::::::::::::::::: keypoints -- Poetry allows us to produce an installable package and upload it to a package repository.
+- `uv` allows us to produce an installable package and upload it to a package repository. - Making our software installable with Pip makes it easier for others to start using it. - For complete control over building a package, we can use a `setup.py` file. diff --git a/episodes/fig/git-lifecycle.svg b/episodes/fig/git-lifecycle.svg index dc31430bc..ea351d8b2 100644 --- a/episodes/fig/git-lifecycle.svg +++ b/episodes/fig/git-lifecycle.svg @@ -1 +1 @@ -Remote RepositoryLocal RepositoryStaging AreaWorking TreeRemote RepositoryLocal RepositoryStaging AreaWorking Treegit addgit commitgit pushgit fetchgit checkoutgit mergegit pull (shortcut for git fetch followed by git checkout/merge \ No newline at end of file +Remote RepositoryLocal RepositoryStaging AreaWorking TreeRemote RepositoryLocal RepositoryStaging AreaWorking Treegit addgit commitgit pushgit fetchgit restoregit mergegit pull (shortcut for git fetch followed by git merge) \ No newline at end of file diff --git a/episodes/fig/github-citation-file-rendered.png b/episodes/fig/github-citation-file-rendered.png new file mode 100644 index 000000000..e65ab87c8 Binary files /dev/null and b/episodes/fig/github-citation-file-rendered.png differ diff --git a/episodes/fig/github-gh-page-settings.png b/episodes/fig/github-gh-page-settings.png new file mode 100644 index 000000000..3a07fbdc0 Binary files /dev/null and b/episodes/fig/github-gh-page-settings.png differ diff --git a/episodes/fig/pycharm-add-library.png b/episodes/fig/pycharm-add-library.png index c93f2f9e1..20712f754 100644 Binary files a/episodes/fig/pycharm-add-library.png and b/episodes/fig/pycharm-add-library.png differ diff --git a/episodes/fig/pycharm-add-run-configuration.png b/episodes/fig/pycharm-add-run-configuration.png index ffe0f950b..8079e0fc8 100644 Binary files a/episodes/fig/pycharm-add-run-configuration.png and b/episodes/fig/pycharm-add-run-configuration.png differ diff --git a/episodes/fig/pycharm-code-completion.png 
b/episodes/fig/pycharm-code-completion.png index 1f78f2aff..35f0c4556 100644 Binary files a/episodes/fig/pycharm-code-completion.png and b/episodes/fig/pycharm-code-completion.png differ diff --git a/episodes/fig/pycharm-code-reference.png b/episodes/fig/pycharm-code-reference.png index 2b4b9bdb4..2f652e8f8 100644 Binary files a/episodes/fig/pycharm-code-reference.png and b/episodes/fig/pycharm-code-reference.png differ diff --git a/episodes/fig/pycharm-code-search.png b/episodes/fig/pycharm-code-search.png index 025049ee2..28c67ba62 100644 Binary files a/episodes/fig/pycharm-code-search.png and b/episodes/fig/pycharm-code-search.png differ diff --git a/episodes/fig/pycharm-configuring-interpreter.png b/episodes/fig/pycharm-configuring-interpreter.png index e36d96c27..92d4d6ff2 100644 Binary files a/episodes/fig/pycharm-configuring-interpreter.png and b/episodes/fig/pycharm-configuring-interpreter.png differ diff --git a/episodes/fig/pycharm-find-panel.png b/episodes/fig/pycharm-find-panel.png index 26023c06e..2e46ff509 100644 Binary files a/episodes/fig/pycharm-find-panel.png and b/episodes/fig/pycharm-find-panel.png differ diff --git a/episodes/fig/pycharm-installed-packages.png b/episodes/fig/pycharm-installed-packages.png index 2940de150..bd03abf38 100644 Binary files a/episodes/fig/pycharm-installed-packages.png and b/episodes/fig/pycharm-installed-packages.png differ diff --git a/episodes/fig/pycharm-missing-python-interpreter.png b/episodes/fig/pycharm-missing-python-interpreter.png index 17673de22..d7c88abe9 100644 Binary files a/episodes/fig/pycharm-missing-python-interpreter.png and b/episodes/fig/pycharm-missing-python-interpreter.png differ diff --git a/episodes/fig/pycharm-open-project.png b/episodes/fig/pycharm-open-project.png index 49ded5759..75d28fc8a 100644 Binary files a/episodes/fig/pycharm-open-project.png and b/episodes/fig/pycharm-open-project.png differ diff --git a/episodes/fig/pycharm-run-configuration-popup.png 
b/episodes/fig/pycharm-run-configuration-popup.png index a51b7fccc..a6711bb5e 100644 Binary files a/episodes/fig/pycharm-run-configuration-popup.png and b/episodes/fig/pycharm-run-configuration-popup.png differ diff --git a/episodes/fig/pycharm-run-script.png b/episodes/fig/pycharm-run-script.png index 19a612c13..74e84a3f3 100644 Binary files a/episodes/fig/pycharm-run-script.png and b/episodes/fig/pycharm-run-script.png differ diff --git a/episodes/fig/pycharm-syntax-highlighting.png b/episodes/fig/pycharm-syntax-highlighting.png index 618929785..39b25bc06 100644 Binary files a/episodes/fig/pycharm-syntax-highlighting.png and b/episodes/fig/pycharm-syntax-highlighting.png differ diff --git a/episodes/fig/pycharm-test-framework.png b/episodes/fig/pycharm-test-framework.png index 4b43bb887..8f26edaae 100644 Binary files a/episodes/fig/pycharm-test-framework.png and b/episodes/fig/pycharm-test-framework.png differ diff --git a/episodes/fig/pycharm-version-control.png b/episodes/fig/pycharm-version-control.png index e805c1c55..cbdec2193 100644 Binary files a/episodes/fig/pycharm-version-control.png and b/episodes/fig/pycharm-version-control.png differ diff --git a/learners/software-architecture-extra.md b/learners/software-architecture-extra.md index 1a5d9c82c..923103efc 100644 --- a/learners/software-architecture-extra.md +++ b/learners/software-architecture-extra.md @@ -110,7 +110,7 @@ For example, the diagram below depicts the use of MVC architecture for the [DNA Guide Graphical User Interface application](https://www.software.ac.uk/developing-scientific-applications-using-model-view-controller-approach). 
![](fig/mvc-DNA-guide-GUI.png){alt='MVC example of a DNA Guide Graphical User Interface application' .image-with-shadow width="400px" } -{% comment %}Image from endcomment %} +{% comment %}Image from {% endcomment %} ::::::::::::::::::::::::::::::::::::::: challenge diff --git a/pyproject.toml b/pyproject.toml new file mode 100644 index 000000000..082619c80 --- /dev/null +++ b/pyproject.toml @@ -0,0 +1,10 @@ +[project] +name = "python-intermediate-development" +version = "0.1.0" +requires-python = ">=3.13" +dependencies = [ + "mypy>=1.16.1", + "pyrefly>=0.20.2", + "pyright>=1.1.402", + "ty>=0.0.1a11", +]