diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index ef47f5b9357..5d03bf91d3e 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -664,6 +664,8 @@ peps/pep-0782.rst @vstinner peps/pep-0783.rst @hoodmane @ambv peps/pep-0784.rst @gpshead # ... +peps/pep-0787.rst @ncoghlan +# ... peps/pep-0789.rst @njsmith # ... peps/pep-0801.rst @warsaw diff --git a/peps/pep-0787.rst b/peps/pep-0787.rst new file mode 100644 index 00000000000..e5f7301397e --- /dev/null +++ b/peps/pep-0787.rst @@ -0,0 +1,264 @@ +PEP: 787 +Title: Safer subprocess usage using t-strings +Author: Nick Humrich , Alyssa Coghlan +Discussions-To: Pending +Status: Draft +Type: Standards Track +Requires: 750 +Created: 13-Apr-2025 +Python-Version: 3.14 + +Abstract +======== + +:pep:`750` introduced template strings (t-strings) as a generalization of f-strings, +providing a way to safely handle string interpolation in various contexts. This PEP +proposes extending the :mod:`subprocess` and :mod:`shlex` modules to natively support t-strings, enabling +safer and more ergonomic shell command execution with interpolated values, as well as +serving as a reference implementation for the t-string feature to improve API ergonomics. + +Motivation +========== + +Despite the safety benefits and flexibility that template strings offer in PEP 750, +they lack a concrete consumer implementation in the standard library that demonstrates +their practical application. One of the most compelling use cases for t-strings is safer +shell command execution, as noted in the withdrawn :pep:`501`: + +.. code-block:: python + + # Unsafe with f-strings: + os.system(f"echo {message_from_user}") + + # Also unsafe with f-strings + subprocess.run(f"echo {message_from_user}", shell=True) + + # Fails with f-strings + subprocess.run(f"echo {message_from_user}") + + # Safe with t-strings and POSIX-compliant shell quoting: + subprocess.run(t"echo {message_from_user}", shell=True) + + # Safe on all platforms with t-strings: + subprocess.run(t"echo {message_from_user}") + + # Safe on all platforms without t-strings: + subprocess.run(["echo", str(message_from_user)]) + +Currently, developers must choose between convenience (using f-strings with potential +security risks) and safety (using more verbose, list-based APIs). By adding native t-string +support to the :mod:`subprocess` module, we provide a consumer reference implementation that +demonstrates the value of t-strings while addressing a common security concern. + +Rationale +========= + +The subprocess module is an ideal candidate for t-string support for several reasons: + +* Command injection vulnerabilities in shell commands are a well-known security risk. +* The :mod:`subprocess` module already supports both string and list-based command specifications. +* There's a natural mapping between t-strings and proper shell escaping that provides both convenience and safety. +* It serves as a practical showcase for t-strings that developers can immediately understand and appreciate. + +By extending subprocess to handle t-strings natively, we make it easier to write secure code without sacrificing +the convenience that led many developers to use potentially unsafe f-strings. + +Specification +============= + +This PEP proposes two main additions to the standard library: + +#. A new ``sh()`` renderer function in the :mod:`shlex` module for safe shell command construction +#. Adding t-string support to the :mod:`subprocess` module's core functions, + particularly :class:`subprocess.Popen`, :func:`subprocess.run`, and other related functions + that accept a command argument + + +Renderer for shell escaping added to :mod:`shlex` +------------------------------------------------- + +As a reference implementation, a renderer for safe POSIX shell escaping will be added to +the :mod:`shlex` module. This renderer would be called ``sh`` and would be equivalent to +calling ``shlex.quote`` on each field value in the template literal. + +Thus:: + + os.system(shlex.sh(t"cat {myfile}")) + +would have the same behavior as:: + + os.system("cat " + shlex.quote(myfile))) + + +The addition of ``shlex.sh`` will NOT change the existing admonishments in the +:mod:`subprocess` documentation that passing ``shell=True`` is best avoided, nor the +reference from the :func:`os.system` documentation to the higher level ``subprocess`` APIs. + +The t-string processor implementation would look like:: + + from string.templatelib import Template, Interpolation + + def sh(template: Template) -> str: + parts: list[str] = [] + for item in template: + if isinstance(item, Interpolation): + # shlex.sh implementation, so shlex.quote can be used directly + parts.append(quote(str(item.value))) + else: + parts.append(item) + + # shlex.sh implementation, so `join` references shlex.join + return join(parts) + +This allows for explicit escaping of t-strings for shell usage:: + + import shlex + # Safe POSIX-compliant shell command construction + command = shlex.sh(t"cat {filename}") + os.system(command) + +Changes to subprocess module +---------------------------- + +With the additional renderer in the shlex module, and the addition of template strings, +the :mod:`subprocess` module can be changed to handle accepting template strings +as an additional input type to ``Popen``, as it already accepts a sequence, or a string, +with different behavior for each. In return, all :class:`subprocess.Popen` higher level +functions such as :func:`subprocess.run` could accept strings in a safe way +(on all systems for ``shell=False`` and on :ref:`POSIX systems ` for ``shell=True``). + +For example:: + + subprocess.run(t"cat {myfile}", shell=True) + +would automatically use the ``shlex.sh`` renderer provided in this PEP. Therefore, using +``shlex`` inside a ``subprocess.run`` call like so:: + + subprocess.run(shlex.sh(t"cat {myfile}"), shell=True) + +would be redundant, as ``run`` would automatically render any template literals +through ``shlex.sh`` + +When ``subprocess.Popen`` is called without ``shell=True``, t-string support would still +provide subprocess with a more ergonomic syntax. For example:: + + subprocess.run(t"cat {myfile} --flag {value}") + +would be equivalent to:: + + subprocess.run(["cat", myfile, "--flag", value]) + +or, more accurately:: + + subprocess.run(shlex.split(f"cat {shlex.quote(myfile)} --flag {shlex.quote(value)}")) + +It would do this by first using the ``shlex.sh`` renderer, as above, then using +``shlex.split`` on the result. + +The implementation inside ``subprocess.Popen._execute_child`` would check for t-strings:: + + from string.templatelib import Template + + if isinstance(args, Template): + import shlex + if shell: + args = shlex.sh(args) + else: + args = shlex.split(shlex.sh(args)) + +Backwards Compatibility +======================= + +This change is fully backwards compatible as it only adds new functionality without altering existing behavior. +The subprocess module will continue to handle strings and lists in the same way it currently does. + +Security Implications +===================== + +This PEP is explicitly designed to improve security by providing a safer alternative to using +f-strings with shell commands. By automatically applying appropriate escaping based on context +(shell or non-shell), it helps prevent command injection vulnerabilities. + +However, it's worth noting that when ``shell=True`` is used, the safety is limited to +POSIX-compliant shells. On Windows systems where cmd.exe or PowerShell may be used as the shell, +the escaping mechanism provided by :func:`shlex.quote` is not sufficient to prevent all forms +of command injection. + +How to Teach This +================= + +This feature can be taught as a natural extension of t-strings that demonstrates their practical value: + +1. Introduce the problem of command injection and why f-strings are dangerous with shell commands +2. Show the traditional solutions (list-based commands, manual escaping) +3. Introduce the ``shlex.sh`` renderer for explicit shell escaping:: + + # Unsafe: + os.system(f"cat {filename}") # Potential command injection! + + # Safe using shlex.sh: + os.system(shlex.sh(t"cat {filename}")) # Explicitly escaping for shell + +4. Introduce the subprocess module's native t-string support:: + + # Unsafe: + subprocess.run(f"cat {filename}", shell=True) # Potential command injection! + + # Safe but verbose: + subprocess.run(["cat", filename]) + + # Safe and readable with t-strings: + subprocess.run(t"cat {filename}", shell=True) # Automatically escapes filename + subprocess.run(t"cat {filename}") # Automatically converts to list form + +The implementation should be added to both the shlex and subprocess module documentation with clear +examples and security advisories. + +.. _pep-0787-defer-non-posix-shells: + +Deferring escaped rendering support for non-POSIX shells +-------------------------------------------------------- + +:func:`shlex.quote` works by classifying the regex character set ``[\w@%+=:,./-]`` to be +safe, deeming all other characters to be unsafe, and hence requiring quoting of the string +containing them. The quoting mechanism used is then specific to the way that string quoting +works in POSIX shells, so it cannot be trusted when running a shell that doesn't follow +POSIX shell string quoting rules. + +For example, running ``subprocess.run(f"echo {shlex.quote(sys.argv[1])}", shell=True)`` is +safe when using a shell that follows POSIX quoting rules: + +.. code-block:: console + + $ cat > run_quoted.py + import sys, shlex, subprocess + subprocess.run(f"echo {shlex.quote(sys.argv[1])}", shell=True) + $ python3 run_quoted.py pwd + pwd + $ python3 run_quoted.py '; pwd' + ; pwd + $ python3 run_quoted.py "'pwd'" + 'pwd' + +but remains unsafe when running a shell from Python invokes ``cmd.exe`` (or Powershell): + +.. code-block:: powershell + + S:\> echo import sys, shlex, subprocess > run_quoted.py + S:\> echo subprocess.run(f"echo {shlex.quote(sys.argv[1])}", shell=True) >> run_quoted.py + S:\> type run_quoted.py + import sys, shlex, subprocess + subprocess.run(f"echo {shlex.quote(sys.argv[1])}", shell=True) + S:\> python3 run_quoted.py "echo OK" + 'echo OK' + S:\> python3 run_quoted.py "'& echo Oh no!" + ''"'"' + Oh no!' + +Resolving this standard library limitation is beyond the scope of this PEP. + +Copyright +========= + +This document is placed in the public domain or under the +CC0-1.0-Universal license, whichever is more permissive.