diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index d22fd902dc3..5c50eff6393 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -654,7 +654,7 @@ peps/pep-0772.rst @warsaw @pradyunsg peps/pep-0773.rst @zooba peps/pep-0774.rst @savannahostrowski peps/pep-0775.rst @encukou -# ... +peps/pep-0776.rst @hoodmane @ambv peps/pep-0777.rst @warsaw # ... peps/pep-0779.rst @Yhg1s @colesbury @mpage diff --git a/peps/pep-0776.rst b/peps/pep-0776.rst new file mode 100644 index 00000000000..0cf88b63507 --- /dev/null +++ b/peps/pep-0776.rst @@ -0,0 +1,807 @@ +PEP: 776 +Title: Emscripten Support +Author: Hood Chatham +Sponsor: Ɓukasz Langa +Discussions-To: https://discuss.python.org/t/84996 +Status: Draft +Type: Standards Track +Topic: Packaging +Created: 18-Mar-2025 +Python-Version: 3.14 +Post-History: `18-Mar-2025 `__ + +Abstract +======== + +`Emscripten `__ is a complete open source compiler +toolchain. It compiles C/C++ code into WebAssembly/JavaScript executables, for +use in JavaScript runtimes, including browsers and Node.js. The Rust language +also maintains an Emscripten target. + +This PEP formalizes the addition of Tier 3 for Emscripten support in Python 3.14 +which `was approved by the Steering Council on October 25, 2024 +`__. The goal is to allow +Pyodide wheels to be uploaded to PyPI. + +Motivation +========== + +A web browser is a universal computing platform, available on Windows, macOS, +Linux, and every smartphone. + +`The Pyodide project `__ has supported Emscripten Python +since 2018. Hundreds of thousands of students have learned Python through +Pyodide via projects like `Capytale +`__ +and `PyodideU `__. Pyodide +is also increasingly being used by Python packages to provide interactive +documentation. This demonstrates both the importance and the maturity of the +Emscripten platform. + +Emscripten and WASI are also the only supported platforms that offer any +meaningful sandboxing. + +Goals +===== + +It is our long term goal to upstream the entire Pyodide runtime into CPython, +but this is out of scope for the present PEP. This PEP only attempts to +establish the foundation for future work. + +Runtime Goals +------------- + +1. To describe the current state of the CPython Emscripten runtime +2. To describe the current state of the Pyodide runtime +3. To identify minor features to be upstreamed from the Pyodide runtime into the + CPython Emscripten runtime + +The minor features identified here are all features that could be implemented +without a PEP. We discuss more significant runtime features that we would like +to implement but we defer decisions on those features to subsequent PEPs. + +Packaging Goals +--------------- + +1. To describe Pyodide's packaging tooling +2. To describe Pyodide's wheel abi to a similar level of detail as the manylinux + PEPs +3. For Pyodide wheels to be allowed for upload to PyPI + + +Emscripten Platform Information +=============================== + +Background on Emscripten +------------------------ + +`Emscripten +`__ +consists of a C and C++ compiler and linker based on LLVM__, together with a +runtime based on a mildly patched musl libc. + +__ https://llvm.org/ + +Emscripten is a POSIX-based platform. It uses the `WebAssembly binary format`_, +and the `WebAssembly dynamic linking section`_. + +.. _WebAssembly binary format: https://webassembly.github.io/spec/core/binary/index.html +.. _WebAssembly dynamic linking section: https://github.com/WebAssembly/tool-conventions/blob/main/DynamicLinking.md + +The ``emcc`` compiler is a wrapper around ``clang``. The ``emcc`` linker is a +wrapper around ``wasm-ld`` (`also part of the LLVM toolchain +`__). + +Emscripten support for portable C/C++ code source compatibility with Linux is +fairly comprehensive, with certain expected exceptions to be spelled out. +CPython already supports compilation to Emscripten, and it only requires a very +modest number of modifications to the normal Linux target. + +POSIX Compliance +---------------- + +Emscripten is a POSIX platform. However, there are POSIX APIs that exist but +always fail when called and POSIX APIs that don't exist at all. In particular, +there are problems with networking APIs and blocking I/O, and there is no +support for ``fork()``. See `Emscripten Portability Guidelines +`__. + +Emscripten executables can be linked with threading support, but it comes +with several limitations: + +* Enabling threading requires websites to be served with special security headers + that indicate acceptance of the possibility of Spectre_-style information + leakage. These headers are a usability hazard for users who are not intimately + familiar with the web platform. + + .. _Spectre: https://en.wikipedia.org/wiki/Spectre_(security_vulnerability) + +* If an executable is linked with both threading and a dynamic loader, Emscripten + prints a warning that using dynamic loading and pthreads together is + experimental. It may cause performance problems or crashes. These problems may + require WebAssembly standards work to resolve. + +Because of these limitations, Pyodide standardizes a no-pthreads build of +Python. If there is sufficient demand, a pthreads build with no dynamic loader +could be added later. + +Emscripten ABI Compatibility +---------------------------- + +The Emscripten compiler has no ABI stability guarantees between versions. Many +Emscripten updates are ABI compatible by chance, and the Rust Emscripten target +behaves as if the ABI were stable with only `occasional negative consequences +`__. + +There are several linker flags that adjust the Emscripten ABI, so Python +packages built to run with Emscripten must make sure to match the ABI-sensitive +linker flags used to compile the interpreter to avoid load-time or run-time +errors. The Emscripten compiler continuously fixes bugs and adds support for new +web platform features. Thus, there is significant benefit to being able to +update the ABI. + +In order to balance the ABI stability needs of package maintainers with the ABI +flexibility to allow the platform to move forward, Pyodide plans to adopt a new +ABI for each feature release of Python. + +The Pyodide team also coordinates the ABI flags that Pyodide uses with the +Emscripten ABI that Rust supports in order to ensure that we have support for +the many popular Rust packages. Historically, most of the work for this has +been related to unwinding ABIs. See for instance `this Rust Major Change +Proposal `__. + +Development Tools +----------------- + +Emscripten development tools are equally well supported on Linux, Windows, and +macOS. The upstream tools include: + +* The Emscripten Software Developer Kit (:program:`emsdk`) which can be used to + install the Emscripten compiler toolchain (:program:`emcc`). +* :program:`emcc` is a C and C++ compiler, linker, and a sysroot with headers + for the system libraries. The system libraries themselves are generated on + the fly based on the ABI requested. +* Node.js can be used as an "emulator" to run Emscripten programs from the + command line. This emulation behaves best on Linux with macOS as a runner up. + Node.js is the most convenient way to test Emscripten programs. +* It is possible to run Emscripten programs inside of any web browser. Browser + automation tools like Selenium, Playwright, or Puppeteer can be used to test + features that are browser-only. + +Pyodide's tools: + +* ``pyodide build`` can be used to cross compile Python packages to run on + Emscripten. Cross compilation works best on Linux, there is experimental + support on macOS, and it is entirely unsupported on Windows. +* ``pyodide venv`` can make a virtual environment that runs in Pyodide. +* ``pytest-pyodide`` can test Python code against various JavaScript runtimes. + +cibuildwheel__ supports building wheels to target Emscripten using ``pyodide build``. + +__ https://cibuildwheel.pypa.io/ + +In the short term, Pyodide's packaging tooling will stay in the Pyodide +repository. It is an open question where Pyodide's packaging tooling should live +in the long term. Two sensible options would be for it to remain under the +``pyodide`` organization or be moved into the ``pypa`` organization on GitHub. + + +Emscripten Application Lifecycle +-------------------------------- + +An Emscripten "binary" consists of a pair of files, an ``.mjs`` file and a +``.wasm`` file. The ``.wasm`` file contains all of the compiled C/C++/Rust code. +The ``.mjs`` file contains the lifecycle code to set up the runtime, locate the +``.wasm`` file, compile it, instantiate it, call the ``main()`` function, and to +shut down the runtime on exit. It also includes an implementation for all of the +system calls, including the file system, the dynamic loader, and any logic to +expose additional functionality from the JavaScript runtime to C code. + +The ``.mjs`` file exports a single ``bootstrapEmscriptenExecutable()`` +JavaScript function that bootstraps the runtime, calls the ``main()`` function, +and returns an API object that can be used to call C functions. Each time it is +called produces a complete and independent copy of the runtime with its own +separate address space. + +The ``bootstrapEmscriptenExecutable()`` takes a large number of runtime +settings. `The full list is described in the Emscripten documentation here. +`__ The most +important of these are as follows: + +* ``thisProgram``: The value of ``argv[0]``. In Python, this makes its way into + ``sys.executable``. +* ``arguments``: The list of string arguments to be passed to ``main()``. +* ``preRun``: A list of callbacks which are invoked after the JavaScript runtime + and file system have been bootstrapped but before calling ``main()``. Useful + to set up the file system, environment variables, and standard streams. +* ``print`` / ``printErr`` : Initial handlers for stdout and stderr. They are + line buffered and performing a ``flush()`` of a partial line forces an extra + new line. If tty-like behavior is desired, the standard stream devices should + be replaced in a ``preRun()`` hook. +* ``onExit``: A handler that is called when the runtime exits. +* ``instantiateWasm``: A callback that is called to instantiate the WebAssembly + module. Overriding the WebAssembly instantiation procedure via this function + is useful when you have other custom asynchronous startup actions or downloads + that can be performed in parallel to WebAssembly compilation. Implementing + this callback allows performing all of these in parallel. + +File System Setup +----------------- + +The Standard Library +~~~~~~~~~~~~~~~~~~~~ + + +In order for Python to run, it needs access to the standard library in the +Emscripten file system. There are several possible approaches to this: + +* The Emscripten linker has a ``--preload-file`` flag that will automatically + handle loading files. `Information about how it works is available here. + `__ + This is the simplest approach, but Pyodide has moved away from it because it + embeds the files into a custom archive format that cannot be processed with + standard tooling. + +* For Node.js, use the NODEFS to mount a native directory with the files into the + Emscripten file system. This is the most efficient option but is Node only. It + is closely analogous to what WASI_ does. + + .. _WASI: https://wasi.dev/ + +* Put the standard library into a zip archive and use ``ZipImporter``. Using an + uncompressed zip file allows the web server and client to apply better + compression to the standard library itself. It also uses the more efficient + native decompression algorithms of the browser rather than less efficient + WebAssembly decompression. The disadvantages of this are a higher memory + footprint and breaking :py:mod:`inspect` & various tests that do not expect the + standard library to be packaged in this way. + +* Put the standard library into an uncompressed tar archive and mount it into a + TARFS read only file system backed by the tar file. This has the best memory + usage, runtime performance, and transfer size of the options that can be used + in the browser. The disadvantage is that Emscripten does not itself include a + TARFS so it requires a downstream implementation. + +Pyodide uses the ``ZipImporter`` approach in every runtime. Python uses the +NODEFS approach when run with node and the ``ZipImporter`` approach for the web +example. We will continue with this approach. + +The ``ZipImporter`` provides a clean resolution for a bootstrapping problem: the +Python runtime is capable of unpacking a wide variety of archive formats, but +the Python runtime is not ready to use until the standard library is already +available. Since ``zipimport.py`` is a frozen module, it avoids these problems. +All of the other approaches solve the bootstrapping problem by setting up the +standard library using JavaScript. + +Third-party packages +~~~~~~~~~~~~~~~~~~~~ + +It is also necessary to make any needed packages available in the Emscripten +file system. Currently Emscripten CPython has no support for packages. Pyodide +uses two different approaches for packages: + +* In the browser, Pyodide downloads and unpacks wheels into the MEMFS + site-packages directory. It then preloads all dynamic libraries in the wheel. + The work of downloading and installing all the packages is redone every time + the runtime starts. + +* The Pyodide ``python`` CLI entrypoint mounts all of the host file system as + NODEFS directories before it bootstraps Python. This allows the normal virtual + environment mechanism to work. Pyodide virtual environments contain a patched + copy of pip and a custom ``pip.conf`` so that pip will install Pyodide wheels. + On startup the Pyodide ``python`` CLI will preload all Emscripten dynamic + libraries that are in the site-packages directory. + + +Console and Interactive Usage +----------------------------- + +``stdin`` defaults to always returning ``EOF``, while ``stdout`` and ``stderr`` +default to calling ``console.log`` and ``console.error`` respectively. It is +possible to pass handlers to ``bootstrapEmscriptenExecutable()`` to configure +the standard streams, but no matter what the I/O devices have undesirable line +buffering behavior that forces a new line when flushed. To implement a well +behaved TTY in-browser, it is necessary to remove the default I/O devices and +replace them in a ``preRun`` hook. + +Making ``stdin`` work correctly in the browser poses an additional challenge +because it is not allowed to block for user input in the main thread of the +browser. If Emscripten is run in a web worker and served with the shared memory +headers, it is possible to receive input using shared memory and atomics. It is +also possible for a ``stdin`` device to block in a simpler and more efficient +manner using stack switching using the experimental JavaScript Promise +Integration API. + +Pyodide replaces the standard I/O devices in order to fix the line buffering +behavior. When Pyodide is run in Node.js, ``stdin``, ``stdout``, and ``stderr`` are +by default connected to ``process.stdin``, ``process.stdout``, and +``process.stderr`` and so the standard streams work as a tty out of the box. +Pyodide also ensures that ``shutil.get_terminal_size`` returns results +consistent with ``process.stdout.rows`` and ``process.stdout.columns``. Pyodide +currently has no support for stack switching ``stdin``. + +Currently, the Emscripten Python Node.js runner uses the default I/O that +Emscripten provides. The web example uses ``Atomics`` for ``stdin`` and has +custom ``stdout`` and ``stderr`` handlers, but they exhibit the undesirable line +buffering behavior. We will upstream the standard streams behaviors from +Pyodide. + +In the long term, we hope to implement stack switching ``stdin`` devices, but +that is out of scope for this PEP. + + +Dynamic Libraries +----------------- + +Main Thread Synchronous Loading Limit +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In the main browser thread, a dynamic library can only be loaded synchronously +if it is at most 4 kilobytes. This excludes most nontrivial dynamic libraries. +This limit is not present in Node.js and can be avoided by using a web worker. If +stack switching is available, then it is possible to make ``dlopen()`` stack +switch in order to instantiate a dynamic library synchronously. + +To avoid the synchronous loading limit, Pyodide currently preloads all dynamic +libraries present in a wheel when installing the wheel (or on startup). This is +a significant disadvantage with packages like SciPy that include a very large +number of shared libraries that are expected to be only loaded on demand. +Pyodide will implement a solution based on stack switching as it becomes more +widely available in runtimes. + +Emscripten Python only loads extension module dynamic libraries when they are +imported. This approach is simpler and more efficient when it works. The web +example runs in a web worker and the cli runner runs in Node so neither of these +have the synchronous loading limit. We will continue with this approach in +Emscripten Python. + +In the long run, we hope to implement a stack switching ``dlopen``, but that is +out of scope for this PEP. + +Missing RPATH Support +~~~~~~~~~~~~~~~~~~~~~ + +Another important limitation of the Emscripten dynamic loader is that it does +not currently have RPATH support. Pyodide's present workaround is as follows: +``auditwheel-emscripten`` places shared library dependencies that are vendored +into a package in a ``${package}.libs`` folder, following auditwheel's +convention. Pyodide patches the dynamic loader to treat this ``${package}.libs`` +folder as if it were on the RPATH of all of the dynamic libraries in the wheel. + +In Emscripten 4.0.5, we have updated the shared object file format, ``wasm-ld`` +and ``emcc`` to accept an ``-rpath`` flag. We are still working on updating the +dynamic loader to respect the rpath, but we expect this will be finished in the +next Emscripten release. Pyodide will then switch to using the RPATH and drop +the patch on the dynamic loader. + +Emscripten Python currently uses the unpatched dynamic loader and so cannot load +extension modules that depend on vendored dynamic libraries via DT_NEEDED. +Extension modules can load dynamic libraries via DT_NEEDED if they are in the +system ``lib`` directory. We will wait to resolve this until we have fixed the +Emscripten dynamic loader upstream. When Emscripten Python is built with a +compatible version of Emscripten, it will automatically pick up support for +wheels with vendored dynamic libraries. + + +Traps and Uncaught Exceptions +----------------------------- + +We consider the C runtime state to be corrupted if there is a WebAssembly trap, +an unhandled JavaScript exception, or an uncaught WebAssembly throw instruction. + +Unlike in other platforms, there is no operating system to shut down the +executable when there is a trap or other unrecoverable corruption of the libc +runtime. We need to provide our own code to print tracebacks, dump the memory, +or do whatever else is helpful for debugging a crash. If we expose a JavaScript +API, we also must ensure that it is disabled after an unrecoverable crash to +prevent downstream users from observing the Python runtime in an inconsistent +state. + +In order to detect fatal errors, Pyodide uses the following approach: all +fallable calls from WebAssembly into JavaScript are wrapped with a JavaScript +try/catch block. Any caught JavaScript exceptions are translated into Python +exceptions. This ensures that any recoverable JavaScript error is caught before +it unwinds through any WebAssembly frames. All entrypoints to WebAssembly are +also wrapped with JavaScript try/catch blocks. Any exceptions caught there have +unwound WebAssembly frames and are thus considered to be fatal errors (though +there is a special case to handle :func:`~sys.exit`). This requires foundational +integration with the Python/JavaScript foreign function interface. + +When the Pyodide runtime catches a fatal exception, it introspects the error to +determine whether it came from a trap, a logic error in a system call, a +``setjmp()`` without a ``longjmp()``, or a libcxxabi call to ``__cxa_throw()`` +(an uncaught C++ exception or Rust panic). We render as informative an error +message as we can. We also call ``_Py_DumpTraceback()`` so we can display a +Python traceback in addition to the JS/WebAssembly traceback. It also disables +the JavaScript API so that further attempts to call into Python result in an +error saying that the runtime has fatally failed. + +Normally, WebAssembly symbols are stripped so the WebAssembly frames are not +very useful. Compiling and linking with ``-g2`` (or a higher debug setting) +ensures that WebAssembly symbols are included and they will appear in the +traceback. + +Because Emscripten Python currently has no JavaScript API and no foreign function +interface, the situation is much simpler. The Python Node.js runner wraps the call +to ``bootstrapEmscriptenExecutable()`` in a try/catch block. If an exception is +caught, it displays the JavaScript exception and calls ``_Py_DumpTraceback()``. +It then exits with code 1. We will stick with this approach until we add either +a JavaScript API or foreign function interface, which is out of scope for this PEP. + +Specification +============= + +Scope of Work +------------- + +Adding Emscripten as a Tier 3 platform only requires adding support for +compiling an Emscripten-compatible build from the unpatched CPython source code. +It does not necessarily require there to be any officially distributed +Emscripten artifacts on python.org, although these could be added in the future. +In the short term, they will continue to be distributed downstream with Pyodide. + +Emscripten will be built using the same configure and Makefile system as other +POSIX platforms, and must therefore be built on a POSIX platform. Both Linux and +macOS will be supported. + +A Python CLI entrypoint will be provided, which among other things can be used +to execute the test suite. + +Linkage +------- + +It is only supported to statically link the Python interpreter. We use `EM_JS +`__ +functions in the interpreter for various purposes. It is possible to dynamically +link object files that include ``EM_JS`` functions, but their behavior deviates +significantly from their behavior in static builds. For this reason, it would +require special work to support. If a use case for dynamically linking the +interpreter in Emscripten emerges, we can evaluate how much effort would be +required to support it. + +Standard Library +---------------- + +Unsupported Modules +~~~~~~~~~~~~~~~~~~~ + +See https://pyodide.org/en/stable/usage/wasm-constraints.html#removed-modules. + +Removed Modules +^^^^^^^^^^^^^^^ + +The following modules are removed from the standard library to reduce download +size and since they currently wouldn't work in the WebAssembly VM. + +- curses +- dbm +- ensurepip +- fcntl +- grp +- idlelib +- msvcrt +- pwd +- resource +- syslog +- termios +- tkinter +- turtle +- turtledemo +- venv +- winreg +- winsound + +Included but not Working Modules +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following modules can be imported, but are not functional: + +- multiprocessing +- threading +- sockets + +as well as any functionality that requires these. + +The following are present but cannot be imported due to a dependency on the +termios package which has been removed: + +- pty +- tty + + +Platform Identification +~~~~~~~~~~~~~~~~~~~~~~~ + +``sys.platform`` will return ``"emscripten"``. Although Emscripten attempts to +be compatible with Linux, the differences are significant enough that a distinct +name is justified. This is consistent with the return value from ``os.uname()``. + +There is also ``sys._emscripten_info`` which includes the Emscripten version and +the runtime (either ``navigator.userAgent`` in a browser or ``"Node js" + +process.version`` in Node.js). + +Signals Support +--------------- + +WebAssembly does not have native support for signals. Furthermore, on a +non-pthreads build, the address space of the WebAssembly module is not shared, +so it is impossible for any thread capable of seeing an interrupt to write to +the eval breaker while the Python interpreter is running code. To work around +this, there are two possible solutions: + +* If Emscripten is run in a web worker and served with the shared memory headers, + it is possible to use shared memory outside of the WebAssembly address space + as a signal buffer. A signal handling UI thread can write the desired signal + into the signal buffer. The interpreter can periodically check the state of + this signal buffer in the eval breaker code. Checking the signal buffer is + slow compared to checking the eval breaker in native platforms, so we do only + do it once every 50 times through the eval breaker. See + `Python/emscripten_signal.c + `__ +* Using stack switching, we can occasionally switch the stack and allow the + JavaScript event loop to go around, then check the state of a signal buffer. + This requires the experimental JavaScript Promise Integration API, and would + be best used with the techniques for optimizing long tasks described `in this + article `__ + +Emscripten Python has already implemented the solution based on shared memory, +and it is in use in Pyodide. + +Eventually, we hope to implement stack-switching-based signals so that it is +possible to use signals in the main thread of node and the browser, as well as +in in web pages that are not served with the shared memory headers. We will need +to keep the shared memory based approach as well, both for backwards +compatibility and because it is more efficient when it is possible. However, +this is out of scope for this PEP. + + +Function Pointer Casts +---------------------- + +`Section 6.3.2.3, paragraph 8 +`__ of the C +standard reads: + + A pointer to a function of one type may be converted to a pointer to a + function of another type and back again; the result shall compare equal to + the original pointer. If a converted pointer is used to call a function + whose type is not compatible with the pointed-to type, the behavior is + undefined. + +However, most platforms have the same behavior: if a function is called with too +many arguments, the extra arguments are ignored; if a function is called with +too few arguments, the extra arguments are filled in with garbage. + +On the other hand, the WebAssembly spec defines calling a function with the +wrong signature to trap (`see step 18 in the execution of call_indirect +`__. + +It is common for Python extension modules to cast a function to a different +signature and call it with the different signature. For instance, many C +extensions define a ``METH_NOARGS`` function to take 0 or 1 argument. The +interpreter calls it with two arguments, the first of which is the Python module +object and the second of which is always ``NULL``. In order to make these +extension modules work without changing their source code, we need special +handling. + +Initially, we resolved this problem by calling out to JavaScript and having +JavaScript call the function pointer. When calling a WebAssembly function from +JavaScript, missing arguments are treated as zero and extra arguments are +ignored (`see step 7 here +`__. +This works, but has the disadvantage of being slow and breaking stack switching +-- it is not possible to stack switch through JavaScript frames. + +Using the wasm-gc `ref.test +`__ +instruction, we can query the type of the function pointer and manually fix up +the argument list. + +wasm-gc is a relatively new feature for WebAssembly runtimes, so we attempt to +use a wasm-gc based function pointer cast trampoline if possible and fall back +to a JS trampoline if not. Every JavaScript runtime that supports stack +switching also supports wasm-gc, so this ensures that stack switching works on +every platform runtime that supports it. The one wrinkle is that iOS 18 ships a +broken implementation of wasm-gc so we have to special case it. + +`See here for the full implementation details. +`__ + +The function pointer cast handling is fully implemented in cpython. Pyodide uses +exactly the same code as upstream. + + +CI Resources +------------ + +Pyodide can be built and tested on any Linux with a reasonably recent version of +Node.js. Anaconda has offered to provide physical hardware to run Emscripten +buildbots, maintained by Russell Keith-Magee. + +CPython does not currently test Tier 3 platforms on GitHub Actions, but if this +ever changes, their Linux runners are able to build and test Emscripten Python. + +Packaging +--------- + +Existing Package Support +~~~~~~~~~~~~~~~~~~~~~~~~ + +Pyodide currently maintains ports of 255 different packages at the time of this +writing, including major scientific Python packages like NumPy, SciPy, pandas, +Polars, scikit-learn, OpenCV, PyArrow, and Pillow as well as general purpose +packages like aiohttp, Requests, Pydantic, cryptography, and orjson. + +About 60 packages are also testing against Pyodide in their CI, including NumPy, +pandas, awkward-cpp, scikit-image, statsmodels, PyArrow, Hypothesis, and PyO3. + +Emscripten Wheel Format +~~~~~~~~~~~~~~~~~~~~~~~ + +Emscripten wheels will use either the format ``emscripten__wasm32`` or +``pyodide__wasm32``. For example: + +* ``emscripten_3_1_58_wasm32`` +* ``pyodide_2025_0_wasm32`` + +The first triple is ambiguous, since even with Emscripten 3.1.58 it is possible +to link dynamic libraries that require a large number of distinct ABIs, +depending on linker and compiler options. It is our intent that the +``pyodide_2025_0`` specifies the particular ABI. Thus, the relationship between +``pyodide_`` and ``emscripten_`` is intended to be the same as the +relationship between ``manylinux`` and ``linux``. + +The specification of the ``pyodide_`` ABI includes: + +* Which version of the Emscripten compiler is used +* What libraries are statically linked with the interpreter +* What stack unwinding ABI is to be used +* Which runtime platform features are required to be present + +and a handful of other similar details that affect the ABI. + +The ABI is selected by choosing the appropriate version of the Emscripten +compiler and passing appropriate compiler and linker flags. It is possible for +other people to build their own Python interpreter that is compatible with the +Pyodide ABI, it is not necessary to use the Pyodide distribution itself. + +The ``pyodide build`` tool knows how to create wheels that match our ABI. As an +alternative, +`the auditwheel-emscripten tool `__ +is capable of performing basic compatibility checks, vendoring shared libraries, +and retagging the wheel from ``emscripten_`` to ``pyodide_``. Unlike +with manylinux, there is no need for a Docker container to build the +``pyodide_`` wheels. All that is needed is a Linux machine and appropriate +versions of Python, Node.js, and Emscripten. + + +PEP 11 +------ + +:pep:`11` will be updated to indicate that Emscripten is supported, specifically +the triples ``wasm32-unknown-emscripten_xx_xx_xx``. + +Russell Keith-Magee will serve as the initial core team contact for these ABIs. + + +Future Work +=========== + +Improving Cross Builds in the Packaging Ecosystem +------------------------------------------------- + +Python now supports four non-self-hosting platforms: iOS, Android, WASI, and +Emscripten. All of them will need to build packages via cross builds. Currently, +``pyodide-build`` allows building a very large number of Python packages for +Emscripten, but it is very complicated. Ideally, the Python packaging ecosystem +would have standards for cross builds. This is a difficult long term project, +particularly because the packaging system is complex and was designed from the +ground up with the assumption that cross compilation would not happen. + + +Pyodide Runtime Features to be Upstreamed +----------------------------------------- + +This is a collection of Pyodide runtime features that are out of scope for this +PEP and for the Python 3.14 development cycle but we would like to upstream in +the future. + +JavaScript API for Bootstrapping +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Currently we offer no stable API for bootstrapping Python. Instead, we use `one +collection of settings for the Node.js CLI entrypoint +`__ +and `a separate collection of settings for the browser demo +`__. + +The Emscripten executable startup API is complicated and there are many possible +configurations that are broken. Pyodide offers a simpler set of options than +Emscripten. This gives downstream users a lot of flexibility while allowing us +to maintain a small number of tested configurations. It also reduces downstream +code duplication. + +Eventually, we would like to upstream Pyodide's bootstrapping API. In the short +term, to keep things simple we will support no JavaScript API. + +JavaScript foreign function interface (FFI) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Because Emscripten supports POSIX, a significant number of tasks can be achieved +using the ``os`` module. However, many fundamental operations in JavaScript +runtimes are not possible via POSIX APIs. Pyodide's approach is to specify a +mapping between the JavaScript object model and the Python object model and a +calling convention that allows high level bidirectional integration. + +Asyncio +~~~~~~~ + +Most JavaScript primitives are asynchronous. The JavaScript thread that Python +runs in already has an event loop. It it not too difficult to implement a Python +event loop that defers all actual work to the JavaScript event loop, +`implemented in Pyodide here `__. + +This is logically dependent on having at least some limited JavaScript FFI +because the only way to schedule tasks on the JavaScript event loop is via a +call out to JavaScript. + +One cause of incompatibility is that it is not possible to control the life +cycle of the event loop from within a JavaScript isolate. This makes +``asyncio.run()`` and similar things not work. + +Using stack switching it is also possible to make a coroutine out of +"synchronous" Python frames. These stack switching coroutines are scheduled on +the same event loop as ordinary Python coroutines and are fully reentrant. This +is fully implemented in Pyodide. + + +Backwards Compatibility +======================= + +Adding a new platform does not introduce any backwards compatibility concerns to +CPython itself. However, there may be some backwards compatibility implications +on Pyodide users. There are a large number of existing users of Pyodide, so it +is important when upstreaming features from Pyodide into Python that we take +care to minimize backwards incompatibility. We will also need a way to disable +partially-upstreamed features so that Pyodide can replace them with more +complete versions downstream. + +These backwards compatibility concerns impact not just the runtime but also the +packaging system. + + +Security Implications +===================== + +Adding a new platform does not add any new security implications. + +Emscripten and WASI are also the only supported platforms that offer sandboxing. +If users wish to execute untrusted Python code or untrusted Python extension +modules, Emscripten provides a secure way for them to do that. + +How to Teach This +================= + +The education needs related to this PEP relate to two groups of developers. + +First, web developers will need to know how to build Python and use it in a +website, along with their own Python code and any supporting packages, and how +to use them all at runtime. The documentation will cover this in a similar form +to the existing Windows embeddable package. In the short term, we will encourage +developers to use Pyodide if at all possible. + +Second, developers of packages with binary components need to know how to build +and release them for Emscripten (see Packaging). + + +Reference Implementation +======================== + +Pyodide. + + +Copyright +========= + +This document is placed in the public domain or under the CC0-1.0-Universal +license, whichever is more permissive.