From 7c336ac8dacf7775179e315a2512ce9633c1622b Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Tue, 30 Sep 2025 20:39:00 +0200
Subject: [PATCH 01/13] Doc/library/xml.rst: Improve section on XML security

Clarify that:
- it takes parsing for an attack
- that some doors are closed by default
- only version 2.7.2 has all the fixes
- use of the bundle depends on configuration
---
 Doc/library/xml.rst | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/Doc/library/xml.rst b/Doc/library/xml.rst
index 3f745573474405..6054f2567c7bf2 100644
--- a/Doc/library/xml.rst
+++ b/Doc/library/xml.rst
@@ -53,11 +53,21 @@ XML security
 
 An attacker can abuse XML features to carry out denial of service attacks,
 access local files, generate network connections to other machines, or
-circumvent firewalls.
+circumvent firewalls when attacker-controlled XML is being parsed,
+in Python or elsewhere.
 
-Expat versions lower than 2.6.0 may be vulnerable to "billion laughs",
-"quadratic blowup" and "large tokens". Python may be vulnerable if it uses such
-older versions of Expat as a system-provided library.
+The builtin XML parsers of Python rely on library `libexpat`_, commonly
+called Expat, for parsing XML.
+
+By default, Expat itself does not access local files or create network
+connections.
+
+Expat versions lower than 2.7.2 may be vulnerable to "billion laughs",
+"quadratic blowup" and "large tokens" or disproportional use of dynamic memory.
+Python bundles a copy of Expat, and whether the bundled or a system-wide Expat
+is being used by Python, depends on how the Python interpreter
+:doc:`has been configured <../using/configure>` in your environment.
+Python may be vulnerable if it uses such older versions of Expat.
 Check :const:`!pyexpat.EXPAT_VERSION`.
 
 :mod:`xmlrpc` is **vulnerable** to the "decompression bomb" attack.
@@ -90,5 +100,6 @@ large tokens
   be used to cause denial of service in the application parsing XML.
   The issue is known as :cve:`2023-52425`.
 
+.. _libexpat: https://github.com/libexpat/libexpat
 .. _Billion Laughs: https://en.wikipedia.org/wiki/Billion_laughs
 .. _ZIP bomb: https://en.wikipedia.org/wiki/Zip_bomb

From adb0de6df5f9cea793f5c9c11a33f6654aa59355 Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Tue, 30 Sep 2025 20:49:37 +0200
Subject: [PATCH 02/13] Doc/library/pyexpat.rst: Document risk of ExternalEntityRefHandler

---
 Doc/library/pyexpat.rst | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst
index 9aae5c9da7471d..3d9ae9f326ac6d 100644
--- a/Doc/library/pyexpat.rst
+++ b/Doc/library/pyexpat.rst
@@ -614,6 +614,13 @@ otherwise stated.
 
 .. method:: xmlparser.ExternalEntityRefHandler(context, base, systemId, publicId)
 
+   .. warning::
+
+      Registering a handler for external entity references may allow
+      attacker-controller XML to access local files and/or the network,
+      and thus create new security risks.
+      By default, :class:`xmlparser` is safe from these threats.
+
    Called for references to external entities. *base* is the current base, as set
    by a previous call to :meth:`SetBase`. The public and system identifiers,
    *systemId* and *publicId*, are strings if given; if the public identifier is not

From fb47ad6bab1d6a5c83b99b3f647e0947d410e0d0 Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Tue, 30 Sep 2025 20:57:28 +0200
Subject: [PATCH 03/13] Add news item

---
 .../2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst | 4 ++++
 1 file changed, 4 insertions(+)
 create mode 100644 Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst

diff --git a/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst b/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst
new file mode 100644
index 00000000000000..bbf89cc0d17c3d
--- /dev/null
+++ b/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst
@@ -0,0 +1,4 @@
+Improve documentation on :doc:´XML security
+<../library/xml#xml-vulnerabilities>` and method
+:meth:`~xml.parsers.expat.xmlparser.ExternalEntityRefHandler`.
+Patch by Sebastian Pipping.

From d89f9910be3affc86cdaa6e01bbe9921a24b2fe9 Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Tue, 30 Sep 2025 21:04:18 +0200
Subject: [PATCH 04/13] Fix news item syntax

---
 .../2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst b/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst
index bbf89cc0d17c3d..b61e9a4c3672e0 100644
--- a/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst
+++ b/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst
@@ -1,4 +1,5 @@
-Improve documentation on :doc:´XML security
-<../library/xml#xml-vulnerabilities>` and method
+Improve documentation on
+:doc:`XML security <../library/xml#xml-vulnerabilities>`
+and method
 :meth:`~xml.parsers.expat.xmlparser.ExternalEntityRefHandler`.
 Patch by Sebastian Pipping.

From 10b298889924f1a1e786aca5c1be1aa4c58129f4 Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Tue, 30 Sep 2025 21:09:09 +0200
Subject: [PATCH 05/13] Drop troublesome HTML anchor from news item

---
 .../2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst b/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst
index b61e9a4c3672e0..e834eb67a55f51 100644
--- a/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst
+++ b/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst
@@ -1,5 +1,5 @@
 Improve documentation on
-:doc:`XML security <../library/xml#xml-vulnerabilities>`
+:doc:`XML security <../library/xml>`
 and method
 :meth:`~xml.parsers.expat.xmlparser.ExternalEntityRefHandler`.
 Patch by Sebastian Pipping.

From c8d9afb79395e46ebccd25584f59bbcec4725c2e Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Tue, 14 Oct 2025 15:20:51 +0200
Subject: [PATCH 06/13] Try to point to configure argument --with-system-expat directly

General idea by @vstinner
---
 Doc/library/xml.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/library/xml.rst b/Doc/library/xml.rst
index 6054f2567c7bf2..7572157c0287f8 100644
--- a/Doc/library/xml.rst
+++ b/Doc/library/xml.rst
@@ -66,7 +66,7 @@ Expat versions lower than 2.7.2 may be vulnerable to "billion laughs",
 "quadratic blowup" and "large tokens" or disproportional use of dynamic memory.
 Python bundles a copy of Expat, and whether the bundled or a system-wide Expat
 is being used by Python, depends on how the Python interpreter
-:doc:`has been configured <../using/configure>` in your environment.
+:option:`has been configured <--with-system-expat>` in your environment.
 Python may be vulnerable if it uses such older versions of Expat.
 Check :const:`!pyexpat.EXPAT_VERSION`.
 

From 5d935bd3ebc3c91dfccfac52b58eab49e7e718da Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Mon, 20 Oct 2025 15:07:28 +0200
Subject: [PATCH 07/13] Improve language

Ideas by @hedsnz
---
 Doc/library/xml.rst | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Doc/library/xml.rst b/Doc/library/xml.rst
index 7572157c0287f8..3b5295632de4f8 100644
--- a/Doc/library/xml.rst
+++ b/Doc/library/xml.rst
@@ -56,14 +56,15 @@ access local files, generate network connections to other machines, or
 circumvent firewalls when attacker-controlled XML is being parsed,
 in Python or elsewhere.
 
-The builtin XML parsers of Python rely on library `libexpat`_, commonly
+The built-in XML parsers of Python rely on the library `libexpat`_, commonly
 called Expat, for parsing XML.
 
 By default, Expat itself does not access local files or create network
 connections.
 
-Expat versions lower than 2.7.2 may be vulnerable to "billion laughs",
-"quadratic blowup" and "large tokens" or disproportional use of dynamic memory.
+Expat versions lower than 2.7.2 may be vulnerable to the "billion laughs",
+"quadratic blowup" and "large tokens" vulnerabilities, or to disproportional
+use of dynamic memory.
 Python bundles a copy of Expat, and whether the bundled or a system-wide Expat
 is being used by Python, depends on how the Python interpreter
 :option:`has been configured <--with-system-expat>` in your environment.

From 31523ec4ecd42a817ad10d62b6bb87a4d8488177 Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Mon, 20 Oct 2025 15:12:17 +0200
Subject: [PATCH 08/13] Use more active language

Original idea by @hedsnz with adjustments
---
 Doc/library/xml.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Doc/library/xml.rst b/Doc/library/xml.rst
index 3b5295632de4f8..acd8d399fe32fc 100644
--- a/Doc/library/xml.rst
+++ b/Doc/library/xml.rst
@@ -65,8 +65,8 @@ connections.
 Expat versions lower than 2.7.2 may be vulnerable to the "billion laughs",
 "quadratic blowup" and "large tokens" vulnerabilities, or to disproportional
 use of dynamic memory.
-Python bundles a copy of Expat, and whether the bundled or a system-wide Expat
-is being used by Python, depends on how the Python interpreter
+Python bundles a copy of Expat, and whether Python uses the bundled or a
+system-wide Expat, depends on how the Python interpreter
 :option:`has been configured <--with-system-expat>` in your environment.
 Python may be vulnerable if it uses such older versions of Expat.
 Check :const:`!pyexpat.EXPAT_VERSION`.

From fb5234e3146a1fef148b3f8280c2d34b6fe43437 Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Mon, 20 Oct 2025 15:13:48 +0200
Subject: [PATCH 09/13] Drop news entry

As suggested by @vstinner
---
 .../2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst | 5 -----
 1 file changed, 5 deletions(-)
 delete mode 100644 Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst

diff --git a/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst b/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst
deleted file mode 100644
index e834eb67a55f51..00000000000000
--- a/Misc/NEWS.d/next/Documentation/2025-09-30-20-57-26.gh-issue-139313.ibcC9q.rst
+++ /dev/null
@@ -1,5 +0,0 @@
-Improve documentation on
-:doc:`XML security <../library/xml>`
-and method
-:meth:`~xml.parsers.expat.xmlparser.ExternalEntityRefHandler`.
-Patch by Sebastian Pipping.

From 3dc3baff4b7893633187ca0cfa240a47f3475dd8 Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Wed, 29 Oct 2025 16:40:23 +0100
Subject: [PATCH 10/13] Re-write warning about xmlparser.ExternalEntityRefHandler

The previous version was apparently not clear enough.
---
 Doc/library/pyexpat.rst | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst
index 3d9ae9f326ac6d..78f40503873814 100644
--- a/Doc/library/pyexpat.rst
+++ b/Doc/library/pyexpat.rst
@@ -616,9 +616,12 @@ otherwise stated.
 
    .. warning::
 
-      Registering a handler for external entity references may allow
-      attacker-controller XML to access local files and/or the network,
-      and thus create new security risks.
+      Implementing a handler that accesses local files and/or the network
+      may create a vulnerabilitiy to
+      `external entity attacks `_
+      if :class:`xmlparser` is used with user-provided XML content.
+      Please reflect on your `threat model _`
+      before implementing this handler.
       By default, :class:`xmlparser` is safe from these threats.
 
    Called for references to external entities. *base* is the current base, as set

From e9e1e625ee358a53f531acbb8835177e6dd4095d Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Wed, 29 Oct 2025 16:50:37 +0100
Subject: [PATCH 11/13] Fix RST syntax

---
 Doc/library/pyexpat.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst
index 78f40503873814..0aba72f878220a 100644
--- a/Doc/library/pyexpat.rst
+++ b/Doc/library/pyexpat.rst
@@ -620,7 +620,7 @@ otherwise stated.
       may create a vulnerabilitiy to
       `external entity attacks `_
       if :class:`xmlparser` is used with user-provided XML content.
-      Please reflect on your `threat model _`
+      Please reflect on your `threat model `_
       before implementing this handler.
       By default, :class:`xmlparser` is safe from these threats.
 

From 98da91ae2d34058605d5fa31ceaebc8d512036f0 Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Wed, 29 Oct 2025 18:27:11 +0100
Subject: [PATCH 12/13] Drop a sentence of arguable value

---
 Doc/library/pyexpat.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst
index 0aba72f878220a..2dd617dbf3b450 100644
--- a/Doc/library/pyexpat.rst
+++ b/Doc/library/pyexpat.rst
@@ -622,7 +622,6 @@ otherwise stated.
       if :class:`xmlparser` is used with user-provided XML content.
       Please reflect on your `threat model `_
       before implementing this handler.
-      By default, :class:`xmlparser` is safe from these threats.
 
    Called for references to external entities. *base* is the current base, as set
    by a previous call to :meth:`SetBase`. The public and system identifiers,

From 5fe73a1a3d39062f4e00064ce93acf07da96cd1e Mon Sep 17 00:00:00 2001
From: Sebastian Pipping
Date: Wed, 29 Oct 2025 18:30:14 +0100
Subject: [PATCH 13/13] Fix typo

---
 Doc/library/pyexpat.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst
index 2dd617dbf3b450..12fe9771259e6c 100644
--- a/Doc/library/pyexpat.rst
+++ b/Doc/library/pyexpat.rst
@@ -617,7 +617,7 @@ otherwise stated.
    .. warning::
 
       Implementing a handler that accesses local files and/or the network
-      may create a vulnerabilitiy to
+      may create a vulnerability to
       `external entity attacks `_
       if :class:`xmlparser` is used with user-provided XML content.
       Please reflect on your `threat model `_
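
Note on the version check: the patched ``xml.rst`` text asks readers to check ``pyexpat.EXPAT_VERSION``. The following is a minimal sketch of such a check, assuming the 2.7.2 threshold named in the patch. The helper ``expat_at_least`` and the constant ``MIN_EXPAT`` are hypothetical names introduced only for illustration; ``EXPAT_VERSION`` and ``version_info`` are the attributes exposed by ``xml.parsers.expat``.

```python
from xml.parsers import expat

# Threshold taken from the patched documentation text. This is an assumption
# of the sketch, not something the interpreter enforces.
MIN_EXPAT = (2, 7, 2)


def expat_at_least(minimum=MIN_EXPAT, version_info=None):
    """Return True if the Expat in use is at least `minimum`.

    `version_info` defaults to the (major, minor, micro) tuple reported by
    the running interpreter; passing it explicitly makes the check testable.
    """
    if version_info is None:
        version_info = expat.version_info
    return tuple(version_info) >= tuple(minimum)


if __name__ == "__main__":
    # EXPAT_VERSION is a string of the form "expat_X.Y.Z".
    print(expat.EXPAT_VERSION, "meets threshold:", expat_at_least())
```

Comparing the ``version_info`` tuple avoids parsing the ``expat_X.Y.Z`` string by hand. The result only reflects the Expat that the running interpreter was built or linked against.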