[3.14] gh-74902: Add Unicode Grapheme Cluster Break algorithm (GH-143076) #148247

Draft
StanFromIreland wants to merge 3 commits into python:3.14 from StanFromIreland:backport-bab1d7a-3.14

Conversation

StanFromIreland (Member) commented Apr 8, 2026

Add the unicodedata._iter_graphemes() function to iterate over grapheme clusters according to rules defined in Unicode Standard Annex #29.

Add unicodedata._grapheme_cluster_break(), unicodedata._indic_conjunct_break() and unicodedata._extended_pictographic() functions to get the properties of the character which are related to the above algorithm.

(cherry picked from commits bab1d7a, 85013d7 and 58ccf21)
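
To illustrate what "iterating over grapheme clusters" means, here is a minimal pure-Python sketch. It is not the UAX #29 algorithm and it does not call any function added by this PR. The helper name `approx_graphemes` is hypothetical, and it approximates only two behaviours: CR LF stays together, and nonspacing or enclosing marks and ZWJ attach to the preceding character. Hangul syllables, regional indicators, emoji ZWJ sequences and Indic conjuncts are deliberately left out.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def approx_graphemes(text):
    """Split text into rough, approximate grapheme clusters.

    Assumption: this is a simplified illustration, not UAX #29.
    """
    clusters = []
    for ch in text:
        prev = clusters[-1] if clusters else ""
        if prev == "\r" and ch == "\n":
            # Keep CR LF together as one cluster.
            clusters[-1] += ch
        elif (
            prev
            and unicodedata.category(prev[-1]) != "Cc"
            and (ch == ZWJ or unicodedata.category(ch) in ("Mn", "Me"))
        ):
            # Attach marks and ZWJ to the preceding non-control character.
            clusters[-1] += ch
        else:
            # Anything else starts a new cluster.
            clusters.append(ch)
    return clusters


text = "e\u0301a"  # "e" + COMBINING ACUTE ACCENT, then "a"
print(len(text))                    # 3 code points
print(len(approx_graphemes(text)))  # 2 approximate clusters
```

The point of the sketch is the gap between code points and user-perceived characters: `len()` counts the former, while a grapheme iterator groups them into the latter.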

serhiy-storchaka and others added 3 commits April 8, 2026 13:21
…ythonGH-143076)

Add the unicodedata.iter_graphemes() function to iterate over grapheme
clusters according to rules defined in Unicode Standard Annex #29.

Add unicodedata.grapheme_cluster_break(), unicodedata.indic_conjunct_break()
and unicodedata.extended_pictographic() functions to get the properties
of the character which are related to the above algorithm.
(cherry picked from commit bab1d7a)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Guillaume "Vermeille" Sanchez <guillaume.v.sanchez@gmail.com>

StanFromIreland (Member, Author) commented Apr 8, 2026

This is an alternative proposal for #148218.

This would also require updating pythontest.net with the test data file for UCD 16. Until that is done, the network test fails.

vstinner (Member) commented Apr 8, 2026

I don't think it's a good idea to backport so much code to a stable branch. I don't think fixing this issue in Python 3.14 is important enough to justify the backport. Python has had this bug for many years.

github-actions (Bot) commented May 9, 2026

This PR is stale because it has been open for 30 days with no activity.

github-actions (Bot) added the stale label May 9, 2026

Labels

DO-NOT-MERGE, skip news, stale

4 participants