Proposed API: add PyDict_FromItems() #55

vstinner · 2025-12-04T16:08:12Z

No description provided.

encukou · 2025-12-05T09:49:06Z

Proposed_API/pydict_fromitems.rst

+INADA-san wrote that most users either overestimate its effectiveness or don't
+fully understand how it operates.


Doesn't this apply to some other proposals as well?
How do the other functions behave if input contains duplicate keys?

vstinner · 2025-12-05

> Doesn't this apply to some other proposals as well?

I don't know, I don't want to speak for @methane.

> How do the other functions behave if input contains duplicate keys?

So far, all proposed functions allocate N items even if there are duplicate keys.
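
To make the duplicate-key case concrete, here is a minimal Python-level sketch. It uses the built-in `dict()` constructor, not any of the proposed C functions, so it only illustrates why an input of N pairs can yield fewer than N entries. With duplicates, space reserved for N items would be larger than what the final dict needs.

```python
# Three input pairs (N = 3), but only two distinct keys.
items = [("a", 1), ("b", 2), ("a", 3)]

# The built-in constructor keeps one entry per key; the last value wins.
d = dict(items)

print(f"input pairs: {len(items)}, dict entries: {len(d)}, content: {d}")
```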

methane · 2025-12-08

I wrote up some points that many people using the private API don't know:
python/cpython#139772 (comment)

Until Python 3.5, resizing a dict was done by reinserting all the elements into a new hash table, which was slow.
Since Python 3.6, the hash table and the entry array are separate. Rebuilding the hash table is fast, and the entry array is copied with memcpy.
People who measured the effect of _PyDict_NewPresized() with microbenchmarks before Python 3.6 overestimate its effect today.
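
The resizes that presizing avoids can be observed from Python. The sketch below is only an illustration and assumes CPython, where `sys.getsizeof()` reports a larger size after a dict grows. It shows when resizes happen while inserting keys one by one, not how much each resize costs. The helper name `resize_points` is made up for this example.

```python
import sys


def resize_points(n):
    """Return (item_count, size_in_bytes) each time the reported dict size changes."""
    d = {}
    last = sys.getsizeof(d)
    points = []
    for i in range(n):
        d[i] = None
        size = sys.getsizeof(d)
        if size != last:
            # The insert that just completed made the dict grow.
            points.append((len(d), size))
            last = size
    return points


points = resize_points(1000)
for count, size in points:
    print(f"size changed after reaching {count} items: {size} bytes")
```

A dict created with room for all 1,000 items up front would skip these intermediate steps, which is the saving the benchmarks in this thread try to measure.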

methane · 2025-12-08

I ported @vstinner's benchmark to Python 3.5.

https://github.com/methane/notes/tree/main/2025/dictnew

Python 3.5

| Benchmark      | dict_new | dict_presized         |
|----------------|----------|-----------------------|
| dict-1         | 454 ns   | 398 ns: 1.14x faster  |
| dict-5         | 2.07 us  | 1.73 us: 1.20x faster |
| dict-10        | 3.96 us  | 3.28 us: 1.21x faster |
| dict-25        | 9.25 us  | 7.56 us: 1.22x faster |
| dict-100       | 31.4 us  | 25.6 us: 1.22x faster |
| dict-500       | 113 us   | 103 us: 1.11x faster  |
| dict-1,000     | 218 us   | 200 us: 1.09x faster  |
| Geometric mean | (ref)    | 1.17x faster          |

Python 3.12

| Benchmark      | dict_new | dict_presized         |
|----------------|----------|-----------------------|
| dict-1         | 378 ns   | 337 ns: 1.12x faster  |
| dict-5         | 1.49 us  | 1.34 us: 1.11x faster |
| dict-10        | 2.65 us  | 2.32 us: 1.14x faster |
| dict-25        | 5.84 us  | 5.12 us: 1.14x faster |
| dict-100       | 22.0 us  | 18.0 us: 1.22x faster |
| dict-500       | 98.6 us  | 87.6 us: 1.13x faster |
| dict-1,000     | 194 us   | 172 us: 1.13x faster  |
| Geometric mean | (ref)    | 1.14x faster          |

PyDict_New() in Python 3.12 is faster than _PyDict_NewPresized() in Python 3.5.

The speedup of _PyDict_NewPresized() over PyDict_New() has become a little smaller, but it is still significant.
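
Both statements can be checked against the published numbers. The sketch below copies the timings from the two tables (converted to nanoseconds), recomputes the geometric mean of the per-benchmark speedups, and compares PyDict_New() on 3.12 with _PyDict_NewPresized() on 3.5 row by row. It works from the rounded values shown here, so the result is only expected to match the reported means to two decimals.

```python
from statistics import geometric_mean

# size -> (dict_new, dict_presized) timings in nanoseconds, copied from the tables.
py35 = {
    1: (454, 398), 5: (2070, 1730), 10: (3960, 3280), 25: (9250, 7560),
    100: (31400, 25600), 500: (113000, 103000), 1000: (218000, 200000),
}
py312 = {
    1: (378, 337), 5: (1490, 1340), 10: (2650, 2320), 25: (5840, 5120),
    100: (22000, 18000), 500: (98600, 87600), 1000: (194000, 172000),
}


def mean_speedup(results):
    """Geometric mean of dict_new / dict_presized over all benchmark sizes."""
    return geometric_mean(new / presized for new, presized in results.values())


gm35 = mean_speedup(py35)
gm312 = mean_speedup(py312)

# Is plain PyDict_New() on 3.12 faster than the presized variant on 3.5 for every size?
new312_beats_presized35 = all(py312[size][0] < py35[size][1] for size in py35)

print(f"Python 3.5 mean speedup:  {gm35:.2f}x")
print(f"Python 3.12 mean speedup: {gm312:.2f}x")
print(f"3.12 PyDict_New() beats 3.5 presized everywhere: {new312_beats_presized35}")
```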

encukou · 2025-12-05T10:08:36Z

Proposed_API/pydict_fromitems.rst

+Such function lacks an *override* argument to decide how to deal with
+overridden keys on updating an existing dictionary.


So if we add an override argument, there's no downside?

vstinner · 2025-12-05

Well, I prefer PyDict_FromItems() to create a dictionary :-) PyDict_FromItems() has fewer parameters and so is simpler.
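
For readers unfamiliar with what an *override* argument would control, here is a rough Python sketch. The function `update_from_items` is hypothetical and exists only for this illustration. Its flag is modeled on the `override` parameter of the existing PyDict_Merge() and PyDict_MergeFromSeq2() C functions, where a true value lets new pairs replace existing keys and a false value keeps the entry already present.

```python
def update_from_items(target, items, override):
    """Hypothetical helper: insert (key, value) pairs into an existing dict.

    override=True:  a pair replaces any existing entry with the same key.
    override=False: a pair is only added if the key is not already present.
    """
    for key, value in items:
        if override or key not in target:
            target[key] = value


pairs = [("a", 10), ("b", 20)]

replaced = {"a": 1}
update_from_items(replaced, pairs, override=True)

kept = {"a": 1}
update_from_items(kept, pairs, override=False)

print(replaced, kept)
```

A function that always builds a fresh dictionary does not need this choice for pre-existing keys, which is part of why it can take fewer parameters.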

vstinner · 2025-12-17T16:27:12Z

I plan to merge this change next Friday, unless someone prefers to iterate on this PR.

I completed the document with "Unicode issue" and "False header problem" sections. I also added the PyDict_SetAssumptions() API.

Proposed API: add PyDict_FromItems()

54f9911

vstinner mentioned this pull request Dec 4, 2025

Add PyDict_FromItems() function capi-workgroup/decisions#90

Open

encukou reviewed Dec 5, 2025

vstinner added 2 commits December 17, 2025 16:51

Add PyDict_SetAssumptions()

f05ac68

Add Unicode issue and False size header problem

7d9e04f
