Commit a9cb879

PEP 769: Add a 'default' keyword argument to 'attrgetter' and 'itemgetter' (#4179)
* Add PEP about default in attrgetter and itemgetter.
* Wrapped lines.
* Fixed date format and a title underlining.
* Fixed date format and a title underlining.
* Changed PEP number.
* Changes as per review.
* Fixed internal links.

---------

Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
1 parent 7df085f commit a9cb879

2 files changed

+364
-0
lines changed

.github/CODEOWNERS

Lines changed: 1 addition & 0 deletions
@@ -646,6 +646,7 @@ peps/pep-0765.rst @iritkatriel @ncoghlan
 peps/pep-0766.rst @warsaw
 peps/pep-0767.rst @carljm
 peps/pep-0768.rst @pablogsal
+peps/pep-0769.rst @facundobatista
 peps/pep-0770.rst @sethmlarson @brettcannon
 # ...
 peps/pep-0777.rst @warsaw

peps/pep-0769.rst

Lines changed: 363 additions & 0 deletions
@@ -0,0 +1,363 @@
1+
PEP: 769
2+
Title: Add a 'default' keyword argument to 'attrgetter' and 'itemgetter'
3+
Author: Facundo Batista <facundo@taniquetil.com.ar>
4+
Status: Draft
5+
Type: Standards Track
6+
Created: 22-Dec-2024
7+
Python-Version: 3.14
8+
9+
10+
Abstract
11+
========
12+
13+
This proposal aims to enhance the ``operator`` module by adding a
14+
``default`` keyword argument to the ``attrgetter`` and ``itemgetter``
15+
functions. This addition would allow these functions to return a
16+
specified default value when the targeted attribute or item is missing,
17+
thereby preventing exceptions and simplifying code that handles optional
18+
attributes or items.
19+
20+
21+
Motivation
22+
==========
23+
24+
Currently, ``attrgetter`` and ``itemgetter`` raise exceptions if the
25+
specified attribute or item is absent. This limitation requires
26+
developers to implement additional error handling, leading to more
27+
complex and less readable code.
28+
29+
Introducing a ``default`` parameter would streamline operations involving
30+
optional attributes or items, reducing boilerplate code and enhancing
31+
code clarity.
32+
33+
34+
Rationale
35+
=========
36+
37+
The primary design decision is to introduce a single ``default`` parameter
38+
applicable to all specified attributes or items.
39+
40+
This approach maintains simplicity and avoids the complexity of assigning
41+
individual default values to multiple attributes or items. While some
42+
discussions considered allowing multiple defaults, the increased
43+
complexity and potential for confusion led to favoring a single default
44+
value for all cases (more about this below in `Rejected Ideas
45+
<PEP 769 Rejected Ideas_>`__).
46+
47+
48+
Specification
49+
=============
50+
51+
Proposed behaviours:
52+
53+
- **attrgetter**: ``f = attrgetter("name", default=XYZ)`` followed by
54+
``f(obj)`` would return ``obj.name`` if the attribute exists, else
55+
``XYZ``.
56+
57+
- **itemgetter**: ``f = itemgetter(2, default=XYZ)`` followed by
58+
``f(obj)`` would return ``obj[2]`` if that is valid, else ``XYZ``.
59+
60+
This enhancement applies to single and multiple attribute/item
61+
retrievals, with the default value returned for any missing attribute or
62+
item.
63+
64+
No functionality change is incorporated if ``default`` is not used.
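
The behaviour described above can be approximated in pure Python today. The
following is only a sketch: ``attrgetter_with_default``,
``itemgetter_with_default`` and the private helper are hypothetical names
used for illustration, not part of the ``operator`` module, and the set of
exceptions caught follows the simplest option discussed later in this PEP.

```python
from operator import attrgetter, itemgetter


def _getter_with_default(factory, keys, default, exceptions):
    # Hypothetical helper: build one stdlib getter per key so that each
    # lookup can fail (and fall back to the default) independently.
    if not keys:
        raise TypeError("expected at least one key")
    getters = [factory(key) for key in keys]

    def call(obj):
        values = []
        for getter in getters:
            try:
                values.append(getter(obj))
            except exceptions:
                values.append(default)
        # Mirror the stdlib: one key gives a bare value, several give a tuple.
        return values[0] if len(values) == 1 else tuple(values)

    return call


def attrgetter_with_default(*names, default):
    # Dotted names work too, because the lookup is delegated to attrgetter.
    return _getter_with_default(attrgetter, names, default, AttributeError)


def itemgetter_with_default(*items, default):
    return _getter_with_default(
        itemgetter, items, default, (TypeError, IndexError, KeyError)
    )
```

With these helpers, ``itemgetter_with_default(1, 5, default="XYZ")`` applied
to a three-item list returns the second item and ``"XYZ"`` as a tuple, which
matches the examples in the following subsections.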
65+
66+
67+
Examples for attrgetter
68+
-----------------------
69+
70+
Current behaviour, no changes introduced::
71+
72+
    >>> class C:
    ...     class D:
    ...         class X:
    ...             pass
    ...     class E:
    ...         pass
    ...
    >>> attrgetter("D")(C)
    <class '__main__.C.D'>
    >>> attrgetter("badname")(C)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: type object 'C' has no attribute 'badname'
    >>> attrgetter("D", "E")(C)
    (<class '__main__.C.D'>, <class '__main__.C.E'>)
    >>> attrgetter("D", "badname")(C)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: type object 'C' has no attribute 'badname'
    >>> attrgetter("D.X")(C)
    <class '__main__.C.D.X'>
    >>> attrgetter("D.badname")(C)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: type object 'D' has no attribute 'badname'
97+
98+
Using ``default``::
99+
100+
>>> attrgetter("D", default="noclass")(C)
101+
<class '__main__.C.D'>
102+
>>> attrgetter("badname", default="noclass")(C)
103+
'noclass'
104+
>>> attrgetter("D", "E", default="noclass")(C)
105+
(<class '__main__.C.D'>, <class '__main__.C.E'>)
106+
>>> attrgetter("D", "badname", default="noclass")(C)
107+
(<class '__main__.C.D'>, 'noclass')
108+
>>> attrgetter("D.X", default="noclass")(C)
109+
<class '__main__.C.D.X'>
110+
>>> attrgetter("D.badname", default="noclass")(C)
111+
'noclass'
112+
113+
114+
Examples for itemgetter
115+
-----------------------
116+
117+
Current behaviour, no changes introduced::
118+
119+
>>> obj = ["foo", "bar", "baz"]
120+
>>> itemgetter(1)(obj)
121+
'bar'
122+
>>> itemgetter(5)(obj)
123+
Traceback (most recent call last):
124+
File "<stdin>", line 1, in <module>
125+
IndexError: list index out of range
126+
>>> itemgetter(1, 0)(obj)
127+
('bar', 'foo')
128+
>>> itemgetter(1, 5)(obj)
129+
Traceback (most recent call last):
130+
File "<stdin>", line 1, in <module>
131+
IndexError: list index out of range
132+
133+
134+
Using ``default``::
135+
136+
>>> itemgetter(1, default="XYZ")(obj)
137+
'bar'
138+
>>> itemgetter(5, default="XYZ")(obj)
139+
'XYZ'
140+
>>> itemgetter(1, 0, default="XYZ")(obj)
141+
('bar', 'foo')
142+
>>> itemgetter(1, 5, default="XYZ")(obj)
143+
('bar', 'XYZ')
144+
145+
146+
.. _PEP 769 About Possible Implementations:
147+
148+
About Possible Implementations
149+
------------------------------
150+
151+
For ``attrgetter`` the implementation is quite direct: it implies using
152+
``getattr`` catching a possible ``AttributeError``. So
153+
``attrgetter("name", default=XYZ)(obj)`` would be like::
154+
155+
    try:
        value = getattr(obj, "name")
    except AttributeError:
        value = XYZ
159+
160+
Note that we cannot rely on using ``getattr`` with a default value, as it would
161+
be impossible to distinguish what it returned on each step when an
162+
attribute chain is specified (e.g.
163+
``attrgetter("foo.bar.baz", default=XYZ)``).
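
To illustrate the ambiguity, here is a small sketch that compares passing the
default to ``getattr`` at every step with wrapping the whole chain in a single
``try``/``except``. The ``Config`` class and both helper functions are
hypothetical and exist only for this example.

```python
class Config:
    """Hypothetical object that has no ``title`` attribute."""


DEFAULT = ""


def chained_getattr_default(obj, dotted_name, default):
    # Naive approach: hand the default to getattr at every step. Once a step
    # falls back, the next lookup is performed on the default value itself.
    for name in dotted_name.split("."):
        obj = getattr(obj, name, default)
    return obj


def chained_try_except(obj, dotted_name, default):
    # Any missing attribute along the chain yields the default.
    try:
        for name in dotted_name.split("."):
            obj = getattr(obj, name)
    except AttributeError:
        return default
    return obj


# "title" is missing, so the naive version looks up "upper" on the default
# string and returns a bound method instead of the default.
naive = chained_getattr_default(Config(), "title.upper", DEFAULT)
safe = chained_try_except(Config(), "title.upper", DEFAULT)
```

Here ``naive`` ends up being the bound method ``"".upper`` while ``safe`` is
the default itself, which is the result the specification asks for.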
164+
165+
For the case of ``itemgetter`` it's not that easy. The most
166+
straightforward way is similar to above, also simple to define and
167+
understand: attempting ``__getitem__`` and catching a possible exception
168+
(any of the three indicated in the ``__getitem__`` reference). This way,
169+
``itemgetter(123, default=XYZ)(obj)`` would be equivalent to::
170+
171+
    try:
        value = obj[123]
    except (TypeError, IndexError, KeyError):
        value = XYZ
175+
176+
However, this would not be as efficient as we would want for particular cases,
177+
e.g. using dictionaries where particularly good performance is desired. A
178+
more complex alternative would be::
179+
180+
    if isinstance(obj, dict):
        value = obj.get(123, XYZ)
    else:
        try:
            value = obj[123]
        except (TypeError, IndexError, KeyError):
            value = XYZ
187+
188+
This gives better performance, but is more complicated to implement and explain. This is
189+
the first case in the `Open Issues <PEP 769 Open Issues_>`__ section later.
190+
191+
192+
Corner Cases
193+
------------
194+
195+
Providing a ``default`` option would only work when accessing the
196+
item/attribute would fail in a regular situation. In other words, the
197+
object accessed should not handle defaults itself.
198+
199+
For example, the following would be redundant/confusing because
200+
``defaultdict`` will never error out when accessing the item::
201+
202+
>>> from collections import defaultdict
203+
>>> from operator import itemgetter
204+
>>> dd = defaultdict(int)
205+
>>> itemgetter("foo", default=-1)(dd)
206+
0
207+
208+
The same applies to any user-built object that overloads ``__getitem__``
209+
or ``__getattr__`` implementing fallbacks.
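
The result for such objects may also depend on which implementation strategy
is chosen. The sketch below compares the two strategies from the previous
subsection on a ``defaultdict``; the two helper functions are hypothetical
names used only for illustration. It relies on the fact that ``dict.get``
does not call ``__missing__``, whereas ``__getitem__`` does.

```python
from collections import defaultdict


def get_try_except(obj, key, default):
    # Simple strategy: attempt the lookup and catch the failure.
    try:
        return obj[key]
    except (TypeError, IndexError, KeyError):
        return default


def get_dict_fast_path(obj, key, default):
    # Alternative strategy: use dict.get for dictionaries and dict subclasses.
    if isinstance(obj, dict):
        return obj.get(key, default)
    return get_try_except(obj, key, default)


# Separate instances are used because a successful lookup on a defaultdict
# stores the key as a side effect.
plain = get_try_except(defaultdict(int), "foo", -1)
fast = get_dict_fast_path(defaultdict(int), "foo", -1)
```

With the ``try``/``except`` strategy the ``defaultdict`` factory wins and
``plain`` is ``0``; with the ``dict.get`` fast path the factory is bypassed
and ``fast`` is ``-1``. Whichever strategy is adopted would need to state
which of the two results is intended.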
210+
211+
212+
.. _PEP 769 Rejected Ideas:
213+
214+
Rejected Ideas
215+
==============
216+
217+
Multiple Default Values
218+
-----------------------
219+
220+
The idea of allowing multiple default values for multiple attributes or
221+
items was considered.
222+
223+
Two alternatives were discussed, using an iterable that must have the
224+
same number of items as parameters given to
225+
``attrgetter``/``itemgetter``, or using a dictionary with keys matching
226+
those names passed to ``attrgetter``/``itemgetter``.
227+
228+
The really complex thing to solve in these cases, which would make the
229+
feature hard to explain and with confusing corners, is what would happen
230+
if an iterable or dictionary is the *unique* default desired for all
231+
items. For example::
232+
233+
>>> itemgetter("a", default=(1, 2))({})
234+
(1, 2)
235+
>>> itemgetter("a", "b", default=(1, 2))({})
236+
((1, 2), (1, 2))
237+
238+
If we allow "multiple default values" using ``default``, the first case
239+
in the example above would raise an exception because there are more items in the
240+
default than names, and the second case would return ``(1, 2)``. This is
241+
why the possibility emerged of using a different name for multiple
242+
defaults (``defaults``, which is expressive but maybe error prone because
243+
too similar to ``default``).
244+
245+
As part of this conversation there was another proposal that would enable
246+
multiple defaults, which is allowing combinations of ``attrgetter`` and
247+
``itemgetter``, e.g.::
248+
249+
>>> ig_a = itemgetter("a", default=1)
250+
>>> ig_b = itemgetter("b", default=2)
251+
>>> ig_combined = itemgetter(ig_a, ig_b)
252+
>>> ig_combined({"a": 999})
253+
(999, 2)
254+
>>> ig_combined({})
255+
(1, 2)
256+
257+
However, combining ``itemgetter`` or ``attrgetter`` is a totally new
258+
behaviour that is very complex to define; it is not impossible, but it is beyond the scope of
259+
this PEP.
260+
261+
In the end, having multiple default values was deemed overly complex and
262+
potentially confusing, and a single ``default`` parameter was favored for
263+
simplicity and predictability.
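
Code that does need a different default per key can still compose single-key
getters by hand. The following is a minimal sketch of that workaround, not
part of this proposal; ``item_or_default`` and ``combine`` are hypothetical
helper names.

```python
def item_or_default(key, default):
    # Hypothetical single-key getter with its own default.
    def getter(obj):
        try:
            return obj[key]
        except (TypeError, IndexError, KeyError):
            return default

    return getter


def combine(*getters):
    # Apply every getter to the same object and collect the results.
    def call(obj):
        return tuple(getter(obj) for getter in getters)

    return call


get_a_b = combine(item_or_default("a", 1), item_or_default("b", 2))
```

Calling ``get_a_b({"a": 999})`` gives ``(999, 2)`` and ``get_a_b({})`` gives
``(1, 2)``, the same results as the combined getter example above.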
264+
265+
266+
Tuple Return Consistency
267+
------------------------
268+
269+
Another rejected proposal was adding a flag to always return a tuple
270+
regardless of how many keys/names/indices were given as arguments.
271+
E.g.::
272+
273+
>>> letters = ["a", "b", "c"]
274+
>>> itemgetter(1, return_tuple=True)(letters)
275+
('b',)
276+
>>> itemgetter(1, 2, return_tuple=True)(letters)
277+
('b', 'c')
278+
279+
This would be of some help for multiple default values consistency,
280+
but requires further discussion and is certainly out of the scope of this
281+
PEP.
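
For reference, the effect of such a flag can be obtained today with a thin
wrapper around ``itemgetter``. This is a sketch only; ``itemgetter_tuple`` is
a hypothetical name, and ``return_tuple`` itself is not an existing
parameter.

```python
from operator import itemgetter


def itemgetter_tuple(*items):
    # itemgetter returns a bare value for one item and a tuple for several;
    # wrap the single-item case so the result is always a tuple.
    getter = itemgetter(*items)
    if len(items) == 1:
        return lambda obj: (getter(obj),)
    return getter


letters = ["a", "b", "c"]
single = itemgetter_tuple(1)(letters)
several = itemgetter_tuple(1, 2)(letters)
```

Here ``single`` is ``('b',)`` and ``several`` is ``('b', 'c')``.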
282+
283+
284+
.. _PEP 769 Open Issues:
285+
286+
Open Issues
287+
===========
288+
289+
Behaviour Equivalence for ``itemgetter``
290+
----------------------------------------
291+
292+
We need to define how ``itemgetter`` would behave: whether to just attempt to
access the item and capture exceptions regardless of the object's type, or to
294+
validate first if the object provides a ``get`` method and use it to
295+
retrieve the item with a default. See examples in the `About Possible
296+
Implementations <PEP 769 About Possible Implementations_>`__ subsection
297+
above.
298+
299+
This would help performance for the case of dictionaries, but would make
300+
the ``default`` feature somewhat more difficult to explain, and a little
301+
confusing if some object that is not a dictionary but provides a ``get``
302+
method is used. Alternatively, we could call ``.get`` *only* if the
303+
object is an instance of ``dict``.
304+
305+
In any case, a desirable situation is that we do *not* affect performance
306+
at all if the ``default`` is not triggered. Checking for ``.get`` would
307+
get the default faster in case of dicts, but implies doing a verification
308+
in all cases. Using the try/except model would make it not as fast as it
309+
could be in the case of dictionaries, but would not introduce delays if the
310+
default is not triggered.
311+
312+
313+
Add a Default to ``getitem``
314+
----------------------------
315+
316+
It was proposed that we could also enhance ``getitem``, as part of
317+
this PEP, adding ``default`` also to it.
318+
319+
This will not only improve ``getitem`` itself, but we would also gain
320+
internal consistency in the ``operator`` module and in comparison with
321+
the ``getattr`` builtin function that also has a default.
322+
323+
The definition could be as simple as the try/except proposed above, so
324+
doing ``getitem(obj, name, default)`` would be equivalent to::
325+
326+
    try:
        result = obj[name]
    except (TypeError, IndexError, KeyError):
        result = default
330+
331+
(However, see the previous open issue about the special case for dictionaries.)
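
A pure-Python sketch of this idea could look as follows. The function name
``getitem_with_default`` and the ``_MISSING`` sentinel are assumptions made
for this example; the sentinel is one possible way to keep the current
behaviour when no default is given while still allowing ``None`` as a
legitimate default.

```python
_MISSING = object()  # hypothetical sentinel meaning "no default supplied"


def getitem_with_default(obj, name, default=_MISSING):
    try:
        return obj[name]
    except (TypeError, IndexError, KeyError):
        if default is _MISSING:
            # No default given: behave like today and propagate the error.
            raise
        return default
```

With this sketch, ``getitem_with_default([1, 2, 3], 5, None)`` returns
``None``, while ``getitem_with_default([1, 2, 3], 5)`` still raises
``IndexError``.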
332+
333+
334+
How to Teach This
335+
=================
336+
337+
As the basic behaviour is not modified, this new ``default`` can be
338+
avoided when teaching ``attrgetter`` and ``itemgetter`` for the first
339+
time, and can be introduced only when the need for this functionality arises.
340+
341+
342+
Backwards Compatibility
343+
=======================
344+
345+
The proposed changes are backward-compatible. The ``default`` parameter
346+
is optional; existing code without this parameter will function as
347+
before. Only code that explicitly uses the new ``default`` parameter will
348+
exhibit the new behavior, ensuring no disruption to current
349+
implementations.
350+
351+
352+
Security Implications
353+
=====================
354+
355+
Introducing a ``default`` parameter does not inherently introduce
356+
security vulnerabilities.
357+
358+
359+
Copyright
360+
=========
361+
362+
This document is placed in the public domain or under the
363+
CC0-1.0-Universal license, whichever is more permissive.
