|
| 1 | +PEP: 712 |
| 2 | +Title: Adding a "converter" parameter to dataclasses.field |
| 3 | +Author: Joshua Cannon <joshdcannon@gmail.com> |
| 4 | +Sponsor: Eric V. Smith <eric at trueblade.com> |
| 5 | +Status: Draft |
| 6 | +Type: Standards Track |
| 7 | +Content-Type: text/x-rst |
| 8 | +Created: 01-Jan-2023 |
| 9 | +Python-Version: 3.13 |
| 10 | +Post-History: `27-Dec-2022 <https://mail.python.org/archives/list/typing-sig@python.org/thread/NWZQIINJQZDOCZGO6TGCUP2PNW4PEKNY/>`__, |
| 11 | + `19-Jan-2023 <https://discuss.python.org/t/add-converter-to-dataclass-field/22956>`__, |
| 12 | + |
| 13 | +Abstract |
| 14 | +======== |
| 15 | + |
| 16 | +:pep:`557` added :mod:`dataclasses` to the Python stdlib. :pep:`681` added |
| 17 | +:func:`~py3.11:typing.dataclass_transform` to help type checkers understand |
| 18 | +several common dataclass-like libraries, such as attrs, Pydantic, and object |
| 19 | +relational mapper (ORM) packages such as SQLAlchemy and Django. |
| 20 | + |
| 21 | +A common feature these libraries provide over the standard library |
| 22 | +implementation is the ability for the library to convert arguments given at |
| 23 | +initialization time into the types expected for each field using a |
| 24 | +user-provided conversion function. |
| 25 | + |
| 26 | +Therefore, this PEP adds a ``converter`` parameter to :func:`dataclasses.field` |
| 27 | +(along with the requisite changes to :class:`dataclasses.Field` and |
| 28 | +:func:`~py3.11:typing.dataclass_transform`) to specify the function to use to |
| 29 | +convert the input value for each field to the representation to be stored in |
| 30 | +the dataclass. |
| 31 | + |
| 32 | +Motivation |
| 33 | +========== |
| 34 | + |
| 35 | +There is no existing, standard way for :mod:`dataclasses` or third-party |
| 36 | +dataclass-like libraries to support argument conversion in a type-checkable |
| 37 | +way. To work around this limitation, library authors/users are forced to choose |
| 38 | +to: |
| 39 | + |
| 40 | +* Opt in to a custom Mypy plugin. These plugins help Mypy understand the |
| 41 | + conversion semantics, but not other tools. |
| 42 | +* Shift conversion responsibility onto the caller of the dataclass |
| 43 | + constructor. This can make constructing certain dataclasses unnecessarily |
| 44 | + verbose and repetitive. |
| 45 | +* Provide a custom ``__init__`` which declares "wider" parameter types and |
| 46 | + converts them when setting the appropriate attribute. This not only duplicates |
| 47 | + the typing annotations between the converter and ``__init__``, but also opts |
| 48 | + the user out of many of the features :mod:`dataclasses` provides. |
| 49 | +* Provide a custom ``__init__`` but without meaningful type annotations |
| 50 | + for the parameter types requiring conversion. |
| 51 | + |
| 52 | +None of these choices are ideal. |
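As a rough illustration of the custom ``__init__`` workaround described above, the sketch below uses a hypothetical ``Config`` class (not taken from the PEP). The "wider" parameter types and the conversion calls are written by hand, and ``init=False`` gives up the synthesized ``__init__``.

```python
import dataclasses
import pathlib
from typing import Union


@dataclasses.dataclass(init=False)
class Config:
    # Hypothetical class, used only for illustration.
    root: pathlib.PurePosixPath
    retries: int

    # The accepted ("wider") types are repeated here by hand, and every
    # field that needs conversion gets its own line of boilerplate.
    def __init__(
        self,
        root: Union[str, pathlib.PurePosixPath],
        retries: Union[str, int] = 3,
    ) -> None:
        self.root = pathlib.PurePosixPath(root)
        self.retries = int(retries)


config = Config("assets/images", "5")
```

Here ``config.root`` is a ``PurePosixPath`` and ``config.retries`` is the integer ``5``, but the class author has to maintain the parameter list, defaults, and conversions manually.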
| 53 | + |
| 54 | +Rationale |
| 55 | +========= |
| 56 | + |
| 57 | +Argument conversion is useful enough that most dataclass-like libraries |
| 58 | +support it. Adding this feature to the standard library means more users |
| 59 | +are able to opt in to these benefits without requiring third-party |
| 60 | +libraries. Additionally, third-party libraries are able to clue |
| 61 | +type-checkers into their own conversion semantics through added support in |
| 62 | +:func:`~py3.11:typing.dataclass_transform`, meaning users of those libraries |
| 63 | +benefit as well. |
| 64 | + |
| 65 | +Specification |
| 66 | +============= |
| 67 | + |
| 68 | +New ``converter`` parameter |
| 69 | +--------------------------- |
| 70 | + |
| 71 | +This specification introduces a new parameter named ``converter`` to the |
| 72 | +:func:`dataclasses.field` function. When an ``__init__`` method is synthesized |
| 73 | +by ``dataclass``-like semantics, if an argument is provided for the field, the |
| 74 | +``dataclass`` object's attribute will be assigned the result of calling the |
| 75 | +converter on the provided argument. If no argument is given and the field was |
| 76 | +constructed with a default value, the ``dataclass`` object's attribute will be |
| 77 | +assigned the result of calling the converter on the provided default. |
| 78 | + |
| 79 | +Adding this parameter also implies the following changes: |
| 80 | + |
| 81 | +* A ``converter`` attribute will be added to :class:`dataclasses.Field`. |
| 82 | +* ``converter`` will be added to :func:`~py3.11:typing.dataclass_transform`'s |
| 83 | + list of supported field specifier parameters. |
| 84 | + |
| 85 | +Example |
| 86 | +''''''' |
| 87 | + |
| 88 | +.. code-block:: python |
| 89 | +
|
| 90 | + @dataclasses.dataclass |
| 91 | + class InventoryItem: |
| 92 | + # `converter` as a type |
| 93 | + id: int = dataclasses.field(converter=int) |
| 94 | + skus: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...]) |
| 95 | + # `converter` as a callable |
| 96 | + names: tuple[str, ...] = dataclasses.field( |
| 97 | + converter=lambda names: tuple(map(str.lower, names)) |
| 98 | + ) |
| 99 | +
|
| 100 | + # The default value is also converted; therefore the following is not a |
| 101 | + # type error. |
| 102 | + stock_image_path: pathlib.PurePosixPath = dataclasses.field( |
| 103 | + converter=pathlib.PurePosixPath, default="assets/unknown.png" |
| 104 | + ) |
| 105 | +
|
| 106 | + item1 = InventoryItem("1", [234, 765], ["PYTHON PLUSHIE", "FLUFFY SNAKE"]) |
| 107 | + # item1 would have the following values: |
| 108 | + # id=1 |
| 109 | + # skus=(234, 765) |
| 110 | + # names=('python plushie', 'fluffy snake') |
| 111 | + # stock_image_path=pathlib.PurePosixPath("assets/unknown.png") |
| 112 | +
|
| 113 | +
|
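Because released versions of :mod:`dataclasses` do not accept ``converter``, the following is only a sketch that approximates the ``__init__``-time behavior on current Python. It assumes a hypothetical ``"converter"`` key in the field ``metadata`` and a hypothetical ``apply_converters`` helper called from ``__post_init__``. It is not the reference implementation, and type checkers will not understand it.

```python
import dataclasses
import pathlib


def apply_converters(instance) -> None:
    """Replace each field's value with converter(value), if a converter is set.

    The "converter" metadata key is an assumption made for this sketch.
    """
    for f in dataclasses.fields(instance):
        converter = f.metadata.get("converter")
        if converter is not None:
            setattr(instance, f.name, converter(getattr(instance, f.name)))


@dataclasses.dataclass
class InventoryItem:
    id: int = dataclasses.field(metadata={"converter": int})
    names: tuple[str, ...] = dataclasses.field(
        metadata={"converter": lambda names: tuple(map(str.lower, names))}
    )
    # The default is a str; it is converted just like a caller-supplied value.
    stock_image_path: pathlib.PurePosixPath = dataclasses.field(
        default="assets/unknown.png",
        metadata={"converter": pathlib.PurePosixPath},
    )

    def __post_init__(self) -> None:
        apply_converters(self)


item = InventoryItem("1", ["PYTHON PLUSHIE", "FLUFFY SNAKE"])
```

After construction, ``item.id`` is the integer ``1``, ``item.names`` is lowercased, and ``item.stock_image_path`` is a ``PurePosixPath`` built from the string default.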
| 114 | +Impact on typing |
| 115 | +---------------- |
| 116 | + |
| 117 | +A ``converter`` must be a callable that accepts a single positional argument, and |
| 118 | +the parameter type corresponding to this positional argument provides the type |
| 119 | +of the synthesized ``__init__`` parameter associated with the field. |
| 120 | + |
| 121 | +In other words, the argument provided for the ``converter`` parameter must be |
| 122 | +compatible with ``Callable[[T], X]`` where ``T`` is the input type for |
| 123 | +the converter and ``X`` is the output type of the converter. |
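To make ``T`` and ``X`` concrete, the sketch below reads them from a hypothetical converter named ``to_port`` using runtime introspection. A type checker would derive the same information statically; the runtime lookup is only an illustration.

```python
import typing


def to_port(value: typing.Union[str, int]) -> int:
    """Hypothetical converter: accept a str or an int, return an int."""
    return int(value)


hints = typing.get_type_hints(to_port)

# T: the type the synthesized __init__ parameter would accept.
init_parameter_type = hints["value"]

# X: the type produced by the converter, which must be compatible with
# the field's declared type.
stored_type = hints["return"]
```

For a field declared as ``port: int`` with this converter, the ``__init__`` parameter would accept ``str`` or ``int``, while the stored attribute is always an ``int``.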
| 124 | + |
| 125 | +Type-checking the default value |
| 126 | +''''''''''''''''''''''''''''''' |
| 127 | + |
| 128 | +Because the ``default`` value is unconditionally converted using ``converter``, |
| 129 | +if arguments for both ``converter`` and ``default`` are provided to |
| 130 | +:func:`dataclasses.field`, the ``default`` argument's type should be checked |
| 131 | +using the type of the single argument to the ``converter`` callable. |
| 132 | + |
| 133 | +Converter return type |
| 134 | +''''''''''''''''''''' |
| 135 | + |
| 136 | +The return type of the callable must be a type that's compatible with the |
| 137 | +field's declared type. This includes the field's type exactly, but can also be |
| 138 | +a type that's more specialized (such as a converter returning a ``list[int]`` |
| 139 | +for a field annotated as ``list``, or a converter returning an ``int`` for a |
| 140 | +field annotated as ``int | str``). |
| 141 | + |
| 142 | +Example |
| 143 | +''''''' |
| 144 | + |
| 145 | +.. code-block:: python |
| 146 | +
|
| 147 | + @dataclasses.dataclass |
| 148 | + class Example: |
| 149 | + my_int: int = dataclasses.field(converter=int) |
| 150 | + my_tuple: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...]) |
| 151 | + my_cheese: Cheese = dataclasses.field(converter=make_cheese) |
| 152 | +
|
| 153 | + # Although the default value is of type `str` and the field is declared to |
| 154 | + # be of type `pathlib.Path`, this is not a type error because the default |
| 155 | + # value will be converted. |
| 156 | + tmpdir: pathlib.Path = dataclasses.field(default="/tmp", converter=pathlib.Path) |
| 157 | +
|
| 158 | +
|
| 159 | +
|
| 160 | +Backward Compatibility |
| 161 | +====================== |
| 162 | + |
| 163 | +These changes don't introduce any compatibility problems since they |
| 164 | +only introduce opt-in new features. |
| 165 | + |
| 166 | +Security Implications |
| 167 | +===================== |
| 168 | + |
| 169 | +There are no direct security concerns with these changes. |
| 170 | + |
| 171 | +How to Teach This |
| 172 | +================= |
| 173 | + |
| 174 | +Documentation and examples explaining the new parameter and behavior will be |
| 175 | +added to the relevant sections of the docs site (primarily on |
| 176 | +:mod:`dataclasses`) and linked from the *What's New* document. |
| 177 | + |
| 178 | +The added documentation/examples will also cover the "common pitfalls" that |
| 179 | +users of converters are likely to encounter. Such pitfalls include: |
| 180 | + |
| 181 | +* Needing to handle ``None``/sentinel values. |
| 182 | +* Needing to handle values that are already of the correct type. |
| 183 | +* Avoiding lambdas for converters, as the synthesized ``__init__`` |
| 184 | + parameter's type will become ``Any``. |
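The pitfalls above can be sketched with a hypothetical converter, ``to_optional_path``. It is a named, annotated function rather than a lambda, so its parameter type is available to type checkers. It passes ``None`` through and returns values that are already of the target type unchanged.

```python
import pathlib
import typing


def to_optional_path(
    value: typing.Union[str, pathlib.PurePosixPath, None],
) -> typing.Optional[pathlib.PurePosixPath]:
    """Hypothetical converter illustrating the common pitfalls."""
    # Pitfall 1: None (or another sentinel) must be handled explicitly.
    if value is None:
        return None
    # Pitfall 2: the value may already be of the correct type.
    if isinstance(value, pathlib.PurePosixPath):
        return value
    return pathlib.PurePosixPath(value)
```

Whether ``None`` should pass through or be replaced by some default is a design choice for each field; the pass-through shown here is only one option.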
| 185 | + |
| 186 | +Reference Implementation |
| 187 | +======================== |
| 188 | + |
| 189 | +The attrs library `already includes <attrs-converters_>`__ a ``converter`` |
| 190 | +parameter that provides these conversion semantics. |
| 191 | + |
| 192 | +CPython support is implemented on `a branch in the author's fork <cpython-branch_>`__. |
| 193 | + |
| 194 | +Rejected Ideas |
| 195 | +============== |
| 196 | + |
| 197 | +Just adding "converter" to ``typing.dataclass_transform``'s ``field_specifiers`` |
| 198 | +-------------------------------------------------------------------------------- |
| 199 | + |
| 200 | +The idea of isolating this addition to |
| 201 | +:func:`~py3.11:typing.dataclass_transform` was briefly |
| 202 | +`discussed on Typing-SIG <only-dataclass-transform_>`__ where it was suggested |
| 203 | +to broaden this to :mod:`dataclasses` more generally. |
| 204 | + |
| 205 | +Additionally, adding this to :mod:`dataclasses` ensures anyone can reap the |
| 206 | +benefits without requiring additional libraries. |
| 207 | + |
| 208 | +Not converting default values |
| 209 | +----------------------------- |
| 210 | + |
| 211 | +There are pros and cons to both converting and not converting default values. |
| 212 | +Leaving default values as-is allows type-checkers and dataclass authors to |
| 213 | +expect that the type of the default matches the type of the field. However, |
| 214 | +converting default values has two large advantages: |
| 215 | + |
| 216 | +1. Compatibility with attrs. Attrs unconditionally uses the converter to |
| 217 | + convert the default value. |
| 218 | + |
| 219 | +2. Simpler defaults. Allowing the default value to have the same type as |
| 220 | + user-provided values means dataclass authors get the same conveniences as |
| 221 | + their callers. |
| 222 | + |
| 223 | +Automatic conversion using the field's type |
| 224 | +------------------------------------------- |
| 225 | + |
| 226 | +One idea could be to allow the type of the field specified (e.g. ``str`` or |
| 227 | +``int``) to be used as a converter for each argument provided. |
| 228 | +`Pydantic's data conversion <pydantic-data-conversion_>`__ has semantics which |
| 229 | +appear to be similar to this approach. |
| 230 | + |
| 231 | +This works well for fairly simple types, but leads to ambiguity in expected |
| 232 | +behavior for complex types such as generics. For example, for ``tuple[int, ...]`` |
| 233 | +it is ambiguous whether the converter should simply convert an iterable to a |
| 234 | +tuple, or whether it should additionally convert each element to ``int``. |
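The two readings can be contrasted in a short sketch (the variable names are illustrative). On Python 3.9 and later, calling the generic alias ``tuple[int, ...]`` directly only builds the container and leaves the elements untouched.

```python
raw = ["1", "2", "3"]

# Reading 1: only the container is converted. Calling the generic alias
# behaves like calling tuple(); the elements remain strings.
container_only = tuple[int, ...](raw)   # ('1', '2', '3')

# Reading 2: each element is converted as well.
elementwise = tuple(int(item) for item in raw)   # (1, 2, 3)
```

An explicit ``converter`` lets the dataclass author state which of the two behaviors is intended.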
| 235 | + |
| 236 | +References |
| 237 | +========== |
| 238 | + |
| 239 | +.. _attrs-converters: https://www.attrs.org/en/21.2.0/examples.html#conversion |
| 240 | +.. _cpython-branch: https://github.com/thejcannon/cpython/tree/converter |
| 241 | +.. _only-dataclass-transform: https://mail.python.org/archives/list/typing-sig@python.org/thread/NWZQIINJQZDOCZGO6TGCUP2PNW4PEKNY/ |
| 242 | +.. _pydantic-data-conversion: https://docs.pydantic.dev/usage/models/#data-conversion |
| 243 | + |
| 244 | + |
| 245 | +Copyright |
| 246 | +========= |
| 247 | + |
| 248 | +This document is placed in the public domain or under the |
| 249 | +CC0-1.0-Universal license, whichever is more permissive. |