Commit 6e678e3

DOC: clarify migration guide for users of existing string dtypes (#63359)
1 parent 04a554c commit 6e678e3

doc/source/user_guide/migration-3-strings.rst

Lines changed: 27 additions & 3 deletions
@@ -20,6 +20,14 @@ enable it with:
 
 This allows you to test your code before the final 3.0 release.
 
+.. note::
+
+   This migration guide focuses on the changes and migration steps needed when
+   you are currently using ``object`` dtype for string data, which is used by
+   default in pandas < 3.0. If you are already using one of the opt-in string
+   dtypes, you can continue to do so without change.
+   See :ref:`string_migration_guide-for_existing_users` for more details.
+
 Background
 ----------
 

@@ -457,7 +465,23 @@ raise an error regardless of the number of strings:
    ...
    TypeError: Cannot perform reduction 'prod' with string dtype
 
-.. For existing users of the nullable ``StringDtype``
-.. --------------------------------------------------
 
-.. TODO
+.. _string_migration_guide-for_existing_users:
+
+For existing users of the nullable ``StringDtype``
+--------------------------------------------------
+
+While pandas 3.0 introduces a new *default* string data type, pandas has had an
+opt-in nullable string data type since pandas 1.0, which can be specified using
+``dtype="string"``. This nullable string dtype uses ``pd.NA`` as the missing
+value indicator. In addition, a dedicated string dtype has also been available
+through :class:`ArrowDtype` (for example by using ``dtype_backend="pyarrow"``)
+since pandas 1.5.
+
+If you are already using one of the nullable string dtypes, for example by
+specifying ``dtype="string"``, by using :meth:`~DataFrame.convert_dtypes`, or
+by specifying the ``dtype_backend`` argument in IO functions, you can continue
+to do so without change.
+
+The migration guide above applies to code that currently (pandas < 3.0) uses
+``object`` dtype for string data.
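
As a rough illustration of what "already using one of the nullable string dtypes" means in the section added above, the sketch below opts in through ``dtype="string"`` and through ``convert_dtypes()``. The variable names are illustrative and not part of the commit, and the pyarrow-backed variants (``ArrowDtype``, ``dtype_backend="pyarrow"``) are left out on the assumption that the optional pyarrow dependency may not be installed.

```python
import pandas as pd

# Opt in to the nullable string dtype explicitly (available since pandas 1.0).
ser = pd.Series(["a", "b", None], dtype="string")

# The nullable string dtype represents missing values with pd.NA.
print(ser.dtype)
print(ser[2] is pd.NA)

# convert_dtypes() is another way to opt in: an object-dtype column that
# holds strings is converted to the nullable string dtype.
df = pd.DataFrame({"col": ["x", "y", None]}, dtype=object)
converted = df.convert_dtypes()
print(converted["col"].dtype)
```

Code written this way is the case the new section describes as continuing to work without change; the rest of the migration guide targets string data held in ``object`` dtype.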
