Commit 614420d

gh-85679: Recommend encoding="utf-8" in tutorial (GH-91778)
1 parent d414f7e commit 614420d

Doc/tutorial/inputoutput.rst

Lines changed: 18 additions & 10 deletions
@@ -279,11 +279,12 @@ Reading and Writing Files
    object: file

 :func:`open` returns a :term:`file object`, and is most commonly used with
-two arguments: ``open(filename, mode)``.
+two positional arguments and one keyword argument:
+``open(filename, mode, encoding=None)``

 ::

-   >>> f = open('workfile', 'w')
+   >>> f = open('workfile', 'w', encoding="utf-8")

 .. XXX str(f) is <io.TextIOWrapper object at 0x82e8dc4>

@@ -300,11 +301,14 @@ writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
 omitted.

 Normally, files are opened in :dfn:`text mode`, that means, you read and write
-strings from and to the file, which are encoded in a specific encoding. If
-encoding is not specified, the default is platform dependent (see
-:func:`open`). ``'b'`` appended to the mode opens the file in
-:dfn:`binary mode`: now the data is read and written in the form of bytes
-objects. This mode should be used for all files that don't contain text.
+strings from and to the file, which are encoded in a specific *encoding*.
+If *encoding* is not specified, the default is platform dependent
+(see :func:`open`).
+Because UTF-8 is the modern de-facto standard, ``encoding="utf-8"`` is
+recommended unless you know that you need to use a different encoding.
+Appending a ``'b'`` to the mode opens the file in :dfn:`binary mode`.
+Binary mode data is read and written as :class:`bytes` objects.
+You can not specify *encoding* when opening file in binary mode.

 In text mode, the default when reading is to convert platform-specific line
 endings (``\n`` on Unix, ``\r\n`` on Windows) to just ``\n``. When writing in
@@ -320,7 +324,7 @@ after its suite finishes, even if an exception is raised at some
 point. Using :keyword:`!with` is also much shorter than writing
 equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

-    >>> with open('workfile') as f:
+    >>> with open('workfile', encoding="utf-8") as f:
     ...     read_data = f.read()

     >>> # We can check that the file has been automatically closed.
@@ -490,11 +494,15 @@ simply serializes the object to a :term:`text file`. So if ``f`` is a

     json.dump(x, f)

-To decode the object again, if ``f`` is a :term:`text file` object which has
-been opened for reading::
+To decode the object again, if ``f`` is a :term:`binary file` or
+:term:`text file` object which has been opened for reading::

     x = json.load(f)

+.. note::
+   JSON files must be encoded in UTF-8. Use ``encoding="utf-8"`` when opening
+   JSON file as a :term:`text file` for both of reading and writing.
+
 This simple serialization technique can handle lists and dictionaries, but
 serializing arbitrary class instances in JSON requires a bit of extra effort.
 The reference for the :mod:`json` module contains an explanation of this.
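
The behaviour recommended by this change can be tried out with a short script. The following is only a minimal sketch and is not part of the commit: the temporary directory, the file names ``workfile`` and ``data.json``, and the sample strings are assumptions chosen for illustration. It passes ``encoding="utf-8"`` explicitly in text mode, shows that binary mode yields ``bytes`` and rejects an *encoding* argument, and loads the same JSON file through both a text file and a binary file.

```python
import json
import os
import tempfile

# Hypothetical scratch directory and file names, used only for illustration.
with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "workfile")
    json_path = os.path.join(tmp, "data.json")

    # Text mode: pass encoding="utf-8" explicitly so the result does not
    # depend on the platform's default encoding.
    with open(text_path, "w", encoding="utf-8") as f:
        f.write("naïve café")
    with open(text_path, encoding="utf-8") as f:
        read_data = f.read()

    # Binary mode: data is read as bytes, with no decoding applied.
    with open(text_path, "rb") as f:
        raw = f.read()

    # Binary mode does not accept an encoding argument; open() raises ValueError.
    binary_error = None
    try:
        open(text_path, "rb", encoding="utf-8")
    except ValueError as exc:
        binary_error = str(exc)

    # JSON: write as UTF-8 text, then load from a text file and a binary file.
    x = {"name": "Zoë", "values": [1, 2, 3]}
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(x, f, ensure_ascii=False)
    with open(json_path, encoding="utf-8") as f:
        from_text = json.load(f)
    with open(json_path, "rb") as f:
        from_binary = json.load(f)
```

``ensure_ascii=False`` is used here only so that the file actually contains non-ASCII UTF-8 bytes; with the default setting ``json.dump`` escapes such characters and the choice of encoding would be harder to observe.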
