1+ .. _pandas_docstring:
2+
13====================================
24How to write a good pandas docstring
35====================================
@@ -18,7 +20,7 @@ Next example gives an idea on how a docstring looks like:
1820.. code-block:: python
1921
2022 def add(num1, num2):
21- """Add up to integer numbers.
23+ """Add up two integer numbers.
2224
2325 This function simply wraps the `+` operator, and does not
2426 do anything interesting, except for illustrating what is
@@ -47,15 +49,15 @@ Next example gives an idea on how a docstring looks like:
4749 """
4850 return num1 + num2
4951
50- To make it easier to understand docstrings, and to make it possible to export
51- them to html, some standards exist .
52+ Some standards exist for docstrings, which make them easier to read and allow
53+ them to be exported to other formats such as HTML or PDF.
5254
5355The first conventions every Python docstring should follow are defined in
5456`PEP-257 <https://www.python.org/dev/peps/pep-0257/>`_.
5557
56- As PEP-257 is quite open, some other standards exist. In the case of pandas,
57- the numpy docstring convention is followed. There are two main documents
58- that explain this convention:
58+ PEP-257 is quite open, and some other standards exist on top of it. In the
59+ case of pandas, the numpy docstring convention is followed. There are two main
60+ documents that explain this convention:
5961
6062- `Guide to NumPy/SciPy documentation <https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt>`_
6163- `numpydoc docstring guide <http://numpydoc.readthedocs.io/en/latest/format.html>`_
@@ -82,7 +84,7 @@ General rules
8284Docstrings must be defined with three double-quotes. No blank lines should be
8385left before or after the docstring. The text starts immediately after the
8486opening quotes (not in the next line). The closing quotes have their own line
85- (and are not added at the end of the last sentence).
87+ (meaning that they are not at the end of the last sentence).
8688
8789**Good:**
8890
@@ -196,7 +198,8 @@ every paragraph in the extended summary is finished by a dot.
196198 is repeated for the main index, and data is easier to visualize as a
197199 pivot table.
198200
199- The index level will be automatically when added as columns.
201+ The index level will be automatically removed from the index when added
202+ as columns.
200203 """
201204 pass
202205
@@ -213,7 +216,7 @@ After the title, each parameter in the signature must be documented, including
213216`*args` and `**kwargs`, but not `self`.
214217
215218The parameters are defined by their name, followed by a space, a colon, another
216- space, and the type (or type ). Note that the space between the name and the
219+ space, and the type (or types). Note that the space between the name and the
217220colon is important. Types are not defined for `*args` and `**kwargs`, but must
218221be defined for all other parameters. After the parameter definition, it is
219222required to have a line with the parameter description, which is indented, and
@@ -259,7 +262,7 @@ finish with a dot.
259262 Also, note that the parameter descriptions do not start with a
260263 capital letter, and do not finish with a dot.
261264
262- Finally, the `**kwargs` is missing.
265+ Finally, the `**kwargs` parameter is missing.
263266
264267 Parameters
265268 ----------
@@ -281,10 +284,10 @@ directly:
281284
282285For complex types, define the subtypes:
283286
284- - list of int
285- - dict of str : int
287+ - list of [int]
288+ - dict of {str : int}
286289- tuple of (str, int, int)
287- - set of str
290+ - set of {str}
288291
289292If only a fixed set of values is allowed, list them in curly brackets
290293and separated by commas (followed by a space):
@@ -298,7 +301,7 @@ If the type is defined in a Python module, the module must be specified:
298301- datetime.datetime
299302- decimal.Decimal
300303
301- If the type is in a package, the module must be equally specified:
304+ If the type is in a package, the module must also be specified:
302305
303306- numpy.ndarray
304307- scipy.sparse.coo_matrix
@@ -315,6 +318,9 @@ last two types, that need to be separated by the word 'or':
315318- float, decimal.Decimal or None
316319- str or list of str
317320
321+ If None is one of the accepted values, it always needs to be the last in
322+ the list.
323+
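The type rules above can be illustrated with a small sketch. The function `to_decimal` and its parameters are hypothetical, invented only for this illustration; they are not part of pandas. Note the module-qualified type (`decimal.Decimal`), the word "or" before the last type, and `None` listed last.

```python
import decimal


def to_decimal(value, precision=None):
    """Convert a number to a decimal, optionally rounding it.

    Hypothetical helper, used only to illustrate how parameter types
    are written in the Parameters section.

    Parameters
    ----------
    value : int, float or decimal.Decimal
        Number to convert.
    precision : int or None
        Number of decimal places to keep. If None, the value is not
        rounded.

    Returns
    -------
    decimal.Decimal
        The converted value.
    """
    # Go through str() so floats keep their short printed form.
    result = decimal.Decimal(str(value))
    if precision is not None:
        # round() on a Decimal with an explicit precision returns a Decimal.
        result = round(result, precision)
    return result
```
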
318324Section 4: Returns or Yields
319325~~~~~~~~~~~~~~~~~~~~~~~~~~~~
320326
@@ -396,15 +402,15 @@ Section 5: See also
396402This is an optional section, used to let users know about pandas functionality
397403related to the one being documented.
398404
399- An obvious example would be the `head() ` and `tail() ` method . As `tail() ` does
405+ An obvious example would be the `head()` and `tail()` methods. As `tail()` does
400406the equivalent of `head()` but at the end of the `Series` or `DataFrame`
401407instead of at the beginning, it is good to let the users know about it.
402408
403409To give an intuition on what can be considered related, here are some
404410examples:
405411
406412* `loc` and `iloc`, as they do the same, but in one case providing indices and
407- int the other positions
413+ in the other positions
408414* `max` and `min`, as they do the opposite
409415* `iterrows`, `itertuples` and `iteritems`, as it is easy that a user looking
410416 for the method to iterate over columns ends up in the method to iterate
@@ -416,13 +422,12 @@ examples:
416422 of `astype` to know how to cast as a date, and the way to do it is with
417423 `pandas.to_datetime`
418424
419- But when deciding what is related, you should mainly use your common sense and
425+ When deciding what is related, you should mainly use your common sense and
420426think about what can be useful for the users reading the documentation,
421427especially the less experienced ones.
422428
423429This section, as the previous, also has a header, "See Also" (note the capital
424- S and A) in this case. Also followed by the line with hyphens, and preceded by
425- a blank line.
430+ S and A). It is also followed by a line of hyphens and preceded by a blank line.
426431
427432After the header, we will add a line for each related method or function,
428433followed by a space, a colon, another space, and a short description that
@@ -459,7 +464,7 @@ Section 6: Notes
459464~~~~~~~~~~~~~~~~
460465
461466This is an optional section used for notes about the implementation of the
462- algorithm.
467+ algorithm, or to document technical aspects of the function's behavior.
463468
464469Feel free to skip it, unless you are familiar with the implementation of the
465470algorithm, or you discover some counter-intuitive behavior while writing the
@@ -485,9 +490,15 @@ output (no blank lines in between). Comments describing the examples can
485490be added with blank lines before and after them.
486491
487492The way to present examples is as follows:
488- 1. Create the data required to demostrate the usage
489- 2. Show a very basic example that gives an idea of the most common use case
490- 3. Add examples that illustrate how the parameters can be used
493+
494+ 1. Import required libraries
495+
496+ 2. Create the data required for the example
497+
498+ 3. Show a very basic example that gives an idea of the most common use case
499+
500+ 4. Add commented examples that illustrate how the parameters can be used for
501+ extended functionality
491502
492503A simple example could be:
493504
@@ -582,7 +593,10 @@ it easier to understand the concept. Unless required by the example, use
582593names of animals, and numerical properties of them, to keep examples
583594consistent.
584595
585- **Wrong: **
596+ When calling the method, keyword arguments (`head(n=3)`) are preferred to
597+ positional arguments (`head(3)`).
598+
599+ **Good:**
586600
587601.. code-block:: python
588602
@@ -593,12 +607,13 @@ them.
593607 --------
594608 >>> import numpy
595609 >>> import pandas
596- >>> df = pandas.DataFrame(numpy.random.randn(3, 3),
597- ... columns=('a', 'b', 'c'))
610+ >>> df = pandas.DataFrame([389., 24., 80.5, numpy.nan],
611+ ... columns=['max_speed'],
612+ ... index=['falcon', 'parrot', 'lion', 'monkey'])
598613 """
599614 pass
600615
601- **Good : **
616+ **Bad:**
602617
603618.. code-block:: python
604619
@@ -609,9 +624,8 @@ them.
609624 --------
610625 >>> import numpy
611626 >>> import pandas
612- >>> df = pandas.DataFrame([389., 24., 80.5, numpy.nan]
613- ... columns=('max_speed'),
614- ... index=['falcon', 'parrot', 'lion', 'monkey'])
627+ >>> df = pandas.DataFrame(numpy.random.randn(3, 3),
628+ ... columns=('a', 'b', 'c'))
615629 """
616630 pass
617631
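The example conventions described above (create the data, show the basic case, then add a commented example for a parameter, using keyword arguments and animal names) can be sketched without pandas. The function `fastest` and the helper `run_examples` are hypothetical names introduced for this illustration. As an assumption beyond what the guide states here, the sketch uses the standard library `doctest` module to check that the examples in the docstring produce the output shown.

```python
import doctest


def fastest(speeds, n=1):
    """Return the names of the fastest animals.

    Hypothetical function, used only to illustrate the Examples section.

    Parameters
    ----------
    speeds : dict of {str : float}
        Maximum speed of each animal, keyed by its name.
    n : int
        Number of animals to return.

    Returns
    -------
    list of [str]
        Names of the `n` fastest animals, fastest first.

    Examples
    --------
    >>> speeds = {'falcon': 389.0, 'parrot': 24.0, 'lion': 80.5}

    >>> fastest(speeds)
    ['falcon']

    Use `n` to return more than one animal:

    >>> fastest(speeds, n=2)
    ['falcon', 'lion']
    """
    # Sort the names by their speed, highest first.
    ranked = sorted(speeds, key=speeds.get, reverse=True)
    return ranked[:n]


def run_examples(func):
    """Run the doctest examples found in the docstring of `func`."""
    parser = doctest.DocTestParser()
    # The globals only need to expose the function under its own name.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # run() returns a result carrying `failed` and `attempted` counts.
    return runner.run(test)
```

Calling `run_examples(fastest)` returns a result whose `failed` attribute is the number of examples whose output did not match the docstring.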