Commit 32ae620

Full draft
1 parent dc844dd commit 32ae620

File tree

2 files changed

+291
-33
lines changed

Doc/extending/first-extension-module.rst

Lines changed: 288 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ Your first C API extension module
1010
This tutorial will take you through creating a simple
1111
Python extension module written in C or C++.
1212

13-
The tutorial assumes basic knowledge about Python: you should be able to
13+
It assumes basic knowledge about Python: you should be able to
1414
define functions in Python code before starting to write them in C.
1515
See :ref:`tutorial-index` for an introduction to Python itself.
1616

@@ -19,20 +19,18 @@ While we will mention several concepts that a C beginner would not be expected
1919
to know, like ``static`` functions or linkage declarations, understanding these
2020
is not necessary for success.
2121

22-
As a word warning before we begin: as the code is written, you will need to
23-
compile it with the right tools and settings for your system.
24-
It is generally best to use a third-party tool to handle the details.
25-
This is covered in later chapters, not in this tutorial.
22+
We will focus on giving you a "feel" of what Python's C API is like.
23+
This tutorial will not teach you important concepts, like error handling
24+
and reference counting, which are covered in later chapters.
2625

27-
The tutorial assumes that you use a Unix-like system (including macOS and
28-
Linux) or Windows.
29-
If you don't have a preference, currently you're most likely to get the best
30-
experience on Linux.
26+
As you write the code, you will need to compile it.
27+
Prepare to spend some time choosing, installing and configuring a build tool,
28+
since CPython itself does not include one.
3129

32-
If you learn better by having the full context at the beginning,
33-
skip to the end to see :ref:`the resulting source <first-extension-result>`,
34-
then return here.
35-
The tutorial will explain every line, although in a different order.
30+
We will assume that you use a Unix-like system (including macOS and
31+
Linux), or Windows.
32+
On other systems, you might need to adjust some details -- for example,
33+
a system command name.
3634

3735
.. note::
3836

@@ -48,20 +46,21 @@ The tutorial will explain every line, although in a different order.
4846
What we'll do
4947
=============
5048

51-
Let's create an extension module called ``spam`` (the favorite food of Monty
52-
Python fans...) and let's say we want to create a Python interface to the C
49+
Let's create an extension module called ``spam`` [#why-spam]_,
50+
which will include a Python interface to the C
5351
standard library function :c:func:`system`.
54-
This function takes a C string as argument, runs the argument as a system
52+
This function is defined in ``stdlib.h``.
53+
It takes a C string as argument, runs the argument as a system
5554
command, and returns a result value as an integer.
56-
In documentation, it might be summarized as::
55+
A manual page for ``system`` might summarize it this way::
5756

5857
#include <stdlib.h>
59-
6058
int system(const char *command);
6159

6260
Note that like many functions in the C standard library,
63-
this function is already exposed in Python, as :py:func:`os.system`.
64-
We'll make our own "wrapper".
61+
this function is already exposed in Python.
62+
In production, use :py:func:`os.system` or :py:func:`subprocess.run`
63+
rather than the module you'll write here.
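
Before wrapping ``system`` for Python, it can help to see it called from plain C. The following is a minimal sketch, not part of the tutorial's module: ``run_command`` is a hypothetical helper name, and the commands in the usage note assume a shell that understands ``exit``.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of the tutorial's module):
 * hand a command string to the platform's command processor
 * and return whatever system() reports. */
int
run_command(const char *command)
{
    return system(command);
}
```

Calling ``run_command("exit 0")`` should give ``0``, while a command that fails gives a nonzero value whose exact meaning depends on the platform.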
6564

6665
We want this function to be callable from Python as follows:
6766

@@ -80,8 +79,8 @@ We want this function to be callable from Python as follows:
8079
both Unix and Windows.
8180

8281

83-
Warming up your build tool: an empty module
84-
===========================================
82+
Warming up your build tool
83+
==========================
8584

8685
Begin by creating a file named :file:`spammodule.c`.
8786
The name is entirely up to you, though some tools can be picky about
@@ -90,7 +89,7 @@ the ``.c`` extension.
9089
named ``spam``, for example.
9190
If you do this, you'll need to adjust some instructions below.)
9291

93-
Now, while the file is emptly, we'll compile it, so that you can make
92+
Now, while the file is empty, we'll compile it, so that you can make
9493
and test incremental changes as you follow the rest of the tutorial.
9594

9695
Choose a build tool such as Setuptools or Meson, and follow its instructions
@@ -305,17 +304,274 @@ Try ``help(spam)`` to see the docstring.
305304
The next step will be adding a function to it.
306305

307306

308-
Adding a function
309-
=================
307+
Exposing a function
308+
===================
309+
310+
To add a function to the module, you will need a C function to call.
311+
Add a minimal one now::
312+
313+
static PyObject *
314+
spam_system(PyObject *self, PyObject *arg)
315+
{
316+
Py_RETURN_NONE;
317+
}
318+
319+
This function is defined as ``static``.
320+
This will be a common theme: usually, the only non-``static`` function
321+
in a Python extension module is the export hook.
322+
Data declarations are typically ``static`` as well.
323+
324+
As a Python function, our ``spam_system`` returns a Python object.
325+
In the C API, this is represented as a pointer to the ``PyObject`` structure.
326+
327+
This function also takes two pointers to Python objects as arguments.
328+
The "real" argument, as passed in from Python code, will be *arg*.
329+
Meanwhile, *self* will be set to the module object -- in the C API,
330+
module-level functions behave a bit like "methods" of the module object.
331+
332+
The macro :c:macro:`Py_RETURN_NONE`, which we use as a body for now,
333+
properly ``return``\s a Python :py:data:`None` object.
334+
335+
Recompile your extension to make sure you don't have syntax errors.
336+
You might get a warning that ``spam_system`` is unused.
337+
This is true: we haven't yet added the function to the module.
338+
339+
340+
Method definitions
341+
------------------
342+
343+
To expose the C function to Python, you will need to provide several pieces of
344+
information.
345+
These are collected in a structure called :c:type:`PyMethodDef`.
346+
They are:
347+
348+
* ``ml_name``: the name of the function to use in Python code;
349+
* ``ml_meth``: the *implementation* -- the C function that does what
350+
you need;
351+
* ``ml_flags``: a set of flags describing details like how Python arguments are
352+
passed to the C function; and
353+
* ``ml_doc``: a docstring.
310354

311-
To expose a C function to Python, you need three things:
355+
The name and docstring are encoded as for the module.
356+
For ``ml_meth``, we'll use the ``spam_system`` function you just defined.
312357

313-
* The *implementation*: a C function that does what you need, and
314-
* a *name* to use in Python code.
358+
For ``ml_flags``, we'll use :c:data:`METH_O` -- a flag that specifies that a
359+
C function with two ``PyObject*`` arguments (your ``spam_system``) will be
360+
exposed as a Python function taking a single unnamed argument.
361+
(Other flags exist to handle named arguments, but they are harder to use.)
362+
363+
Because modules typically create several functions, these definitions
364+
need to be collected in an array:
365+
366+
.. literalinclude:: ../includes/capi-extension/spammodule-01.c
367+
:start-after: /// Module method table
368+
:end-before: ///
369+
370+
As with module slots, a zero-filled sentinel marks the end of the array.
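
The sentinel pattern itself is plain C and can be tried without the Python headers. Below is a minimal sketch under that assumption: the ``entry`` type, the ``table`` array and ``count_entries`` are hypothetical names invented for illustration, and they only mimic the shape of a method table.

```c
#include <stddef.h>

/* Hypothetical entry type, loosely shaped like a method definition. */
typedef struct {
    const char *name;       /* NULL in the sentinel entry */
    int (*func)(int);       /* the C function to call */
} entry;

int add_one(int x) { return x + 1; }
int twice(int x) { return x * 2; }

entry table[] = {
    {"add_one", add_one},
    {"twice", twice},
    {NULL, NULL}            /* zero-filled sentinel marks the end */
};

/* Walk the array until the sentinel; no separate length is needed. */
size_t
count_entries(const entry *entries)
{
    size_t n = 0;
    while (entries[n].name != NULL) {
        n++;
    }
    return n;
}
```

Because the end is marked inside the array, code that receives only a pointer to the first entry can still find every entry, which is why no length is passed alongside such tables.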
371+
372+
Add the array to your module, between the ``spam_system`` function and
373+
the module slots.
374+
375+
.. note::
376+
377+
Why :c:type:`!PyMethodDef` and not :c:type:`!PyFunctionDef`?
378+
379+
In the C API, module-level functions behave a bit like "methods"
380+
of a module object: they take an extra *self* argument, and
381+
you need to provide the same information to create them.
382+
383+
To add the method(s) defined in a :c:type:`PyMethodDef` array,
384+
add a :c:data:`Py_mod_methods` slot:
385+
386+
.. literalinclude:: ../includes/capi-extension/spammodule-01.c
387+
:start-after: /// Module slot table
388+
:end-before: ///
389+
:emphasize-lines: 6
390+
391+
Recompile your extension again, and test it.
392+
You should now be able to call the function, and get ``None`` from it:
393+
394+
.. code-block:: pycon
395+
396+
>>> import spam
397+
>>> print(spam.system)
398+
<built-in function system>
399+
>>> print(spam.system('whoami'))
400+
None
401+
402+
403+
Returning an integer
404+
====================
405+
406+
Your ``spam_system`` function currently returns ``None``.
407+
We want it to return a number -- that is, a Python :py:type:`int` object.
408+
Ultimately, this will be the exit code of a system command,
409+
but let's start with a fixed value: ``3``.
410+
411+
The Python C API provides a function to create a Python :py:type:`int` object
412+
from a C ``int`` value: :c:func:`PyLong_FromLong`.
413+
(The name might not seem obvious to you; it'll get better as you get used to
414+
C API naming conventions.)
415+
416+
Let's call it:
417+
418+
.. this could be a one-liner, we use 3 lines to show the data types.
419+
420+
.. code-block:: c
421+
:emphasize-lines: 4-6
422+
423+
static PyObject *
424+
spam_system(PyObject *self, PyObject *arg)
425+
{
426+
int status = 3;
427+
PyObject *result = PyLong_FromLong(status);
428+
return result;
429+
}
430+
431+
432+
Recompile and run again, and check that you get a 3:
433+
434+
.. code-block:: pycon
435+
436+
>>> import spam
437+
>>> spam.system('whoami')
438+
3
439+
440+
441+
Accepting a string
442+
==================
315443

316-
You can also add a *dosctring*.
317-
OK, *amongst* the things you need to expose a C function are
444+
Your ``spam_system`` function should accept one argument,
445+
which is passed in as ``PyObject *arg``.
446+
In order to use the information in it, we will need
447+
to convert it to a C value --- in this case, a C string (``const char *``).
318448

449+
The argument should be a Python string.
450+
There's a slight type mismatch here: Python's :py:class:`str` objects store
451+
Unicode text, but C strings are arrays of bytes.
452+
So, we'll need to *encode* the data.
453+
In our example, we'll use the UTF-8 encoding.
454+
It might not always be correct for system commands, but it's what
455+
:py:meth:`str.encode` uses by default,
456+
and the C API has special support for it.
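
To make the text-versus-bytes mismatch concrete, here is a minimal plain C sketch. It assumes the input is valid UTF-8, and ``utf8_code_points`` is a hypothetical helper, not a C API function. It counts characters by skipping UTF-8 continuation bytes.

```c
#include <stddef.h>

/* Hypothetical helper: count Unicode code points in a UTF-8 encoded
 * C string. Assumes the input is valid UTF-8. */
size_t
utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        /* Continuation bytes look like 10xxxxxx; every other byte
         * starts a new code point. */
        if (((unsigned char)*s & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For the string ``"h\xc3\xa9llo"`` (the UTF-8 encoding of "héllo"), ``strlen`` reports 6 bytes while ``utf8_code_points`` reports 5 characters.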
457+
458+
The function to encode into a UTF-8 buffer is named :c:func:`PyUnicode_AsUTF8`.
459+
Call it like this:
460+
461+
.. code-block:: c
462+
:emphasize-lines: 4
463+
464+
static PyObject *
465+
spam_system(PyObject *self, PyObject *arg)
466+
{
467+
const char *command = PyUnicode_AsUTF8(arg);
468+
int status = 3;
469+
PyObject *result = PyLong_FromLong(status);
470+
return result;
471+
}
472+
473+
If :c:func:`PyUnicode_AsUTF8` is successful, *command* will point to the
474+
resulting array of bytes.
475+
This buffer is managed by the *arg* object,
476+
which means there are limits on how you can use it:
477+
478+
* You should only use the buffer inside the ``spam_system`` function.
479+
When ``spam_system`` returns, *arg* and the array it manages might be
480+
garbage-collected.
481+
* You must not modify it. This is why we use ``const``.
482+
483+
Both are fine for our use.
484+
485+
If :c:func:`PyUnicode_AsUTF8` was *not* successful, it returns a ``NULL``
486+
pointer.
487+
When using the Python C API, we always need to handle such error cases.
488+
Here, the correct thing to do is to return ``NULL`` from ``spam_system`` as well:
489+
490+
491+
.. code-block:: c
492+
:emphasize-lines: 5-7
493+
494+
static PyObject *
495+
spam_system(PyObject *self, PyObject *arg)
496+
{
497+
const char *command = PyUnicode_AsUTF8(arg);
498+
if (command == NULL) {
499+
return NULL;
500+
}
501+
int status = 3;
502+
PyObject *result = PyLong_FromLong(status);
503+
return result;
504+
}
505+
506+
That's it for the setup.
507+
Now, all that is left is calling the C library function ``system`` with
508+
the ``char *`` buffer, and using its result instead of the ``3``:
509+
510+
.. code-block:: c
511+
:emphasize-lines: 8
512+
513+
static PyObject *
514+
spam_system(PyObject *self, PyObject *arg)
515+
{
516+
const char *command = PyUnicode_AsUTF8(arg);
517+
if (command == NULL) {
518+
return NULL;
519+
}
520+
int status = system(command);
521+
PyObject *result = PyLong_FromLong(status);
522+
return result;
523+
}
524+
525+
Compile it, and test:
526+
527+
.. code-block:: pycon
528+
529+
>>> import spam
530+
>>> result = spam.system('whoami')
531+
User Name
532+
>>> result
533+
0
534+
535+
You might also want to test error cases:
536+
537+
.. code-block:: pycon
538+
539+
>>> import spam
540+
>>> result = spam.system('nonexistent-command')
541+
sh: line 1: nonexistent-command: command not found
542+
>>> result
543+
32512
544+
545+
>>> spam.system(3)
546+
Traceback (most recent call last):
547+
File "<python-input-1>", line 1, in <module>
548+
spam.system(3)
549+
~~~~~~~~~~~^^^
550+
TypeError: bad argument type for built-in operation
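
The ``32512`` above is the raw value returned by ``system``, not the command's exit code. On POSIX systems that value is a wait status, which can be decoded with the macros from ``<sys/wait.h>``. The sketch below assumes such a system (it will not compile on Windows), and ``exit_code_from_status`` is a hypothetical helper name.

```c
#include <sys/wait.h>

/* Hypothetical helper, assuming a POSIX system: extract the exit code
 * from a status returned by system().
 * Returns -1 if the command did not exit normally. */
int
exit_code_from_status(int status)
{
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    return -1;
}
```

On Linux, ``exit_code_from_status(32512)`` gives ``127``, the code a POSIX shell conventionally uses for "command not found".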
551+
552+
553+
The result
554+
==========
555+
556+
557+
Congratulations!
558+
You have written a complete Python C API extension module,
559+
and completed this tutorial!
560+
561+
Here is the entire source file, for your convenience:
562+
563+
.. _extending-spammodule-source:
564+
565+
.. literalinclude:: ../includes/capi-extension/spammodule-01.c
566+
567+
568+
.. rubric:: Footnotes
569+
570+
.. [#why-spam] ``spam`` is the favorite food of Monty Python fans...
571+
572+
573+
Accepting a string
574+
==================
319575

320576

321577

@@ -507,8 +763,8 @@ Our function must then convert the C integer into a Python integer.
507763
This is done using the function :c:func:`PyLong_FromLong`:
508764

509765
.. literalinclude:: ../includes/capi-extension/spammodule-01.c
510-
:start-at: return PyLong_FromLong(status);
511-
:end-at: return PyLong_FromLong(status);
766+
:start-at: PyLong_FromLong
767+
:end-at: return
512768

513769
In this case, it will return a Python :py:class:`int` object.
514770
(Yes, even integers are objects on the heap in Python!)

Doc/includes/capi-extension/spammodule-01.c

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,8 @@ spam_system(PyObject *self, PyObject *arg)
1414
return NULL;
1515
}
1616
int status = system(command);
17-
return PyLong_FromLong(status);
17+
PyObject *result = PyLong_FromLong(status);
18+
return result;
1819
}
1920

2021

@@ -38,6 +39,7 @@ PyABIInfo_VAR(abi_info);
3839
/// Module slot table
3940

4041
static PyModuleDef_Slot spam_slots[] = {
42+
{Py_mod_abi, &abi_info},
4143
{Py_mod_name, "spam"},
4244
{Py_mod_doc, PyDoc_STR("A wonderful module with an example function")},
4345
{Py_mod_methods, spam_methods},
