@@ -171,7 +171,10 @@ Now, build install the *project in the current directory* (``.``) via ``pip``:
171171
172172.. code-block:: sh
173173
174- python -m pip install .
174+ python -m pip -v install .
175+
176+ The ``-v`` (``--verbose``) option causes ``pip`` to show the output from
177+ the compiler, which is often useful during development.
175178
176179.. tip::
177180
@@ -460,7 +463,7 @@ So, we'll need to *encode* the data, and we'll use the UTF-8 encoding for it.
460463and the C API has special support for it.)
461464
462465The function to encode a Python string into a UTF-8 buffer is named
463- :c:func:`PyUnicode_AsUTF8` [#why-pyunicodeasutf8]_.
466+ :c:func:`PyUnicode_AsUTF8AndSize` [#why-pyunicodeasutf8]_.
464467Call it like this:
465468
466469.. code-block:: c
@@ -469,31 +472,31 @@ Call it like this:
469472 static PyObject *
470473 spam_system(PyObject *self, PyObject *arg)
471474 {
472- const char *command = PyUnicode_AsUTF8(arg);
475+ const char *command = PyUnicode_AsUTF8AndSize(arg, NULL);
473476 int status = 3;
474477 PyObject *result = PyLong_FromLong(status);
475478 return result;
476479 }
477480
478- If :c:func:`PyUnicode_AsUTF8` is successful, *command* will point to the
479- resulting array of bytes.
481+ If :c:func:`PyUnicode_AsUTF8AndSize` is successful, *command* will point to the
482+ resulting C string -- a zero-terminated array of bytes [#embedded-nul]_.
480483This buffer is managed by the *arg* object, which means we don't need to free
481484it, but we must follow some rules:
482485
483486* We should only use the buffer inside the ``spam_system`` function.
484- When ``spam_system`` returns, *arg* and the buffer it manages might be
487+ After ``spam_system`` returns, *arg* and the buffer it manages might be
485488 garbage-collected.
486489* We must not modify it. This is why we use ``const``.
487490
488- If :c:func:`PyUnicode_AsUTF8` was *not* successful, it returns a ``NULL``
491+ If :c:func:`PyUnicode_AsUTF8AndSize` was *not* successful, it returns a ``NULL``
489492pointer.
490493When calling *any* Python C API, we always need to handle such error cases.
491494The way to do this in general is left for later chapters of this documentation.
492495For now, be assured that we are already handling errors from
493496:c:func:`PyLong_FromLong` correctly.
494497
495- For the :c:func:`PyUnicode_AsUTF8` call, the correct way to handle errors is
496- returning ``NULL`` from ``spam_system``.
498+ For the :c:func:`PyUnicode_AsUTF8AndSize` call, the correct way to handle
499+ errors is returning ``NULL`` from ``spam_system``.
497500Add an ``if`` block for this:
498501
499502
@@ -503,7 +506,7 @@ Add an ``if`` block for this:
503506 static PyObject *
504507 spam_system(PyObject *self, PyObject *arg)
505508 {
506- const char *command = PyUnicode_AsUTF8(arg);
509+ const char *command = PyUnicode_AsUTF8AndSize(arg, NULL);
507510 if (command == NULL) {
508511 return NULL;
509512 }
@@ -512,7 +515,18 @@ Add an ``if`` block for this:
512515 return result;
513516 }
514517
515- That's it for the setup.
518+ To test that error handling works, compile again, restart Python so that
519+ ``import spam`` picks up the new version of your module, and try passing
520+ a non-string value to your function:
521+
522+ .. code-block:: pycon
523+
524+ >>> import spam
525+ >>> spam.system(3)
526+ Traceback (most recent call last):
527+ ...
528+ TypeError: bad argument type for built-in operation
529+
516530 Now, all that is left is calling the C library function :c:func:`system` with
517531the ``char *`` buffer, and using its result instead of the ``3``:
518532
@@ -522,7 +536,7 @@ the ``char *`` buffer, and using its result instead of the ``3``:
522536 static PyObject *
523537 spam_system(PyObject *self, PyObject *arg)
524538 {
525- const char *command = PyUnicode_AsUTF8(arg);
539+ const char *command = PyUnicode_AsUTF8AndSize(arg, NULL);
526540 if (command == NULL) {
527541 return NULL;
528542 }
@@ -543,7 +557,8 @@ system command:
543557 >>> result
544558 0
545559
546- You might also want to test error cases:
560+ You can also test with other commands, like ``ls``, ``dir``, or one
561+ that doesn't exist:
547562
548563.. code-block:: pycon
549564
@@ -553,11 +568,6 @@ You might also want to test error cases:
553568 >>> result
554569 32512
555570
556- >>> spam.system(3)
557- Traceback (most recent call last):
558- ...
559- TypeError: bad argument type for built-in operation
560-
561571
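The ``32512`` above is not the command's exit code itself. On POSIX systems, ``system`` returns a *wait status* that packs the exit code together with other information. The following is a minimal sketch of decoding it, assuming a POSIX platform that provides ``<sys/wait.h>``; the helper name ``exit_code_from_status`` is made up for this illustration and is not part of the tutorial's module.

```c
#include <sys/wait.h>

/* Hypothetical helper: extract the exit code from a status returned
 * by system() on a POSIX platform. Returns -1 if the command did not
 * exit normally (for example, if it was killed by a signal). */
static int
exit_code_from_status(int status)
{
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    return -1;
}
```

On common platforms such as Linux, a status of ``32512`` decodes to exit code ``127``, which shells conventionally use for "command not found". The exact encoding of the wait status is platform-specific, which is why the macros are used instead of shifting bits by hand.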
562572 The result
563573==========
@@ -665,3 +675,13 @@ on :py:attr:`sys.path`.
665675 type.
666676 .. [#why-pyunicodeasutf8] Here, ``PyUnicode`` refers to the original name of
667677 the Python :py:class:`str` class: ``unicode``.
678+
679+ The ``AndSize`` part of the name refers to the fact that this function can
680+ also retrieve the size of the buffer, using an output argument.
681+ We don't need this, so we set the second argument to ``NULL``.
682+ .. [#embedded-nul] We're ignoring the fact that Python strings can also
683+ contain NUL bytes, which terminate a C string.
684+ In other words, our function will treat ``spam.system("foo\0bar")`` as
685+ ``spam.system("foo")``.
686+ This possibility can lead to security issues, so the real ``os.system``
687+ function checks for this case and raises an error.
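One way to detect an embedded NUL is to compare the size reported through the output argument with the length that ``strlen`` sees. The sketch below shows only that comparison in plain C; ``has_embedded_nul`` is a hypothetical helper, not part of the Python C API, and this is not necessarily how ``os.system`` implements its check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Python C API.
 * Returns 1 if the first `size` bytes of `buf` contain a NUL byte,
 * 0 otherwise. `buf` must be zero-terminated at index `size`, which
 * is the kind of buffer PyUnicode_AsUTF8AndSize returns. */
static int
has_embedded_nul(const char *buf, size_t size)
{
    assert(buf != NULL);
    /* strlen stops at the first NUL byte, so a length shorter than
     * `size` means there is a NUL before the real end of the data. */
    return strlen(buf) != size;
}
```

To use something like this in ``spam_system``, you would pass the address of a ``Py_ssize_t`` variable as the second argument instead of ``NULL``, and, if the helper reports an embedded NUL, set an exception (for example with ``PyErr_SetString(PyExc_ValueError, ...)``) and return ``NULL``.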