|
1003 | 1003 | Here we want to discuss a few of them. |
1004 | 1004 |
|
1005 | 1005 | The first two tools we will look at are the built-in function \pythonilIdx{filter} and the \pythonilIdx{takewhile}\pythonIdx{itertools!takewhile} function from the \pythonilIdx{itertools} module. |
1006 | | -In the previous section, we implemented a generator function returning the endless sequence of prime numbers in \cref{lst:loops:for_loop_sequence_primes}. |
| 1006 | +In the previous section, we implemented a generator function returning the endless sequence of prime numbers in file \programUrl{iteration:prime_generator} in \cref{lst:iteration:prime_generator}. |
1007 | 1007 | What would we do if we wanted a convenient way to create a list of all prime numbers which are less than 50 by using this never-ending generator? |
1008 | | -The answer can be found in \cref{lst:iteration:filter_takewhile}. |
| 1008 | +The answer can be found in our program \programUrl{iteration:filter_takewhile} given in \cref{lst:iteration:filter_takewhile}. |
1009 | 1009 |
|
1010 | | -\pythonilIdx{takewhile} is a function with two parameters: |
1011 | | -The second parameter is an \pythonilIdx{Iterator}. |
1012 | | -Let's say that it provides a sequence of elements of some type~\pythonil{T}. |
| 1010 | +\pythonilIdx{takewhile} is a function with two parameters~\cite{PSF:P3D:TPSL:IFCIFEL}. |
| 1011 | +The second parameter is an \pythonilIdx{Iterable}. |
| 1012 | +Let's say that this \pythonilIdx{Iterable} provides a sequence of elements of some type~\pythonil{T}. |
1013 | 1013 | The first parameter then is a predicate, which is a function accepting one element of type~\pythonil{T} and returning a \pythonil{bool} value. |
| 1014 | +Then, \pythonilIdx{takewhile} constructs a \emph{new} \pythonilIdx{Iterator}, which returns the elements from the original \pythonilIdx{Iterable} as long as the predicate function returns~\pythonil{True} for them. |
| 1015 | +As soon as it hits an element from the original \pythonil{Iterable} for which the predicate returns \pythonil{False}, it will stop the iteration. |
| 1016 | + |
1014 | 1017 | Back in \cref{sec:functionsAsVarsAndLambdas}, we learned that we can also pass functions or \pythonilsIdx{lambda} as arguments to other functions. |
1015 | | -This is a practical example of that. |
1016 | | -Basically, \pythonilIdx{takewhile} constructs a new \pythonilIdx{Iterator} which returns the elements from the original \pythonilIdx{Iterator} as long as the predicate function returns~\pythonil{True} for them. |
1017 | | -As soon as the predicate returns \pythonil{False}, it will stop the iteration. |
1018 | | -Therefore, the answer to \inQuotes{How can I extract all the numbers less than~50 from the prime sequence?} is simply to call \pythonil{akewhile(lambda z: z < 50, primes())}. |
| 1018 | +This is a practical example where \pythonilsIdx{lambda} come in especially handy. |
| 1019 | +Therefore, the answer to \inQuotes{How can I extract all the numbers less than~50 from the prime sequence?} is simply to call \pythonil{takewhile(lambda z: z < 50, primes())}. |
1019 | 1020 | This sequence is now no longer infinitely long and can conveniently be converted to a \pythonil{list}. |
1020 | 1021 |
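As a minimal sketch of this idea, the following snippet limits an endless prime sequence with \pythonil{takewhile}. The generator \pythonil{primes} here is only a hypothetical stand-in based on simple trial division, since the original implementation from the referenced listing is not reproduced in this section.

```python
from collections.abc import Iterator
from itertools import count, takewhile
from math import isqrt


def primes() -> Iterator[int]:
    """Hypothetical stand-in: yield the endless sequence of primes."""
    for candidate in count(2):
        # Trial division by all integers from 2 up to the square root.
        if all(candidate % d != 0 for d in range(2, isqrt(candidate) + 1)):
            yield candidate


# takewhile stops at the first prime for which the predicate is False,
# so the resulting sequence is finite and can be stored in a list.
small_primes = list(takewhile(lambda z: z < 50, primes()))
print(small_primes)
```

Without \pythonil{takewhile}, the call \pythonil{list(primes())} would never finish, because the generator never ends on its own.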
|
1021 | | -The built-in function~\pythonilIdx{filter} works quite similarly. |
1022 | | -It, too, accepts a predicate and an~\pythonilIdx{Iterator} as input. |
| 1022 | +The built-in function~\pythonilIdx{filter} works quite similarly~\cite{PSF:P3D:TPSL:BIF}. |
| 1023 | +It, too, accepts a predicate and an~\pythonilIdx{Iterable} as input. |
1023 | 1024 | Different from \pythonil{takewhile}, the new \pythonilIdx{Iterator} created by \pythonilIdx{filter} does not stop if the predicate returns~\pythonil{False}. |
1024 | 1025 | However, it returns only those elements for which the predicate returned~\pythonil{True}. |
1025 | 1026 | In \cref{lst:iteration:filter_takewhile}, we use this to select prime numbers~$x$ for which an integer~$y$ exists such that~$x=y^2+1$. |
1026 | | -This time, we implement the predicate as function \pythonil{is_sqr_plus_1} and pass this function to \pythonilIdx{filter}. |
1027 | | -Since there again probably infinitely many such prime numbers, we return only those that are less than~1000, for which we use~\pythonil{takewhile}. |
| 1027 | +We again implement the predicate as a \pythonilIdx{lambda}. |
| 1028 | +Since the underlying \pythonilIdx{Iterator} given by \pythonil{primes()} is infinite, we again use \pythonilIdx{takewhile} to limit the sequence to those primes that are less than~1000. |
| 1029 | +We pass the result of our construct to the function \pythonilIdx{tuple}, which creates an immutable and indexable sequence. |
1028 | 1030 | The results can be seen in \cref{exec:iteration:filter_takewhile}: |
1029 | 1031 | There are ten such primes. |
1030 | 1032 | The smallest one is $1^2+1=2$ and the largest one is~$26^2+1=677$. |
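The combination of \pythonil{filter} and \pythonil{takewhile} can be sketched as follows. Both the generator \pythonil{primes} and the concrete predicate are assumptions made for illustration and need not match the program in the listing.

```python
from collections.abc import Iterator
from itertools import count, takewhile
from math import isqrt


def primes() -> Iterator[int]:
    """Hypothetical stand-in: yield the endless sequence of primes."""
    for candidate in count(2):
        if all(candidate % d != 0 for d in range(2, isqrt(candidate) + 1)):
            yield candidate


# takewhile cuts the endless sequence off at 1000. filter then keeps only
# those primes x for which x - 1 is a perfect square, i.e., x = y * y + 1.
sqr_plus_1_primes = tuple(filter(
    lambda x: isqrt(x - 1) ** 2 == x - 1,
    takewhile(lambda z: z < 1000, primes())))
print(sqr_plus_1_primes)
```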
|
1033 | 1035 | \listingPythonAndOutput{iteration:map}{% |
1034 | 1036 | An example for the function \pythonilIdx{map}.}{}% |
1035 | 1037 | % |
1036 | | -Another important utility function when dealing with sequences is the function~\pythonilIdx{map}. |
1037 | | -We explore its use in \cref{lst:iteration:map}. |
| 1038 | +Another important utility function when dealing with sequences is the built-in function~\pythonilIdx{map}~\cite{PSF:P3D:TPSL:BIF}. |
| 1039 | +We explore its use in \programUrl{iteration:map} given as \cref{lst:iteration:map}. |
1038 | 1040 | Back in \cref{lst:iteration:generator_expressions_to_collection}, we used a generator expression to process data that we extracted from a \pgls{CSV}-formatted string. |
1039 | 1041 | Instead of doing \pythonil{int(s) for s in csv_text.split(\",\")}, we can simply write \pythonil{map(int, csv_text.split(\",\"))}. |
1040 | | -The first argument to \pythonilIdx{map} is a function that should be applied all of the elements in the sequence passed in as its second argument. |
1041 | | -The result of \pythonilIdx{map} is a new sequence with the return values of this function. |
1042 | | -In \cref{lst:iteration:map}, we map the string \pythonil{csv_text} split at all~\pythonil{\",\"} to \pythonilsIdx{int} and then \pythonilIdx{filter} the sequence to retain only values greater than~20. |
| 1042 | +The first argument to \pythonilIdx{map} is a function that should be applied to all of the elements in the \pythonilIdx{Iterable} passed in as its second argument. |
| 1043 | +The result of \pythonilIdx{map} is a new \pythonilIdx{Iterator} with the return values of this function. |
| 1044 | + |
| 1045 | +In \programUrl{iteration:map} given in \cref{lst:iteration:map}, we first split \pythonil{csv_text} at all~\pythonil{\",\"}. |
| 1046 | +We then translate the elements of the resulting list to \pythonilsIdx{int} via \pythonilIdx{map}. |
| 1047 | +Finally, we \pythonilIdx{filter} the sequence to retain only values greater than~20. |
1043 | 1048 | We can conveniently iterate over the resulting filtered and mapped sequence using a~\pythonilIdx{for}~loop. |
1044 | 1049 |
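A minimal sketch of this chain of \pythonil{split}, \pythonil{map}, and \pythonil{filter} could look as follows. The contents of \pythonil{csv_text} are invented for this illustration and are not the data used in the listing.

```python
csv_text = "12,23,445,3,12,6,-3,5"  # hypothetical example data

# Split the text, convert each chunk to int, keep only values above 20.
# No intermediate list of integers is built: map and filter are lazy.
for value in filter(lambda v: v > 20, map(int, csv_text.split(","))):
    print(value)

# The same pipeline, collected into a list so that we can inspect it.
large_values = list(filter(lambda v: v > 20, map(int, csv_text.split(","))))
```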
|
1045 | | -How about we now obtain all the unique squares of the values in the \pgls{CSV} data, i.e., we discard all duplicate squares. |
| 1050 | +How about we now obtain all the squares of the values in the \pgls{CSV} data, but each value only once? |
| 1051 | +In other words, we discard all duplicates. |
1046 | 1052 | First, we again use \pythonilIdx{split}\pythonIdx{str!split} to divide the text into chunks based on the separator~\pythonil{\",\"}. |
1047 | | -Then we map these chunks to integers and return their squares using the \pythonilIdx{map}~function, but this provide a~\pythonilIdx{lambda} that does the transformation. |
1048 | | -Now we want to retain only the unique values. |
| 1053 | +Then we map these chunks to integers and return their squares using the \pythonilIdx{map}~function, but this time we provide a~\pythonilIdx{lambda} that does the transformation. |
| 1054 | + |
| 1055 | +Now we want to retain only the unique values, i.e., we want to get a duplicate-free collection of these numbers. |
1049 | 1056 | This can be done by passing the resulting \pythonilIdx{Iterator} into the \pythonilIdx{set} constructor. |
1050 | 1057 | A set, by definition, only contains unique values. |
1051 | 1058 | In the resulting output in \cref{exec:iteration:map}, we can see that \textil{9}~indeed only appears once and so does~\textil{144}. |
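The following sketch illustrates this approach with invented data, in which both \pythonil{3} and \pythonil{-3} as well as \pythonil{12} and \pythonil{-12} occur. The values are assumptions chosen so that the squares 9 and 144 would each appear twice without the \pythonil{set}.

```python
csv_text = "3,-3,12,-12,5"  # hypothetical example data

# The lambda converts each chunk to an int and squares it.
# The set constructor consumes the Iterator and drops duplicates.
unique_squares = set(map(lambda s: int(s) ** 2, csv_text.split(",")))

# Sets are unordered, so we sort the values for a stable printout.
print(sorted(unique_squares))
```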
|
1054 | 1061 | In the final example for \pythonilIdx{map}, we have a list~\pythonil{words} of words and want to know the length of the longest word. |
1055 | 1062 | We can first map each word to its length via~\pythonil{map(len, words)}. |
1056 | 1063 | This produces an \pythonilIdx{Iterator} of word lengths, which we can directly pass to~\pythonilIdx{max}. |
| 1064 | +\pythonil{max}~will then iterate over this sequence and return the largest value it encountered. |
1057 | 1065 |
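As a small illustration with a made-up word list (the actual \pythonil{words} of the listing are not shown here), the computation could be sketched like this:

```python
words = ["Hello", "world", "How", "are", "you"]  # hypothetical word list

# map(len, words) lazily yields 5, 5, 3, 3, 3; max consumes these values.
longest = max(map(len, words))
print(longest)
```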
|
1058 | 1066 | Notice that \pythonilIdx{map} does not generate a data structure with all the transformed elements in memory. |
1059 | | -Instead, the elements are constructed as needed (and thereafter disposed by the garbage collection when no longer needed). |
| 1067 | +Instead, the elements are constructed as needed~(and thereafter disposed of by the garbage collector when no longer needed). |
1060 | 1068 | This makes \pythonilIdx{map} an elegant and efficient approach to transforming sequences of data. |
1061 | 1069 |
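This lazy behavior can be made visible with a small experiment. The helper function \pythonil{square} and the list \pythonil{calls} are hypothetical names introduced only for this sketch. They record when the mapped function is actually invoked.

```python
calls: list[int] = []  # records the arguments that square has received


def square(x: int) -> int:
    """Square x and remember that we were called."""
    calls.append(x)
    return x * x


squares = map(square, [1, 2, 3])
calls_after_map = list(calls)    # still empty: nothing was computed yet

first = next(squares)            # computes exactly one element
calls_after_next = list(calls)

rest = list(squares)             # computes the remaining elements
calls_at_end = list(calls)
print(calls_after_map, calls_after_next, calls_at_end)
```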
|
1062 | 1070 | \gitLoadPython{iteration:zip}{}{iteration/zip.py}{}% |
|
1068 | 1076 | The output of \pytest\ executing the \pglspl{doctest} for the \pythonilIdx{zip} example from \cref{lst:iteration:zip}.}% |
1069 | 1077 | % |
1070 | 1078 | \begin{sloppypar}% |
1071 | | -As last example for sequence processing we play a bit with the \pythonilIdx{zip} function. |
1072 | | -This function accepts several \pythonilsIdx{Iterables} as argument and returns a new \pythonilIdx{Iterator} which steps through all of input iterables in synch, returning tuples of with one value of each of them. |
1073 | | -For example, \pythonil{zip([1, 2, 3], [\"a\", \"b\", \"c\"])} returns an \pythonilIdx{Iterator} that produces the the sequence~\pythonil{(1, \"a\")}, \pythonil{(2, \"b\")}, and \pythonil{(3, \"c\")}. |
| 1079 | +As the last example for sequence processing, we play a bit with the \pythonilIdx{zip} function~\cite{PSF:P3D:TPSL:BIF}. |
| 1080 | +This function accepts several \pythonilsIdx{Iterables} as arguments and returns a new \pythonilIdx{Iterator} which steps through all of the input iterables in sync, returning tuples with one value from each of them. |
| 1081 | +For example, \pythonil{zip([1, 2, 3], [\"a\", \"b\", \"c\"])} returns an \pythonilIdx{Iterator} that produces the sequence~\pythonil{(1, \"a\")}, \pythonil{(2, \"b\")}, and \pythonil{(3, \"c\")}. |
1074 | 1082 | Sometimes, the input \pythonilsIdx{Iterable} may be of different lengths. |
1075 | 1083 | To make sure that such an error is properly reported with a~\pythonilIdx{ValueError}, we must always supply the named argument~\pythonil{strict=True}~\cite{PEP618}.% |
1076 | 1084 | \end{sloppypar}% |
1077 | 1085 | % |
1078 | | -In \cref{lst:iteration:zip}, we use \pythonilIdx{zip} to implement a function \pythonil{distance} that computes the Euclidean distance of two $n$\nobreakdashes-dimensional vectors or points~\pythonil{p1} and~\pythonil{p2}. |
| 1086 | +In \programUrl{iteration:zip}, provided as \cref{lst:iteration:zip}, we use \pythonilIdx{zip} to implement a function \pythonil{distance} that computes the Euclidean distance of two $n$\nobreakdashes-dimensional vectors or points~\pythonil{p1} and~\pythonil{p2}. |
1079 | 1087 | The two points are supplied as \pythonilsIdx{Iterable} of either \pythonil{float} or \pythonil{int}. |
1080 | 1088 | We could, for example, provide them as \pythonils{lists}. |
1081 | 1089 | The Euclidean distance is defined as% |
|
1089 | 1097 | This is exactly what \pythonilIdx{zip} does. |
1090 | 1098 | If both points were provided as~\pythonils{list}, then \pythonil{zip(p1, p2, strict=True)} will, step by step, give us the tuples~\pythonil{(p1[0], p2[0])}, \pythonil{(p1[1], p2[1])}, {\dots}, until reaching the ends of the lists. |
1091 | 1099 | We can now write the generator expression~\pythonil{(a - b) ** 2 for a, b in zip(p1, p2, strict=True)}. |
1092 | | -It uses tuple expansion to extract the two elements~\pythonil{a} and \pythonil{b} from each of the tuples that \pythonilIdx{zip} creates. |
| 1100 | +It uses tuple unpacking to extract the two elements~\pythonil{a} and \pythonil{b} from each of the tuples that \pythonilIdx{zip} creates. |
1093 | 1101 | It then computes the square of the difference of these two elements. |
1094 | 1102 | By passing the generator expression to the~\pythonilIdx{sum} function as-is, we can get the sum of these squares. |
1095 | 1103 | Finally, the \pythonilIdx{sqrt} function from the \pythonilIdx{math} module completes the computation of the Euclidean distance as prescribed in \cref{eq:euclideanDistance}. |
|
1111 | 1119 | % |
1112 | 1120 | \hsection{Summary}% |
1113 | 1121 | Working with sequences is a very important aspect of \python\ programming. |
1114 | | -The programming language provides a simplified syntax for working with loops in form of list, set, and dict comprehension. |
| 1122 | +The programming language provides a simplified syntax for working with loops in the form of list, set, and dictionary comprehensions. |
1115 | 1123 | Different from comprehensions, generator expressions allow us to provide sequences of data that can be processed without storing all elements in memory first or at once. |
1116 | 1124 | Instead, the elements are created when needed. |
1117 | 1125 | If this creation of elements is more complicated than what simple generator expressions can, well, express, we can use generator functions. |
1118 | | -With their \pythonilIdx{yield} statement, they allow us to write functions that perform a computation, pass the result to their output, allow other code outside to process the result, and then resume with the generation of the next element. |
1119 | | -Finally, sequences of data can be processed by aggregating and transforming functions. |
1120 | | -These functions can process containers, comprehensions, generator expressions, and generators alike.% |
| 1126 | +With their \pythonilIdx{yield} statement, they allow us to write functions that perform a computation, pass the result to their output, and allow the calling code outside to process it. |
| 1127 | +Different from normal functions, these generator functions can then resume their execution until they return more results via \pythonilIdx{yield} or reach the end of their sequence. |
| 1128 | +Sequences of data can be processed by aggregating and transforming functions. |
| 1129 | +These functions can process containers, comprehensions, generator expressions, and generator functions alike. |
| 1130 | +This is possible because all of the sequence \pgls{API} boils down to two basic components: \pythonilIdx{Iterator} and \pythonilIdx{Iterable}. |
| 1131 | +An \pythonilIdx{Iterable} is an interface supported by any object whose elements can be accessed one-by-one. |
| 1132 | +An \pythonilIdx{Iterator} represents a single pass over these elements, i.e., one such access sequence.% |
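The difference between the two can be illustrated with a short sketch. The variable names are chosen only for this example.

```python
numbers = [1, 2, 3]            # a list is an Iterable

single_pass = iter(numbers)    # iter creates a new Iterator over the list
first_result = list(single_pass)
second_result = list(single_pass)   # the Iterator is already exhausted

fresh_result = list(iter(numbers))  # a new Iterator starts from the beginning
print(first_result, second_result, fresh_result)
```

The list itself can be traversed as often as we like, whereas each \pythonil{Iterator} obtained from it can be consumed only once.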
1121 | 1133 | \endhsection% |
1122 | 1134 | % |
1123 | 1135 | \endhsection% |
|