|
| 1 | +Use Keyed By |
| 2 | +============ |
| 3 | + |
| 4 | +Often fields in a task can depend on other values in the task. For example, a |
| 5 | +task's ``max-runtime`` may depend on the ``platform``. To handle this, you |
| 6 | +could re-define ``max-runtime`` in each task's definition like so: |
| 7 | + |
| 8 | +.. code-block:: yaml |
| 9 | + |
| 10 | +    tasks: |
| 11 | +      taskA: |
| 12 | +        platform: android |
| 13 | +        worker: |
| 14 | +          max-runtime: 7200 |
| 15 | + |
| 16 | +      taskB: |
| 17 | +        platform: ios |
| 18 | +        worker: |
| 19 | +          max-runtime: 7200 |
| 20 | + |
| 21 | +      taskC: |
| 22 | +        platform: windows |
| 23 | +        worker: |
| 24 | +          max-runtime: 3600 |
| 25 | + |
| 26 | +      taskD: |
| 27 | +        platform: mac |
| 28 | +        worker: |
| 29 | +          max-runtime: 1800 |
| 30 | + |
| 31 | +      ... |
| 32 | + |
| 33 | +This is simple, but if you have lots of tasks it's also tedious and makes |
| 34 | +updating the configuration a pain. To avoid this duplication you could use a |
| 35 | +:doc:`transform </concepts/transforms>`: |
| 36 | + |
| 37 | +.. code-block:: python |
| 38 | + |
| 39 | +    @transforms.add |
| 40 | +    def set_max_runtime(config, tasks): |
| 41 | +        for task in tasks: |
| 42 | +            if task["platform"] in ("android", "ios"): |
| 43 | +                task["worker"]["max-runtime"] = 7200 |
| 44 | +            elif task["platform"] == "windows": |
| 45 | +                task["worker"]["max-runtime"] = 3600 |
| 46 | +            else: |
| 47 | +                task["worker"]["max-runtime"] = 1800 |
| 48 | + |
| 49 | +            yield task |
| 50 | + |
| 51 | +This works, but the constants are now hardcoded in transform logic, far away |
| 52 | +from the task's original definition. It is also verbose, and it gets |
| 53 | +complicated if you want to change these constants per task. |
| 54 | + |
| 55 | +An Alternative Approach |
| 56 | +----------------------- |
| 57 | + |
| 58 | +Another way to accomplish the same thing is to use Taskgraph's "keyed by" |
| 59 | +feature. This can be used in combination with the ``task-defaults`` key to |
| 60 | +express the same logic directly in the ``kind.yml`` file: |
| 61 | + |
| 62 | +.. code-block:: yaml |
| 63 | + |
| 64 | +    task-defaults: |
| 65 | +      worker: |
| 66 | +        max-runtime: |
| 67 | +          by-platform: |
| 68 | +            (ios|android): 7200 |
| 69 | +            windows: 3600 |
| 70 | +            default: 1800 |
| 71 | + |
| 72 | +    tasks: |
| 73 | +      taskA: |
| 74 | +        platform: android |
| 75 | + |
| 76 | +      taskB: |
| 77 | +        platform: windows |
| 78 | + |
| 79 | +      taskC: |
| 80 | +        platform: mac |
| 81 | + |
| 82 | +      ... |
| 83 | + |
| 84 | + |
| 85 | +The structure under the ``by-platform`` key is resolved to a single value using |
| 86 | +the :func:`~taskgraph.util.schema.resolve_keyed_by` utility function. When |
| 87 | +"keying by" another attribute in the task, you must call this utility later on |
| 88 | +in a transform: |
| 89 | + |
| 90 | +.. code-block:: python |
| 91 | + |
| 92 | +    from taskgraph.util.schema import resolve_keyed_by |
| 93 | + |
| 94 | +    @transforms.add |
| 95 | +    def resolve_max_runtime(config, tasks): |
| 96 | +        for task in tasks: |
| 97 | +            resolve_keyed_by(task, "worker.max-runtime", f"Task {task['label']}") |
| 98 | +            yield task |
| 99 | + |
| 100 | +In this example, :func:`~taskgraph.util.schema.resolve_keyed_by` takes the root |
| 101 | +container object (here, the task), the subkey to operate on, and a descriptor |
| 102 | +that is included in any exception that gets raised. |
| 103 | +
|
| 104 | +Exact matches are used immediately. If no exact matches are found, each |
| 105 | +alternative is treated as a regular expression, matched against the whole |
| 106 | +value. Thus ``android.*`` would match ``android-arm/debug``. If nothing |
| 107 | +matches as a regular expression, but there is a ``default`` alternative, it is |
| 108 | +used. Otherwise, an exception is raised and graph generation stops. |
| 109 | +
|
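The matching order described above can be sketched in plain Python. This is a simplified illustration only, not Taskgraph's implementation: ``pick_alternative`` is a hypothetical helper, and details such as how several matching regular expressions are handled are assumptions (here the first match wins).

```python
import re


def pick_alternative(alternatives, value):
    """Hypothetical stand-in for the lookup order: exact, regex, default."""
    # 1. An exact match is used immediately.
    if value in alternatives:
        return alternatives[value]
    # 2. Otherwise each alternative is tried as a regular expression that
    #    must match the whole value (assumption: first match wins).
    for pattern, result in alternatives.items():
        if pattern != "default" and re.fullmatch(pattern, value):
            return result
    # 3. Fall back to the "default" alternative when one exists.
    if "default" in alternatives:
        return alternatives["default"]
    # 4. Nothing matched, so fail loudly.
    raise KeyError(f"no alternative matches {value!r}")


max_runtime = {"(ios|android)": 7200, "windows": 3600, "default": 1800}
android = pick_alternative(max_runtime, "android")  # regex match
windows = pick_alternative(max_runtime, "windows")  # exact match
mac = pick_alternative(max_runtime, "mac")          # falls back to default
```

Because the regular expression must cover the whole value, a key such as ``android`` on its own would not match ``android-arm/debug``, while ``android.*`` would.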
| 110 | +Passing Additional Context |
| 111 | +-------------------------- |
| 112 | +
|
| 113 | +By default, when you use the pattern ``by-<name>`` and then feed it into |
| 114 | +:func:`~taskgraph.util.schema.resolve_keyed_by`, ``<name>`` is assumed to be a |
| 115 | +valid top-level key in the task definition. However, sometimes you want to key |
| 116 | +by some other value that is either nested deeper in the task definition, or not |
| 117 | +even known ahead of time! |
| 118 | +
|
| 119 | +For this reason you can specify additional context via ``**kwargs``. Typically |
| 120 | +it will make the most sense to use this following a prior transform that sets |
| 121 | +some value that's not known statically. This comes up frequently when splitting |
| 122 | +a task from one definition into several. For example: |
| 123 | +
|
| 124 | +.. code-block:: yaml |
| 125 | + |
| 126 | +    tasks: |
| 127 | +      task: |
| 128 | +        platforms: [android, windows, mac] |
| 129 | +        worker: |
| 130 | +          max-runtime: |
| 131 | +            by-platform: |
| 132 | +              (ios|android): 7200 |
| 133 | +              windows: 3600 |
| 134 | +              default: 1800 |
| 135 | + |
| 136 | +.. code-block:: python |
| 137 | + |
| 138 | +    @transforms.add |
| 139 | +    def split_platforms(config, tasks): |
| 140 | +        for task in tasks: |
| 141 | +            for platform in task.pop("platforms"): |
| 142 | +                new_task = deepcopy(task)  # requires: from copy import deepcopy |
| 143 | +                # ... |
| 144 | +                resolve_keyed_by( |
| 145 | +                    new_task, |
| 146 | +                    "worker.max-runtime", |
| 147 | +                    task["label"], |
| 148 | +                    platform=platform, |
| 149 | +                ) |
| 150 | +                yield new_task |
| 151 | + |
| 152 | +Here we did not know the value of "platform" ahead of time, but it was still |
| 153 | +possible to use it in a "keyed by" statement thanks to the ability to pass in |
| 154 | +extra context. |
| 155 | +
|
| 156 | +.. note:: |
| 157 | + A good rule of thumb is to only consider using "keyed by" in |
| 158 | + ``task-defaults`` or in a task definition that will be split into many |
| 159 | + tasks down the line. |
| 160 | +
|
| 161 | +Specifying the Subkey |
| 162 | +--------------------- |
| 163 | +
|
| 164 | +The subkey in :func:`~taskgraph.util.schema.resolve_keyed_by` is expressed in |
| 165 | +dot path notation with each part of the path representing a nested dictionary. |
| 166 | +If any part of the subkey is a list, each item in the list will be operated on. |
| 167 | +For example, consider this excerpt of a task definition: |
| 168 | +
|
| 169 | +.. code-block:: yaml |
| 170 | + |
| 171 | +    worker: |
| 172 | +      artifacts: |
| 173 | +        - name: foo |
| 174 | +          path: |
| 175 | +            by-platform: |
| 176 | +              windows: foo.zip |
| 177 | +              default: foo.tar.gz |
| 178 | +        - name: bar |
| 179 | +          path: |
| 180 | +            by-platform: |
| 181 | +              windows: bar.zip |
| 182 | +              default: bar.tar.gz |
| 183 | + |
| 184 | +With the associated transform: |
| 185 | +
|
| 186 | +.. code-block:: python |
| 187 | + |
| 188 | +    @transforms.add |
| 189 | +    def resolve_artifact_paths(config, tasks): |
| 190 | +        for task in tasks: |
| 191 | +            resolve_keyed_by(task, "worker.artifacts.path", task["label"]) |
| 192 | +            yield task |
| 193 | + |
| 194 | +In this example, Taskgraph resolves ``by-platform`` in both the *foo* and *bar* |
| 195 | +artifacts. |
| 196 | +
|
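To make the list handling concrete, here is a minimal sketch of how a dot path can be walked so that every item of a list along the way is visited. It is an illustration under stated assumptions, not Taskgraph's code: ``apply_at_path`` and ``pick_by_platform`` are hypothetical helpers, and the lookup only does exact matching with a ``default`` fallback.

```python
def apply_at_path(container, subkey, func):
    """Hypothetical helper: replace the value at a dot path with func(value)."""
    # A list anywhere along the path means "do this for every item".
    if isinstance(container, list):
        for item in container:
            apply_at_path(item, subkey, func)
        return
    head, _, rest = subkey.partition(".")
    # A missing key (or a non-dict value) leaves the structure untouched.
    if not isinstance(container, dict) or head not in container:
        return
    if rest:
        apply_at_path(container[head], rest, func)
    else:
        container[head] = func(container[head])


def pick_by_platform(value, platform):
    """Hypothetical resolver: exact match on platform, else "default"."""
    if isinstance(value, dict) and "by-platform" in value:
        alternatives = value["by-platform"]
        return alternatives.get(platform, alternatives["default"])
    return value  # nothing keyed here, so leave the value unchanged


task = {
    "worker": {
        "artifacts": [
            {"name": "foo", "path": {"by-platform": {"windows": "foo.zip", "default": "foo.tar.gz"}}},
            {"name": "bar", "path": {"by-platform": {"windows": "bar.zip", "default": "bar.tar.gz"}}},
        ]
    }
}
apply_at_path(task, "worker.artifacts.path", lambda v: pick_by_platform(v, "windows"))
paths = [artifact["path"] for artifact in task["worker"]["artifacts"]]
```

Running the same call a second time changes nothing, because the paths are already plain strings by then.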
| 197 | +.. note:: |
| 198 | + Calling ``resolve_keyed_by`` on a subkey that doesn't contain a ``by-*`` |
| 199 | + field is a no-op. |
| 200 | +
|
| 201 | +Creating Schemas with Keyed By |
| 202 | +------------------------------ |
| 203 | +
|
| 204 | +Fields that may or may not be keyed by another field can cause problems for |
| 205 | +any schemas your transforms define. For that reason, Taskgraph provides |
| 206 | +the :func:`~taskgraph.util.schema.optionally_keyed_by` utility function. |
| 207 | +
|
| 208 | +It can be used to generate a valid schema that allows a field to either use |
| 209 | +"keyed by" or not. For example: |
| 210 | +
|
| 211 | +.. code-block:: python |
| 212 | + |
| 213 | +    from taskgraph.util.schema import Schema, optionally_keyed_by |
| 214 | +    from voluptuous import Optional |
| 215 | + |
| 216 | +    schema = Schema({ |
| 217 | +        # ... |
| 218 | +        Optional("worker"): { |
| 219 | +            Optional("max-run-time"): optionally_keyed_by("platform", int), |
| 220 | +        }, |
| 221 | +    }) |
| 222 | + |
| 223 | +    transforms.add_validate(schema) |
| 224 | + |
| 225 | +The example above allows both of the following task definitions: |
| 226 | +
|
| 227 | +.. code-block:: yaml |
| 228 | + |
| 229 | +    taskA: |
| 230 | +      worker: |
| 231 | +        max-run-time: 3600 |
| 232 | + |
| 233 | +    taskB: |
| 234 | +      worker: |
| 235 | +        max-run-time: |
| 236 | +          by-platform: |
| 237 | +            windows: 7200 |
| 238 | +            default: 3600 |
| 239 | + |
| 240 | +If a field may be keyed by more than one other field, list each of them |
| 241 | +before the type: |
| 242 | +
|
| 243 | +.. code-block:: python |
| 244 | + |
| 245 | +    Optional("max-run-time"): optionally_keyed_by("platform", "build-type", int) |
| 246 | + |
| 247 | + |
| 248 | +In this example either ``by-platform`` or ``by-build-type`` may be used. You |
| 249 | +may specify as many fields as you like this way, as long as the last argument to |
| 250 | +:func:`~taskgraph.util.schema.optionally_keyed_by` is the type of the field |
| 251 | +after resolving is finished (or if keyed by is unused). |