Commit 4e3cfb4

ahal authored and bhearsum committed
docs: add thorough documentation for resolve-keyed-by
1 parent ec911c8 commit 4e3cfb4

3 files changed: +252 -38 lines changed

docs/concepts/transforms.rst

Lines changed: 0 additions & 38 deletions
@@ -120,44 +120,6 @@ about the state of the tasks at given points. Here is an example:
 In the above example, we can be sure that every task dict has a string field
 called ``foo``, and may or may not have a boolean field called ``bar``.
 
-Keyed By
-........
-
-Fields in the input tasks can be "keyed by" another value in the task.
-For example, a task's ``max-runtime`` may be keyed by ``platform``.
-In the task, this looks like:
-
-.. code-block:: yaml
-
-   max-runtime:
-     by-platform:
-       android: 7200
-       windows: 3600
-       default: 1800
-
-This is a simple but powerful way to encode business rules in the tasks
-provided as input to the transforms, rather than expressing those rules in the
-transforms themselves. The structure is easily resolved to a single value
-using the :func:`~taskgraph.util.schema.resolve_keyed_by` utility function:
-
-.. code-block:: python
-
-   from taskgraph.util.schema import resolve_keyed_by
-
-   @transforms.add
-   def resolve_max_runtime(config, tasks):
-       for task in tasks:
-           # Note that task["label"] is not a standard key, use whatever best
-           # identifies your task at this stage of the transformation.
-           resolve_keyed_by(task, "max-runtime", task["label"])
-           yield task
-
-Exact matches are used immediately. If no exact matches are found, each
-alternative is treated as a regular expression, matched against the whole
-value. Thus ``android.*`` would match ``android-arm/debug``. If nothing
-matches as a regular expression, but there is a ``default`` alternative, it is
-used. Otherwise, an exception is raised and graph generation stops.
-
 Organization
 -------------

docs/howto/index.rst

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ A collection of how-to guides.
    run-locally
    debugging
    bootstrap-taskgraph
+   resolve-keyed-by
    use-fetches
    docker
    create-actions

docs/howto/resolve-keyed-by.rst

Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
+Use Keyed By
+============
+
+Often fields in a task can depend on other values in the task. For example, a
+task's ``max-runtime`` may depend on the ``platform``. To handle this, you
+could re-define ``max-runtime`` in each task's definition like so:
+
+.. code-block:: yaml
+
+   tasks:
+     taskA:
+       platform: android
+       worker:
+         max-runtime: 7200
+
+     taskB:
+       platform: ios
+       worker:
+         max-runtime: 7200
+
+     taskC:
+       platform: windows
+       worker:
+         max-runtime: 3600
+
+     taskD:
+       platform: mac
+       worker:
+         max-runtime: 1800
+
+     ...
+
+This is simple, but if you have lots of tasks it's also tedious and makes
+updating the configuration a pain. To avoid this duplication you could use a
+:doc:`transform </concepts/transforms>`:
+
+.. code-block:: python
+
+   @transforms.add
+   def set_max_runtime(config, tasks):
+       for task in tasks:
+           if task["platform"] in ("android", "ios"):
+               task["worker"]["max-runtime"] = 7200
+           elif task["platform"] == "windows":
+               task["worker"]["max-runtime"] = 3600
+           else:
+               task["worker"]["max-runtime"] = 1800
+
+           yield task
+
+This works, but now we've hardcoded constants into our code logic far away from
+the task's original definition! Besides, this is pretty verbose, and it can get
+complicated if you want to be able to change these constants per task.
+
+An Alternative Approach
+-----------------------
+
+Another way to accomplish the same thing is to use Taskgraph's "keyed by"
+feature. This can be used in combination with the ``task-defaults`` key to
+express the same logic directly in the ``kind.yml`` file:
+
+.. code-block:: yaml
+
+   task-defaults:
+     worker:
+       max-runtime:
+         by-platform:
+           (ios|android): 7200
+           windows: 3600
+           default: 1800
+
+   tasks:
+     taskA:
+       platform: android
+
+     taskB:
+       platform: windows
+
+     taskC:
+       platform: mac
+
+     ...
+
+
+The structure under the ``by-platform`` key is resolved to a single value using
+the :func:`~taskgraph.util.schema.resolve_keyed_by` utility function. When
+"keying by" another attribute in the task, you must call this utility later on
+in a transform:
+
+.. code-block:: python
+
+   from taskgraph.util.schema import resolve_keyed_by
+
+   @transforms.add
+   def resolve_max_runtime(config, tasks):
+       for task in tasks:
+           resolve_keyed_by(task, "worker.max-runtime", f"Task {task['label']}")
+           yield task
+
+In this example, :func:`~taskgraph.util.schema.resolve_keyed_by` takes the root
+container object (i.e., the task), the subkey to operate on, and a descriptor
+that will be used in any exceptions that get raised.
+
+Exact matches are used immediately. If no exact matches are found, each
+alternative is treated as a regular expression, matched against the whole
+value. Thus ``android.*`` would match ``android-arm/debug``. If nothing
+matches as a regular expression, but there is a ``default`` alternative, it is
+used. Otherwise, an exception is raised and graph generation stops.
+
+Passing Additional Context
+--------------------------
+
+By default, when you use the pattern ``by-<name>`` and then feed it into
+:func:`~taskgraph.util.schema.resolve_keyed_by`, ``<name>`` is assumed to be a
+valid top-level key in the task definition. However, sometimes you want to key
+by some other value that is either nested deeper in the task definition, or not
+even known ahead of time!
+
+For this reason, you can specify additional context via ``**kwargs``. Typically
+it will make the most sense to use this following a prior transform that sets
+some value that's not known statically. This comes up frequently when splitting
+a task from one definition into several. For example:
+
+.. code-block:: yaml
+
+   tasks:
+     task:
+       platforms: [android, windows, mac]
+       worker:
+         max-runtime:
+           by-platform:
+             (ios|android): 7200
+             windows: 3600
+             default: 1800
+
+.. code-block:: python
+
+   @transforms.add
+   def split_platforms(config, tasks):
+       for task in tasks:
+           for platform in task.pop("platforms"):
+               new_task = deepcopy(task)
+               # ...
+               resolve_keyed_by(
+                   new_task,
+                   "worker.max-runtime",
+                   task["label"],
+                   platform=platform,
+               )
+               yield new_task
+
+Here we did not know the value of "platform" ahead of time, but it was still
+possible to use it in a "keyed by" statement thanks to the ability to pass in
+extra context.
+
+.. note::
+   A good rule of thumb is to only consider using "keyed by" in
+   ``task-defaults`` or in a task definition that will be split into many
+   tasks down the line.
+
+Specifying the Subkey
+---------------------
+
+The subkey in :func:`~taskgraph.util.schema.resolve_keyed_by` is expressed in
+dot path notation with each part of the path representing a nested dictionary.
+If any part of the subkey is a list, each item in the list will be operated on.
+For example, consider this excerpt of a task definition:
+
+.. code-block:: yaml
+
+   worker:
+     artifacts:
+       - name: foo
+         path:
+           by-platform:
+             windows: foo.zip
+             default: foo.tar.gz
+       - name: bar
+         path:
+           by-platform:
+             windows: bar.zip
+             default: bar.tar.gz
+
+With the associated transform:
+
+.. code-block:: python
+
+   @transforms.add
+   def resolve_artifact_paths(config, tasks):
+       for task in tasks:
+           resolve_keyed_by(task, "worker.artifacts.path", task["label"])
+           yield task
+
+In this example, Taskgraph resolves ``by-platform`` in both the *foo* and *bar*
+artifacts.
+
+.. note::
+   Calling ``resolve_keyed_by`` on a subkey that doesn't contain a ``by-*``
+   field is a no-op.
+
+Creating Schemas with Keyed By
+------------------------------
+
+Having fields of a task that may or may not be keyed by another field can cause
+problems for any schemas your transforms define. For that reason, Taskgraph provides
+the :func:`~taskgraph.util.schema.optionally_keyed_by` utility function.
+
+It can be used to generate a valid schema that allows a field to either use
+"keyed by" or not. For example:
+
+.. code-block:: python
+
+   from taskgraph.util.schema import Schema, optionally_keyed_by
+
+
+   schema = Schema({
+       # ...
+       Optional("worker"): {
+           Optional("max-run-time"): optionally_keyed_by("platform", int),
+       },
+   })
+
+   transforms.add_validate(schema)
+
+The example above allows both of the following task definitions:
+
+.. code-block:: yaml
+
+   taskA:
+     worker:
+       max-run-time: 3600
+
+   taskB:
+     worker:
+       max-run-time:
+         by-platform:
+           windows: 7200
+           default: 3600
+
+If there is more than one field that another field may be keyed by, it
+can be specified like this:
+
+.. code-block:: python
+
+   Optional("max-run-time"): optionally_keyed_by("platform", "build-type", int)
+
+
+In this example either ``by-platform`` or ``by-build-type`` may be used. You
+may specify as many fields as you like this way, as long as the last argument to
+:func:`~taskgraph.util.schema.optionally_keyed_by` is the type of the field
+after resolving is finished (or if keyed by is unused).
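
The resolution behaviour described in the new how-to can be illustrated with a small standalone sketch. It covers the documented matching order (exact match, then whole-value regular expression, then ``default``), dot-path subkeys that fan out over lists, and extra context passed as keyword arguments. This is not Taskgraph's implementation: ``resolve_keyed_by_sketch`` and its helpers are hypothetical names, and anything the documentation leaves unspecified (which pattern wins when several match, extra context taking precedence over a task key of the same name) is an assumption made only for illustration.

```python
import re


def pick_alternative(alternatives, value, description):
    """Choose one alternative: exact match, then whole-value regex, then default."""
    if value in alternatives:
        return alternatives[value]
    for pattern, result in alternatives.items():
        # Assumption: the first pattern matching the whole value wins.
        if pattern != "default" and re.fullmatch(pattern, value):
            return result
    if "default" in alternatives:
        return alternatives["default"]
    raise KeyError(f"{description}: no alternative matches {value!r}")


def resolve_value(value, task, description, extra):
    """Collapse single-key ``by-<name>`` dicts until a plain value remains."""
    while isinstance(value, dict) and len(value) == 1:
        (key,) = value
        if not key.startswith("by-"):
            break
        name = key[len("by-"):]
        # Assumption: extra context takes precedence over the task's own key.
        lookup = extra[name] if name in extra else task[name]
        value = pick_alternative(value[key], lookup, description)
    return value


def resolve_keyed_by_sketch(task, subkey, description, **extra):
    """Resolve ``subkey`` (a dot path) in place; lists along the path fan out."""

    def walk(container, parts):
        if isinstance(container, list):
            for element in container:
                walk(element, parts)
            return
        head, rest = parts[0], parts[1:]
        if not isinstance(container, dict) or head not in container:
            return  # nothing to resolve here, so this is a no-op
        if rest:
            walk(container[head], rest)
        else:
            container[head] = resolve_value(container[head], task, description, extra)

    walk(task, subkey.split("."))
    return task


task = {
    "label": "build-android",
    "platform": "android-arm/debug",
    "worker": {
        "max-runtime": {
            "by-platform": {"android.*": 7200, "windows": 3600, "default": 1800}
        },
        "artifacts": [
            {"name": "foo", "path": {"by-platform": {"windows": "foo.zip", "default": "foo.tar.gz"}}},
            {"name": "bar", "path": {"by-platform": {"windows": "bar.zip", "default": "bar.tar.gz"}}},
        ],
    },
}
resolve_keyed_by_sketch(task, "worker.max-runtime", task["label"])
resolve_keyed_by_sketch(task, "worker.artifacts.path", task["label"])

# A task with no "platform" key of its own: the value arrives as extra context.
split_task = {
    "label": "task",
    "worker": {
        "max-runtime": {
            "by-platform": {"(ios|android)": 7200, "windows": 3600, "default": 1800}
        }
    },
}
resolve_keyed_by_sketch(split_task, "worker.max-runtime", split_task["label"], platform="windows")
```

In this sketch the regular expression ``android.*`` selects the first task's ``max-runtime``, both artifact paths fall through to ``default``, and the second task is resolved purely from the ``platform`` keyword argument.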

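As a rough illustration of the value shapes that the ``optionally_keyed_by`` section describes, the following standalone check accepts either a plain value of the final type or a ``by-<field>`` mapping whose alternatives are themselves acceptable. The function name ``accepts_optionally_keyed_by`` is hypothetical, it does not use Taskgraph or its schema library, and allowing nested ``by-*`` mappings is an assumption rather than something the documentation states.

```python
def accepts_optionally_keyed_by(value, fields, value_type):
    """Return True for a plain ``value_type`` or a ``by-<field>`` mapping of them."""
    if isinstance(value, value_type):
        return True
    if isinstance(value, dict) and len(value) == 1:
        (key,) = value
        allowed_keys = {f"by-{field}" for field in fields}
        if key in allowed_keys and isinstance(value[key], dict):
            # Assumption: each alternative may itself be keyed by an allowed field.
            return all(
                accepts_optionally_keyed_by(alternative, fields, value_type)
                for alternative in value[key].values()
            )
    return False


plain = 3600
keyed = {"by-platform": {"windows": 7200, "default": 3600}}
by_build_type = {"by-build-type": {"debug": 7200, "default": 3600}}

plain_ok = accepts_optionally_keyed_by(plain, ["platform"], int)
keyed_ok = accepts_optionally_keyed_by(keyed, ["platform"], int)
unlisted_field_ok = accepts_optionally_keyed_by(by_build_type, ["platform"], int)
listed_field_ok = accepts_optionally_keyed_by(by_build_type, ["platform", "build-type"], int)
```

Here ``by-build-type`` is rejected when only ``platform`` is listed and accepted once ``build-type`` is added, mirroring the two-field example in the diff.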