@Eijebong Eijebong commented Oct 17, 2025

The previous implementation could visit nodes multiple times when a graph had diamond-like patterns. The new implementation uses [Kahn's algorithm](https://en.wikipedia.org/wiki/Topological_sorting#Kahn's_algorithm), which ensures that we only ever visit each node once.
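As a rough illustration of the idea (the names and signatures below are hypothetical, not taskgraph's actual API), Kahn's algorithm tracks each node's remaining dependencies and only emits a node once all of them have been emitted, so a diamond never causes a repeat visit:

```python
from collections import deque

def visit_postorder_kahn(nodes, edges):
    """Yield nodes in dependency order (dependencies before dependents)
    using Kahn's algorithm. Each node is visited exactly once, even when
    the graph contains diamond-shaped patterns."""
    # `edges` is an iterable of (left, right) pairs meaning "left depends on right".
    links = {node: set() for node in nodes}     # node -> its dependencies
    reverse = {node: set() for node in nodes}   # node -> its dependents
    for left, right in edges:
        links[left].add(right)
        reverse[right].add(left)

    # Seed the queue with nodes that have no dependencies.
    queue = deque(node for node in nodes if not links[node])
    while queue:
        node = queue.popleft()
        yield node
        # Releasing `node` may unblock its dependents.
        for dependent in reverse[node]:
            links[dependent].discard(node)
            if not links[dependent]:
                queue.append(dependent)
```

On a diamond (`a` depends on `b` and `c`, which both depend on `d`), this yields `d` first, then `b` and `c`, then `a` — four visits total, where a naive recursive walk could visit `d` twice.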

I tested this on firefox-main with `./mach taskgraph full -J -p taskcluster/test/params/mc-onpush.yml`. Before this change, with a profiler attached, it took ~3 min 55 s. After this change, we're down to ~2 min 30 s. Without profiling we go from 22 s to 17 s.

I added a `links_and_reverse_links_dict` helper here since computing both at the same time is faster than iterating over the graph twice. In my testing the function was called 11 times, 10 of which were on the exact same graph. Because that function is very expensive to call, I chose to cache it.
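The one-pass idea can be sketched as follows; the real helper's signature and return types may differ, this only shows why a single iteration over the edge set suffices to build both maps:

```python
def links_and_reverse_links_dict(nodes, edges):
    """Build both the forward and reverse adjacency maps in a single pass
    over the edge list, instead of iterating the graph once per direction."""
    links = {node: set() for node in nodes}          # node -> nodes it depends on
    reverse_links = {node: set() for node in nodes}  # node -> nodes depending on it
    for left, right in edges:
        links[left].add(right)
        reverse_links[right].add(left)
    return links, reverse_links
```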

The caching itself saved a further ~5–6 s when profiled on the full graph for firefox, and ~1 s when not profiled.

Hyperfine output, full firefox graph, no profiler attached:

```
Benchmark 1: baseline
  Time (mean ± σ):     22.565 s ±  0.260 s    [User: 53.205 s, System: 5.034 s]
  Range (min … max):   22.255 s … 23.043 s    10 runs

Benchmark 2: with-patch
  Time (mean ± σ):     19.032 s ±  0.110 s    [User: 49.556 s, System: 4.829 s]
  Range (min … max):   18.856 s … 19.195 s    10 runs

Benchmark 3: with-patch-cache
  Time (mean ± σ):     17.800 s ±  0.250 s    [User: 48.605 s, System: 4.893 s]
  Range (min … max):   17.281 s … 18.159 s    10 runs

Summary
  with-patch-cache ran
    1.07 ± 0.02 times faster than with-patch
    1.27 ± 0.02 times faster than baseline
```

@Eijebong Eijebong requested a review from a team as a code owner October 17, 2025 10:34
@Eijebong Eijebong requested a review from ahal October 17, 2025 10:34
@Eijebong Eijebong force-pushed the faster-graph-traversal branch 7 times, most recently from 0d85f23 to ceb835e on October 17, 2025 13:29
Because type annotations in Python are an "I wish people would respect my
authority" kind of feature, we cannot guarantee that a `Graph` is
actually created with frozensets as attributes; throughout taskgraph
we pass plain sets. By forcing those attributes to be frozensets we can
ensure that `Graph` is hashable, which allows us to use
`functools.lru_cache` to cache some expensive functions here.

Unfortunately, there's no easy way of transforming the inputs passed
into a dataclass, since in both `__init__` and `__post_init__` the
structure is already frozen. By using a dataclass as the parent of
`Graph` instead of making `Graph` the dataclass itself, we can work
around the issue while keeping the frozenness of the dataclass after
init, without having to fight the dataclass machinery when transforming
the attributes.
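The parent-dataclass pattern can be sketched like this; the class and function names are illustrative, not the actual taskgraph code. The frozen dataclass parent supplies the generated `__eq__`/`__hash__`, while the plain subclass's `__init__` coerces whatever iterables callers pass into frozensets before delegating:

```python
from dataclasses import dataclass
from functools import lru_cache

@dataclass(frozen=True)
class _GraphFields:
    nodes: frozenset
    edges: frozenset

class Graph(_GraphFields):
    def __init__(self, nodes, edges):
        # Coerce whatever iterables callers pass (typically plain sets)
        # into frozensets before the frozen dataclass stores them, so
        # every Graph instance is hashable.
        super().__init__(frozenset(nodes), frozenset(edges))

@lru_cache(maxsize=None)
def expensive_traversal(graph):
    # A hashable Graph lets lru_cache memoize per-graph results;
    # the body here is a stand-in for the real expensive computation.
    return len(graph.nodes) + len(graph.edges)
```

Two `Graph` instances built from equal sets compare equal and hash identically, so the second `expensive_traversal` call for an equivalent graph is a cache hit.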
@Eijebong Eijebong force-pushed the faster-graph-traversal branch from ceb835e to 338c4f9 on October 17, 2025 13:31
@jcristau jcristau left a comment
Very nice :)

@Eijebong Eijebong merged commit b713497 into taskcluster:main Oct 17, 2025
16 checks passed