Skip to content

Contractor rewrite#7442

Open
MarcelloPerathoner wants to merge 3 commits intoProject-OSRM:masterfrom
MarcelloPerathoner:contractor-rewrite
Open

Contractor rewrite#7442
MarcelloPerathoner wants to merge 3 commits intoProject-OSRM:masterfrom
MarcelloPerathoner:contractor-rewrite

Conversation

@MarcelloPerathoner
Copy link
Copy Markdown
Contributor

Issue

The shortest distance between any two nodes must be invariant under contraction. We must also keep invariant the shortest loop distance from any node back to itself. This requirement arises for us from the need to "go around" if source and target are on the same node, with source downstream from target. For this reason we must insert self-loops whenever this is the shortest path from self to self.

The current implementation only inserts self-loops for one-way streets, which is insufficient. Any road can be turned into an implicit one-way by any combination of turn restrictions, bearings, or continue-straight-at-waypoints.

Tasklist

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Reworks the contraction-hierarchy (CH) contractor to preserve shortest-path invariants for node-to-node distances and node-to-self loop distances by inserting self-loop shortcuts when appropriate.

Changes:

  • Refactors the contractor implementation (parallel independent-node contraction, shortcut insertion, core handling) and updates APIs/call sites accordingly.
  • Extends routing/assembly logic to consider self-loops in edge-based routing, and adds/adjusts unit + BDD tests.
  • Adds heap/storage enhancements (LinearHashStorage option, IncreaseKey support) and introduces new debug/benchmark tooling/logging.

Reviewed changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
unit_tests/util/query_heap.cpp Expands heap storage backends tested (adds LinearHashStorage) and tweaks test sizes.
unit_tests/contractor/graph_contractor.cpp Updates tests for new contractor APIs and adds assertions for self-loop behavior.
src/tools/routed.cpp Makes std::exception handler unconditional (not Windows-only).
src/engine/routing_algorithms/routing_base_ch.cpp Adds debug logging and keeps packed-leg handling for loops.
src/contractor/graph_contractor.cpp Major contractor rewrite: parallel independent contraction, shortcut insertion, self-loop handling, updated TBB usage, instrumentation.
src/contractor/contractor.cpp Updates contraction pipeline to new contractor APIs (drops node_weights from calls).
src/contractor/contractor_search.cpp Updates contractor search to new heap data model (bool target flag) and new search signature.
scripts/debug/dump_hsgr.py New script to inspect .ebg/.hsgr and optionally emit DOT graphs.
scripts/contractor_benchmark.py Adds benchmark script documentation header.
package-lock.json Updates npm dependencies (brace-expansion, yaml).
include/util/query_heap.hpp Adds IncreaseKey support.
include/util/dynamic_graph.hpp Allows Renumber({}) to mean “compress edges only” without node renumbering.
include/util/d_ary_heap.hpp Adds heap “increase” operation.
include/engine/routing_algorithms/routing_base_ch.hpp Adjusts routing step logic to account for self-loops; adds debug logging in debug builds.
include/engine/guidance/assemble_steps.hpp Adds debug logging around RouteStep creation; adds assertion debugging.
include/engine/guidance/assemble_leg.hpp Adds util/log include (currently unused).
include/contractor/graph_contractor.hpp Updates contractor APIs (removes node_weights parameters).
include/contractor/contractor_search.hpp Updates search signature (adds start node parameter).
include/contractor/contractor_heap.hpp Simplifies heap payload to a boolean “is target” flag.
include/contractor/contractor_graph.hpp Changes edge id type to NodeID and documents originalEdges meaning.
features/support/run.js Uses close instead of exit for child process completion and clarifies log messages.
features/lib/osrm_loader.js Uses close instead of exit and logs stdout; drains streams on close in one loader.
features/foot/distance.feature Adds additional distance-metric routing assertions (more pair directions).
features/bicycle/distance.feature Adds additional distance-metric routing assertions (more pair directions).
Comments suppressed due to low confidence (1)

include/contractor/contractor_search.hpp:21

  • The parameter is named contractable here, but the implementation (src/contractor/contractor_search.cpp) uses the same argument as a “contractible” mask (nodes allowed to be traversed). Renaming this parameter to contractible (and matching call sites) would reduce confusion, especially now that both spellings appear in nearby code.
void search(ContractorHeap &heap,
            const ContractorGraph &graph,
            const NodeID start,
            const std::vector<bool> &contractable,
            const unsigned number_of_targets,
            const int node_limit,
            const EdgeWeight weight_limit,
            const NodeID forbidden_node);

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +15 to +22
#include <oneapi/tbb/blocked_range.h>
#include <oneapi/tbb/concurrent_priority_queue.h>
#include <oneapi/tbb/concurrent_vector.h>
#include <oneapi/tbb/enumerable_thread_specific.h>
#include <oneapi/tbb/parallel_for.h>
#include <oneapi/tbb/parallel_for_each.h>
#include <oneapi/tbb/parallel_invoke.h>
#include <oneapi/tbb/parallel_sort.h>
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file switches to <oneapi/tbb/...> headers while the rest of the codebase still includes classic <tbb/...> (e.g. src/contractor/contractor.cpp, include/contractor/contracted_edge_container.hpp). This inconsistency can break builds depending on which TBB header layout is available. Please align to the project’s chosen include path (either keep <tbb/...> here, or migrate the rest of the repo/build config consistently).

Suggested change
#include <oneapi/tbb/blocked_range.h>
#include <oneapi/tbb/concurrent_priority_queue.h>
#include <oneapi/tbb/concurrent_vector.h>
#include <oneapi/tbb/enumerable_thread_specific.h>
#include <oneapi/tbb/parallel_for.h>
#include <oneapi/tbb/parallel_for_each.h>
#include <oneapi/tbb/parallel_invoke.h>
#include <oneapi/tbb/parallel_sort.h>
#include <tbb/blocked_range.h>
#include <tbb/concurrent_priority_queue.h>
#include <tbb/concurrent_vector.h>
#include <tbb/enumerable_thread_specific.h>
#include <tbb/parallel_for.h>
#include <tbb/parallel_for_each.h>
#include <tbb/parallel_invoke.h>
#include <tbb/parallel_sort.h>

Copilot uses AI. Check for mistakes.
Comment on lines +541 to +549
/** A heap kept in thread-local storage to avoid multiple recreations of it. */
ContractorHeap heap_exemplar(8000);
ThreadData thread_data(heap_exemplar);
/** Nodes still waiting for contraction. Not all of them will be contracted though. */
tbb::concurrent_vector<NodeID> remaining_nodes;

std::size_t number_of_contracted_nodes = 0;

const unsigned int number_of_nodes = graph.GetNumberOfNodes();
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ContractorHeap heap_exemplar(8000) hard-codes the backing LinearHashStorage capacity. Since the search can insert far more than 8000 distinct nodes (node_limit is based on settled nodes, while relaxations can discover many more), the fixed-size linear-probing table can fill up and then probe forever. Size this from the graph/search bounds (e.g., graph node count or a safe upper bound derived from node_limit) or use a dynamically growing storage for safety.

Suggested change
/** A heap kept in thread-local storage to avoid multiple recreations of it. */
ContractorHeap heap_exemplar(8000);
ThreadData thread_data(heap_exemplar);
/** Nodes still waiting for contraction. Not all of them will be contracted though. */
tbb::concurrent_vector<NodeID> remaining_nodes;
std::size_t number_of_contracted_nodes = 0;
const unsigned int number_of_nodes = graph.GetNumberOfNodes();
const unsigned int number_of_nodes = graph.GetNumberOfNodes();
/** A heap kept in thread-local storage to avoid multiple recreations of it. */
ContractorHeap heap_exemplar(std::max<std::size_t>(1, number_of_nodes));
ThreadData thread_data(heap_exemplar);
/** Nodes still waiting for contraction. Not all of them will be contracted though. */
tbb::concurrent_vector<NodeID> remaining_nodes;
std::size_t number_of_contracted_nodes = 0;

Copilot uses AI. Check for mistakes.
Comment thread src/contractor/graph_contractor.cpp Outdated
Comment on lines +431 to +434
// ... but found edge has smaller weight, update it.
if (edge.data.weight < current_data.weight ||
edge.data.duration < current_data.duration ||
edge.data.distance < current_data.distance)
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InsertEdges treats an existing edge as a duplicate based only on direction flags and then replaces it if any of weight/duration/distance is smaller. This can select an edge with a larger routing weight just because duration or distance is smaller, changing shortest-path behavior. The duplicate-selection criteria should be based on the routing weight (and then keep the corresponding duration/distance for that same edge), rather than minimizing different metrics independently.

Suggested change
// ... but found edge has smaller weight, update it.
if (edge.data.weight < current_data.weight ||
edge.data.duration < current_data.duration ||
edge.data.distance < current_data.distance)
// ... but found edge has smaller routing weight, update it.
// Use duration and distance only as deterministic tie-breakers so
// that all metrics come from the same selected edge.
if (edge.data.weight < current_data.weight ||
(edge.data.weight == current_data.weight &&
(edge.data.duration < current_data.duration ||
(edge.data.duration == current_data.duration &&
edge.data.distance < current_data.distance))))

Copilot uses AI. Check for mistakes.
Comment on lines +31 to +32
#define SELF_LOOPS

Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#define SELF_LOOPS is unconditionally enabled, which makes the #ifndef SELF_LOOPS branches dead code and hides the intended behavior behind a compile-time switch that isn’t wired to any build option. Consider removing the macro and the conditional compilation (always support self-loops), or make it a proper build/config option if it must remain togglable.

Suggested change
#define SELF_LOOPS

Copilot uses AI. Check for mistakes.
Comment on lines 156 to +158

util::Log(logDEBUG) << "pushing RouteStep of duration "
<< from_alias<double>(segment_duration) / 10.;
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This debug log runs in the main step-assembly loop. Constructing util::Log(logDEBUG) always takes the global log mutex (see src/util/log.cpp:70), so this adds overhead even when debug logging is disabled. Please guard this behind #ifdef ENABLE_DEBUG_LOGGING / #ifndef NDEBUG, or check the log policy before constructing the logger.

Suggested change
util::Log(logDEBUG) << "pushing RouteStep of duration "
<< from_alias<double>(segment_duration) / 10.;
#if defined(ENABLE_DEBUG_LOGGING) || !defined(NDEBUG)
util::Log(logDEBUG) << "pushing RouteStep of duration "
<< from_alias<double>(segment_duration) / 10.;
#endif

Copilot uses AI. Check for mistakes.
Comment on lines +135 to +137
util::Log(logDEBUG) << "Search completed with middle node " << middle << " at weight "
<< weight;

Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

util::Log(logDEBUG) here executes on every CH search and always acquires the global log mutex during construction (even if debug is muted). This can cause a noticeable per-request performance hit. Please wrap this behind #ifdef ENABLE_DEBUG_LOGGING / #ifndef NDEBUG, or check the logging policy before constructing the logger.

Copilot uses AI. Check for mistakes.
Comment on lines +159 to +166
{
auto log = util::Log(logDEBUG);
log << " and packed_leg of size " << packed_leg.size() << " with nodes";
for (auto node : packed_leg)
{
log << " " << node;
}
}
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block logs every node in packed_leg and constructs a util::Log(logDEBUG) unconditionally. Even with debug disabled, logger construction takes a mutex, and iterating packed_leg can be expensive. Please compile-guard this (e.g. #ifdef ENABLE_DEBUG_LOGGING / #ifndef NDEBUG) or avoid building the log message unless debug logging is actually enabled.

Copilot uses AI. Check for mistakes.
#include "engine/guidance/route_step.hpp"
#include "engine/internal_route_result.hpp"
#include "util/coordinate_calculation.hpp"
#include "util/log.hpp"
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

util/log.hpp is included but not used anywhere in this header (no util::Log references). Please drop the include to avoid unnecessary rebuilds and keep header dependencies minimal.

Suggested change
#include "util/log.hpp"

Copilot uses AI. Check for mistakes.
Comment thread scripts/debug/dump_hsgr.py Outdated
@@ -0,0 +1,283 @@
"""
Contractior debugging script.
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in the module docstring: “Contractior” should be “Contractor”.

Suggested change
Contractior debugging script.
Contractor debugging script.

Copilot uses AI. Check for mistakes.
Comment on lines 19 to 23
std::vector<bool> contractGraph(ContractorGraph &graph,
std::vector<bool> node_is_uncontracted,
std::vector<bool> node_is_contractable,
std::vector<EdgeWeight> node_weights,
double core_factor = 1.0);

Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this API, the boolean vector is still named node_is_contractable, but the rest of the PR consistently uses “contractible” (is_contractible, node_is_contractible_, etc.). Renaming the parameter improves consistency and avoids having both spellings in the public header.

Copilot uses AI. Check for mistakes.
@MarcelloPerathoner
Copy link
Copy Markdown
Contributor Author

I did some preliminary benchmarks. The added self-loops increase the no. of edges by ~3%. That in turn increases search time by ~8%. Surprisingly the contractor now uses far less memory ~57%.

Contractor benchmark

log edges norm time (s) norm mem (MB) norm
/tmp/origin-master-germany.log 96173352 1.000 883.02 1.000 13755 1.000
/tmp/contractor-rewrite-germany.log 99390191 1.033 853.71 0.967 7836 0.570

Routed benchmark (requests routes between randomly selected big cities in Germany)

log time (ms) norm distance norm
/tmp/routed-origin-master-germany.log 2.97 1.000 323295.350 1.000
/tmp/routed-contractor-rewrite-germany.log 3.21 1.082 323295.350 1.000

@MarcelloPerathoner
Copy link
Copy Markdown
Contributor Author

These two embarassingly simple tests fail in the current implementation:

https://github.com/MarcelloPerathoner/osrm-backend/blob/c768db25513a870e4e57eee908c066bb75a6cc15/features/testbot/self_loop.feature#L6-L45

Note that there are no one-way streets involved.

To illustrate the problem, I show the contraction of the single bidirectional road segment:

With the current implementation:

dot

This contraction is clearly incorrect because it leaves no edge to exit node 1. If source and target phantoms are both on node 1, and source is downstream from target, there is no way for either forward nor backward search to leave node 1. No route can be found. (The bidirectional edge is stored on node 0 only.)

With this PR:

dot

With the added self-loop, the forward search can now leave node 1 and return to it.

@DennisOSRM
Copy link
Copy Markdown
Collaborator

Could you elaborate on the reason for this PR? I gave probably missed something, but what behavior does it correct?

@MarcelloPerathoner
Copy link
Copy Markdown
Contributor Author

The current implementation does not find routes that obviously exist. The reason is that the contractor does not always preserve the shortest path from a node back to itself.

The contractor inserts a self-loop only if the node represents a one-way street. That is insufficient, as the problem does surface most easily with one-way streets, but is not in any way limited to them.

The above mentioned test scenarios show how obvious routes are missed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants