Spanning tree new process and driver by cvvergara · Pull Request #3073 · pgRouting/pgrouting

cvvergara · 2026-02-19T22:45:18Z

Fixes #3075 .

Changes proposed in this pull request:

Creates and use a spanning tree process and driver

@pgRouting/admins

Summary by CodeRabbit

Release Notes

Refactor
- Reorganized spanning tree algorithms (Kruskal and Prim) with improved internal architecture
- Unified spanning tree computation drivers for consistent processing across algorithms
- Refactored graph traversal methods (DFS and BFS) with standardized root vertex handling

- Using set instead of vector for roots

…unctions

coderabbitai · 2026-02-19T22:45:40Z

Walkthrough

This PR refactors spanning tree algorithms (Prim, Kruskal, BFS, DFS, driving distance) by creating a centralized process and driver infrastructure. It moves core algorithm implementations from the functions namespace to algorithms, removes individual driver files, creates unified pgr_process_spanningTree and do_spanningTree entry points, updates public signatures to use set<int64_t> for roots, and adds enum-based algorithm selection via expanded Which enum values.

Changes

Cohort / File(s)	Summary
Spanning Tree Algorithm Implementations `include/spanningTree/kruskal.hpp`, `include/spanningTree/mst.hpp`, `include/spanningTree/prim.hpp`	Moved class definitions from functions to algorithms namespace; updated method signatures to accept `const std::set<int64_t>&` roots instead of vectors; refactored internal state management and added private helper functions.
Spanning Tree Source Files `src/spanningTree/kruskal.cpp`, `src/spanningTree/prim.cpp`, `src/spanningTree/spanningTree_driver.cpp`, `src/spanningTree/spanningTree_process.cpp`	Added new wrapper functions in functions namespace delegating to algorithms; created unified driver with enum-based algorithm selection; added centralized process handler with SQL parsing and error management.
Deleted Spanning Tree Drivers `src/spanningTree/kruskal_driver.cpp`, `src/spanningTree/prim_driver.cpp`, `src/spanningTree/mst_common.cpp`, `src/spanningTree/details.cpp`	Removed legacy driver implementations and utility functions; consolidation into new unified process/driver pattern.
Traversal Algorithm Updates `include/traversal/depthFirstSearch.hpp`, `src/traversal/depthFirstSearch.c`, `src/traversal/depthFirstSearch_driver.cpp`	Moved class from functions to algorithms namespace; updated to use `set<int64_t>` roots; integrated into unified spanning tree process; removed dedicated driver.
Breadth-First Search Updates `include/breadthFirstSearch/breadthFirstSearch.hpp`, `src/breadthFirstSearch/breadthFirstSearch.c`, `src/breadthFirstSearch/breadthFirstSearch_driver.cpp`	Updated to delegate through new spanning tree process; removed dedicated driver; added free function wrapper in functions namespace.
Driving Distance Integration `include/dijkstra/drivingDist.hpp`, `src/driving_distance/driving_distance.c`, `src/driving_distance/driving_distance_driver.cpp`	Moved function from algorithm to functions namespace; removed dedicated driver; integrated into unified spanning tree process with parameter adjustments.
Enum and Infrastructure Updates `include/c_common/enums.h`, `include/process/spanningTree_process.h`, `include/drivers/spanningTree_driver.hpp`	Added KRUSKAL, PRIM, DFS, BFS variants with explicit enum values; created new process header (renamed from prim_driver.h); updated driver header with new signatures.
Deleted Driver Headers `include/drivers/breadthFirstSearch/...`, `include/drivers/driving_distance/...`, `include/drivers/traversal/...`, `include/drivers/spanningTree/mst_common.h`	Removed public C-style driver interfaces; consolidation into unified pattern.
Type Conversion Utilities `include/cpp_common/to_postgres.hpp`, `src/cpp_common/to_postgres.cpp`	Added new overloads for `get_tuples` supporting MST_rt vector conversion and depth assignment for driving distance results.
General Utilities `include/cpp_common/utilities.hpp`, `src/cpp_common/utilities.cpp`	Added `only_root_result` function; extended `get_name` with new algorithm enum mappings; added early return guard for empty points_sql.
Build Configuration `src/spanningTree/CMakeLists.txt`, `src/breadthFirstSearch/CMakeLists.txt`, `src/driving_distance/CMakeLists.txt`, `src/traversal/CMakeLists.txt`, `tools/scripts/code_checker.sh`	Updated source lists: removed old driver files, added new implementation files; added include-order checking for new process file.
Minor Cleanup `src/allpairs/allpairs_process.cpp`, `src/astar/astar_process.cpp`, `src/dijkstra/shortestPath_process.cpp`, `src/ordering/ordering_process.cpp`	Removed redundant null-checks in error-handling blocks.
Documentation `NEWS.md`, `doc/src/release_notes.rst`, `locale/...`	Added release notes and localization entries for spanning tree process/driver enhancement.

Sequence Diagram(s)

sequenceDiagram
    participant SQL as SQL Call
    participant Process as pgr_process_spanningTree
    participant Driver as do_spanningTree (Driver)
    participant Enum as Which Enum
    participant Algo as Algorithm<br/>(Prim/Kruskal/DFS/BFS)
    participant Graph as Graph Construction

    SQL->>Process: Call with edges_sql, roots, mode
    Process->>Process: Validate inputs
    Process->>Enum: Parse fn_suffix to Which
    Enum->>Process: Return enum value
    Process->>Driver: Call do_spanningTree with Which
    Driver->>Graph: Parse edges_sql, build graph
    Driver->>Enum: Switch on Which value
    Enum->>Driver: Route to algorithm
    Driver->>Algo: Instantiate & call<br/>(e.g., Prim/Kruskal)
    Algo->>Algo: Compute spanning tree
    Algo->>Driver: Return MST_rt results
    Driver->>Process: Return computed results
    Process->>SQL: Return tuples to PostgreSQL

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Pass by reference on drivers #3033: Refactors driver APIs to use C++ reference-based outputs and ostringstream-based logging for consistent driver interface patterns.
Dag shortestpath using the shortest path process and driver #3061: Performs similar architectural refactor replacing per-algorithm process/driver code with centralized process+driver patterns.
Using to_postgres on drivers #3052: Extends to_postgres utilities with MST_rt overloads and updates drivers to use centralized get_tuples functions.

Suggested labels

Prim, Kruskal, Rewrite

Suggested reviewers

robe2
krashish8

Poem

🐰 Spanning trees now dance in one place,
Unified drivers set a steadier pace,
From Prim and Kruskal to BFS so bright,
One process to rule them all—what a delight! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 5.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly describes the main objective of the PR: creating and using a new spanning tree process and driver.
Linked Issues check	✅ Passed	The PR successfully implements all requirements from issue `#3075`: creates a new spanning tree process/driver and updates Prim, Kruskal, BFS, DFS, and driving distance functions to use it.
Out of Scope Changes check	✅ Passed	All changes align with the stated objectives. Minor cleanup in error handling (pfree redundancy removal) and locale updates are supportive refactoring directly related to the main changes.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

include/traversal/depthFirstSearch.hpp (1)
76-98: 🧹 Nitpick | 🔵 Trivial

Pre-existing inconsistency: DFS unconditionally emits root rows, BFS does not.

Line 84 pushes a root entry regardless of whether the vertex exists in the graph, whereas the BFS counterpart in include/breadthFirstSearch/breadthFirstSearch.hpp (lines 62-63) only emits the root row inside the has_vertex check. This is pre-existing behavior unchanged by this PR, but worth noting for future alignment.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/traversal/depthFirstSearch.hpp` around lines 76 - 98, The
depthFirstSearch function currently unconditionally pushes a root result into
results before checking graph.has_vertex, causing inconsistent behavior versus
BFS; move the results.push_back({root, ...}) so it only executes inside the
has_vertex() block (the same block where depthFirstSearch_single_vertex(...) is
called and get_results(...) is used) to ensure root rows are emitted only for
existing vertices; update references around depthFirstSearch,
depthFirstSearch_single_vertex, get_results, and the local results vector
accordingly.
include/spanningTree/prim.hpp (1)
89-93: ⚠️ Potential issue | 🟠 Major

Pgr_prim::clear() hides Pgr_mst::clear(), leaving base-class state uncleared. When generate_mst calls this->clear(), it invokes the derived class's private clear() method (due to name hiding), which only resets data, predecessors, and distances. The base class fields—m_spanning_tree, m_components, m_tree_id—are not cleared. In contrast, Pgr_kruskal::generate_mst properly clears all state via the base class Pgr_mst::clear().

While the current usage pattern (creating fresh Pgr_prim objects per call in src/spanningTree/prim.cpp) prevents practical issues, this design is fragile and inconsistent with Kruskal. If a Pgr_prim object were reused across multiple algorithm invocations, stale edges from the previous call would accumulate in m_spanning_tree.

Also note: Line 134's cost calculation (auto cost = distances[u] - distances[v]) appears semantically incorrect (should be just distances[v] to get the edge weight). However, graph.get_edge() falls back to returning the minimum-cost edge when no exact match is found, which masks the error in typical cases.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/spanningTree/prim.hpp` around lines 89 - 93, Pgr_prim::clear() hides
the base Pgr_mst::clear() and therefore leaves base fields (m_spanning_tree,
m_components, m_tree_id) uncleared; update the derived clear to call the base
clear (e.g., call Pgr_mst::clear() from Pgr_prim::clear()) or remove/rename the
derived clear and ensure generate_mst invokes Pgr_mst::clear() so base state is
reset before reuse; also fix the cost calculation in generate_mst where it
currently computes "auto cost = distances[u] - distances[v]"—replace that with
the correct edge weight expression (use distances[v] or otherwise compute the
actual edge weight) so the MST edge cost is computed correctly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@include/spanningTree/mst.hpp`:
- Around line 100-113: mstBFS (and similarly mstDFS and mstDD) take the
parameter roots by value then assign it to the member m_roots, creating an extra
copy; replace the copy assignment with a move to avoid the redundant copy by
using std::move(roots) when assigning into m_roots (i.e., change m_roots = roots
to m_roots = std::move(roots)), and apply the same change in mstDFS and mstDD to
use std::move for the by-value roots parameter.

In `@include/spanningTree/prim.hpp`:
- Around line 130-136: The edge cost is computed inverted causing get_edge(...)
to miss exact matches; in the block where u = predecessors[v] and you compute
auto cost = distances[u] - distances[v], swap the operands so cost =
distances[v] - distances[u] (i.e., compute distances[v] minus
distances[predecessors[v]]), then call graph.get_edge(u, v, cost) and insert
into this->m_spanning_tree.edges to ensure exact cost matching.

In `@src/cpp_common/to_postgres.cpp`:
- Around line 267-278: The loop that sets tuples[i].depth uses d.at(row.node)
which can throw std::out_of_range if row.node is missing and also continues
scanning all maps—change the logic in the for (const auto &d : depths) loop to
first check for row.node with d.find(row.node) (or d.count(row.node)), only
assign depth when the node is present and the earlier condition (itr->second ==
0) holds, and then break out of the depths loop once a matching map is used;
ensure tuples[i].depth is left as -1 when no match is found to preserve current
behavior.

In `@src/driving_distance/driving_distance.c`:
- Line 102: The seq column in pgr_drivingDistance is declared BIGINT but the
code sets values[0] using Int32GetDatum; change the datum constructor to
Int64GetDatum and cast the call counter to int64_t (i.e. use
Int64GetDatum((int64_t)funcctx->call_cntr + 1)) so values[0] matches the BIGINT
seq type and aligns with kruskal.c/prim.c; update the statement that assigns
values[0] in driving_distance.c accordingly.

In `@src/spanningTree/kruskal.cpp`:
- Around line 34-65: Hoist the repeated type alias "using Kruskal =
pgrouting::algorithms::Pgr_kruskal<pgrouting::UndirectedGraph>;" to the
pgrouting::functions namespace scope, and in each wrapper function (kruskal,
kruskalBFS, kruskalDFS, kruskalDD) rename the local instance currently named
"kruskal" to a non-shadowing name like "algo" or "fn" before calling
algo.kruskal(...)/algo.kruskalBFS(...)/algo.kruskalDFS(...)/algo.kruskalDD(...);
this removes duplication and avoids shadowing the kruskal function name.

In `@src/spanningTree/prim.cpp`:
- Around line 37-42: The local variable prim in the function
prim(pgrouting::UndirectedGraph &graph) shadows the function name; rename the
local (e.g., to algo or fn) where you instantiate
pgrouting::algorithms::Pgr_prim<pgrouting::UndirectedGraph> (currently "Prim
prim;") and return via that instance (prim.prim(graph)) updated accordingly;
also apply the same clearer local-name convention for the similar instances in
primBFS, primDFS, and primDD to avoid confusion between the function and local
variable names.

In `@src/spanningTree/spanningTree_driver.cpp`:
- Line 137: The variable hint is being reset to an empty string twice
unnecessarily; remove the redundant assignment(s) to hint (the second reset
after the get_edges call) so hint is only initialized once. Locate uses of hint
in spanningTree_driver.cpp around the get_edges(...) call and the subsequent
redundant assignment and delete the extra `hint = ""` statement(s), leaving the
single initial initialization intact.
- Around line 159-163: Remove the unused local variables enop, eofp,
edges_of_points, and points from spanningTree_driver.cpp (they are declared but
never referenced), and also remove the orphaned include
"withPoints/withPoints.hpp" since Point_on_edge_t is no longer used; keep the
active edges variable and ensure no other code depends on Point_on_edge_t or
that include before committing the change.
- Around line 185-245: The directed branch currently only handles DFS, BFS and
DIJKSTRADD via process<DirectedGraph> and falls through to a generic "Unknown
function on directed graph" for MST enums; update the directed switch (or its
default) to detect MST-related enums (KRUSKAL, KRUSKALBFS, KRUSKALDFS,
KRUSKALDD, PRIM, PRIMBFS, PRIMDFS, PRIMDD) and return a clear error (e.g.,
"Kruskal/Prim algorithms require an undirected graph: " + get_name(which))
instead of the generic message, or alternatively implement the proper MST
handling for directed graphs by invoking the same process/handlers used in the
undirected branch (referencing process<DirectedGraph> or
pgrouting::functions::kruskal/prim as appropriate).
- Around line 169-178: BFS handling is inconsistent when edges.empty(): it
returns early without populating return_tuples/return_count while other
algorithms call pgrouting::only_root_result(roots) and set return_count via
get_tuples; update the BFS branch to mirror others by calling auto emptyresults
= pgrouting::only_root_result(roots); then set return_count =
get_tuples(emptyresults, return_tuples) before returning (or add a clear comment
in the BFS branch explaining intentional different semantics), referencing the
BFS check, only_root_result(), get_tuples(), return_tuples, and return_count to
locate and modify the logic.

In `@src/spanningTree/spanningTree_process.cpp`:
- Around line 136-140: The outer condition already ensures (*result_tuples) is
non-null, so remove the redundant inner "if (*result_tuples)" check in
spanningTree_process.cpp and directly call pfree(*result_tuples), then set
(*result_tuples) = nullptr and (*result_count) = 0; this simplifies the block
that currently checks err.str().empty() and (*result_tuples) and avoids the
duplicate null test.
- Around line 82-83: The error message for the distance validation is
inconsistent with the check: currently code checks "if (distance < 0)" and calls
pgr_throw_error("Negative value found on 'distance'", "Must be positive");
either change the guard to "if (distance <= 0)" to enforce strictly positive
distances, or change the hint to "Must be non-negative" to allow zero; apply the
same change to all four occurrences where distance is validated (the
pgr_throw_error calls for 'distance' in spanningTree_process.cpp) so the
condition and the message remain consistent.

---

Outside diff comments:
In `@include/spanningTree/prim.hpp`:
- Around line 89-93: Pgr_prim::clear() hides the base Pgr_mst::clear() and
therefore leaves base fields (m_spanning_tree, m_components, m_tree_id)
uncleared; update the derived clear to call the base clear (e.g., call
Pgr_mst::clear() from Pgr_prim::clear()) or remove/rename the derived clear and
ensure generate_mst invokes Pgr_mst::clear() so base state is reset before
reuse; also fix the cost calculation in generate_mst where it currently computes
"auto cost = distances[u] - distances[v]"—replace that with the correct edge
weight expression (use distances[v] or otherwise compute the actual edge weight)
so the MST edge cost is computed correctly.

In `@include/traversal/depthFirstSearch.hpp`:
- Around line 76-98: The depthFirstSearch function currently unconditionally
pushes a root result into results before checking graph.has_vertex, causing
inconsistent behavior versus BFS; move the results.push_back({root, ...}) so it
only executes inside the has_vertex() block (the same block where
depthFirstSearch_single_vertex(...) is called and get_results(...) is used) to
ensure root rows are emitted only for existing vertices; update references
around depthFirstSearch, depthFirstSearch_single_vertex, get_results, and the
local results vector accordingly.

include/spanningTree/mst.hpp

include/spanningTree/prim.hpp

src/cpp_common/to_postgres.cpp

src/driving_distance/driving_distance.c

src/spanningTree/kruskal.cpp

src/spanningTree/spanningTree_driver.cpp

src/spanningTree/spanningTree_process.cpp

- Prim & Kruskal: - Not shadowing function - Reducing duplication - driver & process: cleanup and message fix - process removing redundant if - Using UInt64GetDatum on seq - Refining get_tuples - using const reference

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@include/spanningTree/prim.hpp`:
- Around line 95-103: Change primTree to accept the vertex descriptor type V
instead of int64_t to avoid the id↔vertex roundtrip: update the primTree
signature from primTree(const G &graph, int64_t root_vertex) to primTree(const G
&graph, V root) and remove the graph.get_V(root_vertex) call inside (use root
directly). Then update all callers (e.g., generate_mst which currently passes
graph.graph[root].id) to pass the vertex descriptor variable (root) rather than
the external id, and adjust any other sites that pass an int64_t id (lines
referenced in the review) to pass V. Ensure predecessors/distances resizing
remains unchanged.

In `@NEWS.md`:
- Line 23: The NEWS.md change was made directly but is auto-generated; instead
add the entry for issue `#3075` into the source data used by the generator (the
inputs consumed by notes2news.pl) rather than editing NEWS.md itself; locate the
notes2news.pl invocation and the repository of release-note source files
referenced by notes2news.pl and add a new note entry for "#3075: Spanning tree:
create and use a process and driver" there, then run the generator to regenerate
NEWS.md so the entry is included properly.

In `@src/driving_distance/driving_distance.c`:
- Line 199: The code uses UInt64GetDatum for the seq BIGINT column in the v3
path; change the assignment of values[0] so it uses Int64GetDatum with an
explicit cast to int64_t (mirroring the v4 path) instead of UInt64GetDatum, i.e.
replace the UInt64GetDatum(funcctx->call_cntr + 1) usage with
Int64GetDatum((int64_t)funcctx->call_cntr + 1) to correctly represent the signed
BIGINT; update the occurrence where values[0], funcctx->call_cntr, and
UInt64GetDatum are referenced in the v3 code path.

In `@src/spanningTree/spanningTree_process.cpp`:
- Around line 104-108: The DIJKSTRADD branch currently rejects negative
distances but logs the hint "Must be positive", which contradicts the check and
the other validation messages; update the error hint in the DIJKSTRADD case (in
spanningTree_process.cpp) to "Must be non negative" so the pgr_throw_error call
for the distance check matches the < 0 condition and the other validation sites.
- Around line 67-95: The code performs fragile enum arithmetic
(static_cast<Which>(static_cast<int>(which) + val)) assuming PRIM/KRUSKAL
variants are contiguous; add compile-time guards by inserting static_asserts
that verify PRIMDD == PRIM + 1, PRIMDFS == PRIM + 2, PRIMBFS == PRIM + 3 and
KRUSKALDD == KRUSKAL + 1, KRUSKALDFS == KRUSKAL + 2, KRUSKALBFS == KRUSKAL + 3
(use the actual enum variant names from Which) before this switch so any
reordering will fail to compile; alternatively replace the arithmetic with an
explicit mapping (e.g., a small switch or helper function that maps (which,
suffix) to the correct Which value) and validate values the same way in the
existing switch on val.

---

Duplicate comments:
In `@src/spanningTree/spanningTree_driver.cpp`:
- Around line 164-174: When edges.empty() the code currently returns immediately
for which == BFS, causing inconsistent behavior with other algorithms; change
the conditional so BFS follows the same path as others by calling
pgrouting::only_root_result(roots), assigning return_count =
get_tuples(emptyresults, return_tuples), and returning, instead of the early
return inside the which == BFS branch; update or remove the TODO(later) comment
(or add a tracked TODO/issue reference) to record this behavioral consistency
fix.

coderabbitai · 2026-02-21T01:30:55Z

include/spanningTree/prim.hpp

     void primTree(
             const G &graph,
-             int64_t root_vertex);
+             int64_t root_vertex) {
+         clear();

-     void generate_mst(const G &graph) override;
-
- private:
-     // Member
-     std::vector<V> predecessors;
-     std::vector<double> distances;
-     std::vector<V> data;
-     std::set<V> m_unassigned;
-};
+         predecessors.resize(graph.num_vertices());
+         distances.resize(graph.num_vertices());

+         auto v_root(graph.get_V(root_vertex));


🧹 Nitpick | 🔵 Trivial

Consider accepting V directly in primTree to avoid a redundant id↔vertex roundtrip.

generate_mst already holds the vertex descriptor root (type V), converts it to an external id via graph.graph[root].id (line 156), passes it into primTree as int64_t root_vertex (line 97), which then converts it back to V via graph.get_V(root_vertex) (line 103). Accepting V directly would eliminate the unnecessary double conversion.

♻️ Suggested simplification

void primTree( const G &graph, - int64_t root_vertex) { + V v_root) { clear(); predecessors.resize(graph.num_vertices()); distances.resize(graph.num_vertices()); - auto v_root(graph.get_V(root_vertex)); - using prim_visitor = visitors::Prim_dijkstra_visitor<V>;

And at the call site in generate_mst:

primTree( graph, - graph.graph[root].id); + root);

Also applies to: 152-156

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@include/spanningTree/prim.hpp` around lines 95 - 103, Change primTree to accept the vertex descriptor type V instead of int64_t to avoid the id↔vertex roundtrip: update the primTree signature from primTree(const G &graph, int64_t root_vertex) to primTree(const G &graph, V root) and remove the graph.get_V(root_vertex) call inside (use root directly). Then update all callers (e.g., generate_mst which currently passes graph.graph[root].id) to pass the vertex descriptor variable (root) rather than the external id, and adjust any other sites that pass an int64_t id (lines referenced in the review) to pass V. Ensure predecessors/distances resizing remains unchanged.

coderabbitai · 2026-02-21T01:30:55Z

NEWS.md

 * [#3056](https://github.com/pgRouting/pgrouting/issues/3056): edwardMoore: use the shortest_path process and driver
 * [#3060](https://github.com/pgRouting/pgrouting/issues/3060): dagShortestPath: use the shortest_path process and driver
 * [#3064](https://github.com/pgRouting/pgrouting/issues/3064): Astar: create and use a process and driver for Astar
+* [#3075](https://github.com/pgRouting/pgrouting/issues/3075): Spanning tree: create and use a process and driver


⚠️ Potential issue | 🟡 Minor

Edit NEWS.md via the generator script, not directly.

Direct edits to NEWS.md will be overwritten the next time tools/release-scripts/notes2news.pl is run. The entry for #3075 should be added to the script's source files instead.

Based on learnings from PRs #2970 and #2868: "NEWS.md is auto-generated and manual edits would be overwritten" — the correct procedure is to update the source files consumed by notes2news.pl.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@NEWS.md` at line 23, The NEWS.md change was made directly but is auto-generated; instead add the entry for issue `#3075` into the source data used by the generator (the inputs consumed by notes2news.pl) rather than editing NEWS.md itself; locate the notes2news.pl invocation and the repository of release-note source files referenced by notes2news.pl and add a new note entry for "#3075: Spanning tree: create and use a process and driver" there, then run the generator to regenerate NEWS.md so the entry is included properly.

coderabbitai · 2026-02-21T01:30:55Z

src/driving_distance/driving_distance.c

        }
-        values[0] = Int32GetDatum((int32_t)funcctx->call_cntr + 1);
+
+        values[0] = UInt64GetDatum(funcctx->call_cntr + 1);


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the actual code at the specified line cd src/driving_distance && cat -n driving_distance.c | sed -n '195,205p'

Repository: pgRouting/pgrouting

Length of output: 660

🏁 Script executed:

# Check the v4 path around line 102 cd src/driving_distance && cat -n driving_distance.c | sed -n '98,108p'

Repository: pgRouting/pgrouting

Length of output: 738

🏁 Script executed:

# Look for the SQL function definition to check the return type of seq fd -e sql . | xargs grep -l "driving_distance" | head -5

Repository: pgRouting/pgrouting

Length of output: 45

🌐 Web query:

PostgreSQL UInt64GetDatum macro version history availability

💡 Result:

UInt64GetDatum(X) (and its companion DatumGetUInt64(X)) was introduced during the PostgreSQL 9.6 development cycle (patch discussion dated 2016‑02‑01 shows these macros being added to src/include/postgres.h).[1]

Availability by major version:

PostgreSQL 9.6+: UInt64GetDatum is available in src/include/postgres.h (example from Debian’s PostgreSQL 9.6.6 source package shows the macro present).[2]

PostgreSQL ≤ 9.5: not available (extensions commonly carry a compatibility shim and explicitly note “only available in 9.6+”).[3]

Still present in current PostgreSQL “master” / supported branches (it appears in current postgres.h documentation listings).[4]

Common compatibility fallback used by extensions for pre-9.6 is effectively:

#define UInt64GetDatum(X) Int64GetDatum((int64)(X)) (matches the non-byval definition used in later versions).[2][3]

Sources:
[1] Patch thread showing DatumGetUInt64 / UInt64GetDatum being added (2016‑02‑01) https://postgrespro.com/list/thread-id/1884554
[2] PostgreSQL 9.6 src/include/postgres.h (Debian sources) showing UInt64GetDatum definition https://sources.debian.org/src/postgresql-9.6/9.6.6-0%2Bdeb9u1~bpo8%2B1/src/include/postgres.h
[3] wal2json code comment noting “only available in 9.6+” and providing a fallback macro https://gitee.com/enmotech/wal2json/blob/master/wal2json.c
[4] Current PostgreSQL source documentation index listing UInt64GetDatum() in postgres.h https://doxygen.postgresql.org/globals_u.html

🏁 Script executed:

# Check pgRouting's PostgreSQL version requirements fd -e txt -e md -e cmake . | xargs grep -i "postgres.*version\|minimum.*postgres" | head -20

Repository: pgRouting/pgrouting

Length of output: 1789

🏁 Script executed:

# Find SQL function definitions for driving_distance fd -e sql . src/driving_distance | xargs cat

Repository: pgRouting/pgrouting

Length of output: 45

🏁 Script executed:

# Check if v3 function is marked as deprecated in the code cd src/driving_distance && grep -n "deprecated\|DEPRECATED" driving_distance.c

Repository: pgRouting/pgrouting

Length of output: 350

🏁 Script executed:

# Look for version-specific handling or comments in the file cd src/driving_distance && head -50 driving_distance.c

Repository: pgRouting/pgrouting

Length of output: 1648

🏁 Script executed:

# Search for SQL function definitions find . -path "*/sql/*driving_distance*" -type f | head -10

Repository: pgRouting/pgrouting

Length of output: 268

🏁 Script executed:

# If the above doesn't find files, search more broadly for SQL function signatures fd -e sql . | xargs grep -l "driving_distance" 2>/dev/null | head -5

Repository: pgRouting/pgrouting

Length of output: 45

🏁 Script executed:

# Check the v3 SQL function definition cat sql/driving_distance/_drivingDistance.sql

Repository: pgRouting/pgrouting

Length of output: 1660

🏁 Script executed:

# Also check the main function for reference cat sql/driving_distance/drivingDistance.sql

Repository: pgRouting/pgrouting

Length of output: 3180

UInt64GetDatum is semantically incorrect for the BIGINT SQL column type.

The seq column is declared as BIGINT (signed 64-bit) in the SQL function signature. The v4 path (line 102) correctly uses Int64GetDatum((int64_t)funcctx->call_cntr + 1) for this column, while this deprecated v3 path uses UInt64GetDatum, which is semantically incorrect for signed integers. For consistency and correctness, this should match the v4 implementation.

Proposed fix

- values[0] = UInt64GetDatum(funcctx->call_cntr + 1); + values[0] = Int64GetDatum((int64_t)funcctx->call_cntr + 1);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/driving_distance/driving_distance.c` at line 199, The code uses UInt64GetDatum for the seq BIGINT column in the v3 path; change the assignment of values[0] so it uses Int64GetDatum with an explicit cast to int64_t (mirroring the v4 path) instead of UInt64GetDatum, i.e. replace the UInt64GetDatum(funcctx->call_cntr + 1) usage with Int64GetDatum((int64_t)funcctx->call_cntr + 1) to correctly represent the signed BIGINT; update the occurrence where values[0], funcctx->call_cntr, and UInt64GetDatum are referenced in the v3 code path.

coderabbitai · 2026-02-21T01:30:55Z

src/spanningTree/spanningTree_process.cpp

+    switch (which) {
+        case PRIM:
+        case KRUSKAL:
+            if (suffix == "DD") {
+                val = 1;
+            } else if (suffix == "DFS") {
+                val = 2;
+            } else if (suffix == "BFS") {
+                val = 3;
+            }
+
+            which = static_cast<Which>(static_cast<int>(which) + val);
+
+            switch (val) {
+                case 1:
+                    if (distance < 0) {
+                        pgr_throw_error("Negative value found on 'distance'", "Must be non negative");
+                    }
+                    break;
+                case 2:
+                case 3:
+                    if (max_depth < 0) {
+                        pgr_throw_error("Negative value found on 'max_depth'", "Must be non negative");
+                    }
+                    break;
+                default:
+                    break;
+            }
+            break;


🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

# Find the enum definition fd enums.h | head -5

Repository: pgRouting/pgrouting

Length of output: 87

🏁 Script executed:

# Search for the Which enum definition rg -n "enum.*Which|PRIM|KRUSKAL" --type=cpp --type=c -A 20 | head -80

Repository: pgRouting/pgrouting

Length of output: 4880

🏁 Script executed:

# Read the enums.h file to find the Which enum definition cat include/c_common/enums.h | head -100

Repository: pgRouting/pgrouting

Length of output: 1706

Fragile enum arithmetic: relies on contiguous enum layout.

Line 78 (static_cast<Which>(static_cast<int>(which) + val)) assumes PRIM (510) and its variants PRIMDD (511), PRIMDFS (512), PRIMBFS (513) are contiguous—and similarly for KRUSKAL (500) and its variants. While this currently holds in the enum definition, any reordering of these values will silently produce wrong algorithm selection. Consider adding static_assert statements to guard against future enum changes.

Option: Add static_asserts

+ static_assert(static_cast<int>(PRIMDD) == static_cast<int>(PRIM) + 1, "Enum layout assumption"); + static_assert(static_cast<int>(PRIMDFS) == static_cast<int>(PRIM) + 2, "Enum layout assumption"); + static_assert(static_cast<int>(PRIMBFS) == static_cast<int>(PRIM) + 3, "Enum layout assumption"); + static_assert(static_cast<int>(KRUSKALDD) == static_cast<int>(KRUSKAL) + 1, "Enum layout assumption"); + static_assert(static_cast<int>(KRUSKALDFS) == static_cast<int>(KRUSKAL) + 2, "Enum layout assumption"); + static_assert(static_cast<int>(KRUSKALBFS) == static_cast<int>(KRUSKAL) + 3, "Enum layout assumption"); switch (which) {

🧰 Tools

🪛 Cppcheck (2.19.0)

[style] 72-72: The function 'pgr_alloc' is never used.

(unusedFunction)

[style] 83-83: The function 'pgr_free' is never used.

(unusedFunction)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/spanningTree/spanningTree_process.cpp` around lines 67 - 95, The code performs fragile enum arithmetic (static_cast<Which>(static_cast<int>(which) + val)) assuming PRIM/KRUSKAL variants are contiguous; add compile-time guards by inserting static_asserts that verify PRIMDD == PRIM + 1, PRIMDFS == PRIM + 2, PRIMBFS == PRIM + 3 and KRUSKALDD == KRUSKAL + 1, KRUSKALDFS == KRUSKAL + 2, KRUSKALBFS == KRUSKAL + 3 (use the actual enum variant names from Which) before this switch so any reordering will fail to compile; alternatively replace the arithmetic with an explicit mapping (e.g., a small switch or helper function that maps (which, suffix) to the correct Which value) and validate values the same way in the existing switch on val.

coderabbitai · 2026-02-21T01:30:56Z

src/spanningTree/spanningTree_process.cpp

+        case DIJKSTRADD:
+            if (distance < 0) {
+                pgr_throw_error("Negative value found on 'distance'", "Must be positive");
+            }
+            break;


⚠️ Potential issue | 🟡 Minor

Inconsistent error hint: "Must be positive" vs "Must be non negative".

The < 0 check allows zero, but the hint says "Must be positive" (implying zero is invalid). The other three validation sites (lines 83, 89, 100) correctly use "Must be non negative". This one was missed.

Proposed fix

case DIJKSTRADD: if (distance < 0) { - pgr_throw_error("Negative value found on 'distance'", "Must be positive"); + pgr_throw_error("Negative value found on 'distance'", "Must be non negative"); } break;

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/spanningTree/spanningTree_process.cpp` around lines 104 - 108, The DIJKSTRADD branch currently rejects negative distances but logs the hint "Must be positive", which contradicts the check and the other validation messages; update the error hint in the DIJKSTRADD case (in spanningTree_process.cpp) to "Must be non negative" so the pgr_throw_error call for the distance check matches the < 0 condition and the other validation sites.

cvvergara added 27 commits February 19, 2026 13:29

(kruskal & prim) separating definition from implementation

3d5be64

- Using set instead of vector for roots

(mst) Using set instead of vector for roots

e644035

(depthFirstSearch) Using set instead of vector for roots

2507925

(kruskal) Using set instead of vector for roots

82b7530

(prim) Using set instead of vector for roots

cab597d

(depthFirstSearch) Using set instead of vector for roots

6be89cc

(enums) preparing enumeration for spanning tree like functions

67be75b

(utilities) Preparing for spanning tree like functions

2c30a91

(drivingDistance) adjusting namespace names

59a670f

(breadthFirstSearch) adjusting namespace names

af493dd

(to postgres) adding functions to get tuples for spanning tree like f…

62a914d

…unctions

New process and driver for spanning tree like functions

d4ca7ce

(build) add the spanning tree process and driver to the build

e6e40d3

(kruskal) using the new spanning Tree process & driver

3b7392c

(prim) using the new spanning Tree process & driver

12a3645

(breadthFirstSearch) using the new spanning Tree process & driver

1f1c4b6

(depthFirstSearch) using the new spanning Tree process & driver

8256dd1

(driving_distance) using the new spanning Tree process & driver

f420f70

🧹 removing details not used anymore

70fcf5a

🧹 removing mst_common not used anymore

b047be8

🧹 removing depthFirstSearch_driver not used anymore

cab3eb4

🧹 removing kruskal_driver not used anymore

6799cf4

🧹 removing driving_distance_driver not used anymore

87cc9db

🧹 removing breadthFirstSearch_driver not used anymore

32377eb

🧹 removing prim_driver not used anymore

4629b89

(build) adjusting build because of removals

fb95422

(code_checker) ignoring include order on spanning tree process

eb0dc7e

coderabbitai bot reviewed Feb 19, 2026

View reviewed changes

cvvergara marked this pull request as draft February 20, 2026 17:04

cvvergara added 2 commits February 20, 2026 19:05

(spanningTree) Fixing issues from rabbit

1247175

- Prim & Kruskal: - Not shadowing function - Reducing duplication - driver & process: cleanup and message fix - process removing redundant if - Using UInt64GetDatum on seq - Refining get_tuples - using const reference

Updating Release notes and news

e02deb9

cvvergara added this to the Release 4.1.0 milestone Feb 21, 2026

cvvergara added Enhancement C/C++ labels Feb 21, 2026

Update locale: commit e02deb9

19f1894

cvvergara marked this pull request as ready for review February 21, 2026 01:20

(prim) fix inverted edge cost calculation

d9478fc

cvvergara requested a review from robe2 February 21, 2026 01:26

coderabbitai bot reviewed Feb 21, 2026

View reviewed changes

Uh oh!

Comments

Conversation

cvvergara commented Feb 19, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Feb 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 21, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

cvvergara commented Feb 19, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 19, 2026 •

edited

Loading