✨ feat: Support parallel computation for NDT by Tao-Sheng · Pull Request #6376 · PointCloudLibrary/pcl

Tao-Sheng · 2025-12-10T05:56:47Z

Support parallel computation for NDT by OpenMP with different neighbor search method

registration/include/pcl/registration/impl/ndt.hpp

Use signed std::ptrdiff_t instead of std::size_t

Copilot

Pull request overview

This PR adds OpenMP-based parallel computation support to the Normal Distributions Transform (NDT) registration algorithm, along with multiple neighbor search methods to optimize performance. The changes introduce thread-level parallelization to speed up the derivative computation phase of NDT.

Key changes:

Adds a NeighborSearchMethod enum with four options: KDTREE (original), DIRECT26, DIRECT7, and DIRECT1
Implements setNumberOfThreads() method with OpenMP support for parallel execution
Parallelizes the main computation loop in computeDerivatives() using OpenMP directives
Updates test to verify functionality with the new API

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
registration/include/pcl/registration/ndt.h	Adds enum for neighbor search methods, declares thread control API and private member variables for threading configuration
registration/include/pcl/registration/impl/ndt.hpp	Implements setNumberOfThreads() with OpenMP detection, parallelizes derivative computation loop, adds switch statement for different search methods
test/registration/test_ndt.cpp	Updates test to explicitly configure search method and thread count for validation

Comments suppressed due to low confidence (1)

registration/include/pcl/registration/impl/ndt.hpp:269

The OpenMP parallel region has a critical thread safety issue. The variables score, score_gradient, and hessian are marked as shared, but multiple threads will update them concurrently (via score += on line 266 and calls to updateDerivatives which modifies score_gradient and hessian). These concurrent updates to shared variables will cause race conditions, leading to incorrect results. You need to use reduction clauses for score and critical sections or atomic operations for score_gradient and hessian, or alternatively use thread-local accumulators that are combined after the parallel region.

#pragma omp parallel for num_threads(threads_) schedule(dynamic, 32)                   \
    shared(score, score_gradient, hessian) firstprivate(neighborhood, distances)
  // Update gradient and hessian for each point, line 17 in Algorithm 2 [Magnusson 2009]
  for (std::ptrdiff_t idx = 0; idx < static_cast<std::ptrdiff_t>(input_->size());
       idx++) {
    // Transformed Point
    const auto& x_trans_pt = trans_cloud[idx];

    // Find neighbors with different search method
    switch (search_method_) {
    case KDTREE:
      // Radius search has been experimentally faster than direct neighbor checking.
      target_cells_.radiusSearch(x_trans_pt, resolution_, neighborhood, distances);
      break;
    case DIRECT26:
      target_cells_.getNeighborhoodAtPoint(x_trans_pt, neighborhood);
      break;
    case DIRECT7:
      target_cells_.getFaceNeighborsAtPoint(x_trans_pt, neighborhood);
      break;
    case DIRECT1:
      target_cells_.getVoxelAtPoint(x_trans_pt, neighborhood);
      break;
    }

    // Original Point
    const Eigen::Vector3d x = (*input_)[idx].getVector3fMap().template cast<double>();
    const Eigen::Vector3d x_trans_position =
        x_trans_pt.getVector3fMap().template cast<double>();

    for (const auto& cell : neighborhood) {
      // Denorm point, x_k' in Equations 6.12 and 6.13 [Magnusson 2009]
      const Eigen::Vector3d x_trans = x_trans_position - cell->getMean();
      // Inverse Covariance of Occupied Voxel
      // Uses precomputed covariance for speed.
      const Eigen::Matrix3d c_inv = cell->getInverseCov();

      // Compute derivative of transform function w.r.t. transform vector, J_E and H_E
      // in Equations 6.18 and 6.20 [Magnusson 2009]
      computePointDerivatives(x);
      // Update score, gradient and hessian, lines 19-21 in Algorithm 2, according to
      // Equations 6.10, 6.12 and 6.13, respectively [Magnusson 2009]
      score +=
          updateDerivatives(score_gradient, hessian, x_trans, c_inv, compute_hessian);
    }
  }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

registration/include/pcl/registration/ndt.h

registration/include/pcl/registration/impl/ndt.hpp

Apply copilot suggestion to fix some error

mvieth

You have to at least put a omp critical region around computePointDerivatives(x) and updateDerivatives(...) because they both use global variables.
Also maybe put NeighborSearchMethod into the NormalDistributionTransform class, to make it clear that they belong together?

✨ feat: Support parallel computation for NDT

a1ab585

Support parallel computation for NDT by OpenMP with different neighbor search method

larshg reviewed Dec 10, 2025

View reviewed changes

registration/include/pcl/registration/impl/ndt.hpp Outdated Show resolved Hide resolved

🐞 fix: OpenMP Requires Signed Indices

bb01879

Use signed std::ptrdiff_t instead of std::size_t

larshg added module: registration changelog: enhancement Meta-information for changelog generation labels Dec 11, 2025

larshg added this to the pcl-1.16.0 milestone Dec 11, 2025

larshg requested review from Copilot and mvieth December 11, 2025 06:56

Copilot started reviewing on behalf of larshg December 11, 2025 06:57 View session

Copilot AI reviewed Dec 11, 2025

View reviewed changes

registration/include/pcl/registration/ndt.h Outdated Show resolved Hide resolved

registration/include/pcl/registration/ndt.h Outdated Show resolved Hide resolved

registration/include/pcl/registration/impl/ndt.hpp Show resolved Hide resolved

🐞 fix: Apply copilot suggestion

6c0acc8

Apply copilot suggestion to fix some error

mvieth reviewed Dec 22, 2025

View reviewed changes

mvieth added the needs: more work Specify why not closed/merged yet label Feb 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

✨ feat: Support parallel computation for NDT#6376

✨ feat: Support parallel computation for NDT#6376
Tao-Sheng wants to merge 3 commits intoPointCloudLibrary:masterfrom
Tao-Sheng:feat/ndt_openmp

Tao-Sheng commented Dec 10, 2025

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mvieth left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

Tao-Sheng commented Dec 10, 2025

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mvieth left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants