Skip to content

Refactor NSS and DNSS implementations for improved readability and modularity#4

Closed
eclipse0922 wants to merge 2 commits intomasterfrom
refactor_2
Closed

Refactor NSS and DNSS implementations for improved readability and modularity#4
eclipse0922 wants to merge 2 commits intomasterfrom
refactor_2

Conversation

@eclipse0922
Copy link
Owner

@eclipse0922 eclipse0922 commented Feb 8, 2026

User description

  • Rewrote legacy code into clean C++17 style.
  • Added missing method implementations and input validation.
  • Introduced options struct for configurable parameters in NSS and DNSS.
  • Implemented CUDA support for DNSS rotational feature computation.
  • Added new CMake configuration for building with or without CUDA.
  • Updated README to reflect changes and provide build instructions.
  • Added .gitignore to exclude build directories.

PR Type

Enhancement, Bug fix, Documentation


Description

  • Refactored NSS and DNSS implementations into clean C++17 with modular design and options structs.

  • Added input validation, fixed rotational-return math (radians), and replaced legacy bucket logic with efficient lazy removal.

  • Implemented optional CUDA acceleration for DNSS rotational feature computation with CPU fallback.

  • Updated README with algorithm notes, build instructions, and references; added CMake CUDA support.


Changes walkthrough 📝

Relevant files
Enhancement
NSS.cpp
Refactor NSS/ DNSS core logic                                                       

NSS.cpp

  • Replaced legacy bucket-based sampling with modular helper functions
    and options-based configuration.
  • Implemented computeCenteredAndScaledVertices,
    computeRotationalReturnValue, and sphericalBucketIndex.
  • Added CUDA integration point for DNSS rotational features with
    fallback CPU implementation.
  • Refactored DNSS::dualNormalSpaceSampling to use priority queues and
    active-mask lazy removal.
  • +545/-310
    NSS.h
    Modernize header and options                                                         

    NSS.h

  • Removed Eigen and concurrency dependencies; added minimal glm::fvec3
    fallback implementation.
  • Introduced Options structs for configurable parameters in NSS and
    DNSS.
  • Added setUseCuda, getUseCuda, and bucket index helper methods.
  • Updated method signatures to use const references and added input
    validation.
  • +162/-123
    dnss_cuda.cu
    Add CUDA implementation                                                                   

    dnss_cuda.cu

  • Implemented CUDA kernel for rotational normal and return computation.
  • Added host-side wrapper with memory management and error handling.
  • Ensured fallback to CPU path on CUDA failure.
  • +217/-0 
    Configuration changes
    CMakeLists.txt
    Add CUDA build configuration                                                         

    CMakeLists.txt

  • Added DNSS_ENABLE_CUDA option to conditionally enable CUDA support.
  • Integrated dnss_cuda.cu source and linked against CUDA::cudart.
  • Configured C++17 standard and separable compilation for CUDA.
  • +21/-0   
    pr-agent.yml
    Configure PR agent language                                                           

    .github/workflows/pr-agent.yml

  • Added CONFIG__RESPONSE_LANGUAGE: "Korean" to configure PR agent output
    language.
  • +1/-0     
    Documentation
    README.md
    Update documentation                                                                         

    README.md

  • Rewrote README to describe fixes, algorithm notes, and build
    instructions.
  • Added CPU-only and CUDA build examples.
  • Included references to DNSS paper and related works.
  • +50/-10 

    Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • …dularity
    
    - Rewrote legacy code into clean C++17 style.
    - Added missing method implementations and input validation.
    - Introduced options struct for configurable parameters in NSS and DNSS.
    - Implemented CUDA support for DNSS rotational feature computation.
    - Added new CMake configuration for building with or without CUDA.
    - Updated README to reflect changes and provide build instructions.
    - Added .gitignore to exclude build directories.
    @chatgpt-codex-connector
    Copy link

    You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

    @github-actions
    Copy link

    github-actions bot commented Feb 8, 2026

    PR Description updated to latest commit (d77ecfd)

    @github-actions
    Copy link

    github-actions bot commented Feb 8, 2026

    PR Reviewer Guide 🔍

    (Review updated until commit 5f9b116)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🏅 Score: 85
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Memory Leak on Error Paths

    In the CUDA implementation, when errors occur during memory operations (e.g., cudaMemcpy), the code frees allocated device memory but does not reset the device pointers to nullptr before returning. This can lead to double-free issues if the function is called again or if the caller attempts to free the memory. Additionally, there are multiple exit points where memory is freed, increasing the risk of inconsistencies.

    if (!isCudaSuccess(cudaMemcpy(d_vertices, vertices_flat.data(), vector_bytes, cudaMemcpyHostToDevice)) ||
    	!isCudaSuccess(cudaMemcpy(d_normals, normals_flat.data(), vector_bytes, cudaMemcpyHostToDevice)))
    {
    	cudaFree(d_vertices);
    	cudaFree(d_normals);
    	cudaFree(d_rot_normals);
    	cudaFree(d_rot_returns);
    	return false;
    }
    Missing Error Handling for Kernel Launch

    The CUDA kernel launch on line 172 does not check for launch errors (e.g., invalid configuration). While cudaGetLastError() is called after synchronization, it only captures runtime errors, not launch-time errors such as invalid grid dimensions or shared memory overflow. This could lead to silent failures where the kernel does not execute correctly.

    computeRotationalFeaturesKernel<<<blocks, threads>>>(
    	d_vertices,
    	d_normals,
    	theta_for_return_radians,
    	d_rot_normals,
    	d_rot_returns,
    	point_count);
    
    if (!isCudaSuccess(cudaGetLastError()) || !isCudaSuccess(cudaDeviceSynchronize()))
    {
    	cudaFree(d_vertices);
    	cudaFree(d_normals);
    	cudaFree(d_rot_normals);
    	cudaFree(d_rot_returns);
    	return false;
    }
    Potential Division by Zero in Rotational Return Calculation

    In computeRotationalReturnValue, the denominator variables (denominator_positive and denominator_negative) use std::max with kEpsilon, but the division by theta on lines 80-81 does not validate that theta is non-zero. If theta is zero or very close to zero, this could cause numerical instability or division by zero.

    const float rr_positive = point_norm * gamma_positive / theta;
    const float rr_negative = point_norm * gamma_negative / theta;
    const float rotational_return = std::max(rr_positive, rr_negative);

    @github-actions
    Copy link

    github-actions bot commented Feb 8, 2026

    PR Description updated to latest commit (5f9b116)

    @github-actions
    Copy link

    github-actions bot commented Feb 8, 2026

    Persistent review updated to latest commit 5f9b116

    @github-actions
    Copy link

    github-actions bot commented Feb 8, 2026

    PR Code Suggestions ✨

    Latest suggestions up to 5f9b116

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Add check for infinite max_radius

    The computeCenteredAndScaledVertices function does not handle the case where
    max_radius is infinite, which could lead to division by infinity and undefined
    behavior. Adding a check for infinite max_radius would improve robustness.

    NSS.cpp [164]

     void computeCenteredAndScaledVertices(
     	const std::vector<glm::fvec3> &vertices,
     	std::vector<glm::fvec3> &normalized_vertices)
     {
     	normalized_vertices.clear();
     	normalized_vertices.resize(vertices.size(), glm::fvec3(0.0f));
     	if (vertices.empty())
     	{
     		return;
     	}
     
     	glm::fvec3 centroid(0.0f, 0.0f, 0.0f);
     	for (const glm::fvec3 &vertex : vertices)
     	{
     		centroid += vertex;
     	}
     	centroid /= static_cast<float>(vertices.size());
     
     	float max_radius = 0.0f;
     	for (std::size_t i = 0; i < vertices.size(); ++i)
     	{
     		normalized_vertices[i] = vertices[i] - centroid;
     		max_radius = std::max(max_radius, vectorLength(normalized_vertices[i]));
     	}
     
    -	if (max_radius <= kEpsilon)
    +	if (!std::isfinite(max_radius) || max_radius <= kEpsilon)
     	{
     		return;
     	}
     
     	const float inv_radius = 1.0f / max_radius;
     	for (glm::fvec3 &vertex : normalized_vertices)
     	{
     		vertex *= inv_radius;
     	}
     }
    Suggestion importance[1-10]: 8

    __

    Why: Checking for infinite max_radius before division prevents undefined behavior from division by infinity. This is a critical robustness improvement, especially since vectorLength could theoretically produce infinite values in edge cases with invalid input data.

    Medium
    Use atan2 for safer angle computation

    The function computeRotationalReturnValue uses std::atan with potentially zero
    denominators, which can lead to undefined behavior or NaN values. Although
    std::max(kEpsilon, ...) is used for denominators, the numerator can still be zero,
    leading to atan(0/epsilon) which is safe, but the logic should be clearer.
    Additionally, the function should ensure that the division by theta is safe by
    checking if theta is near zero.

    NSS.cpp [74-75]

     float computeRotationalReturnValue(const glm::fvec3 &normalized_point, const glm::fvec3 &normal, float theta)
     {
     	const float point_norm = vectorLength(normalized_point);
     	if (!std::isfinite(point_norm) || point_norm <= kEpsilon)
    +	{
    +		return 0.0f;
    +	}
    +
    +	if (std::abs(theta) <= kEpsilon)
     	{
     		return 0.0f;
     	}
     
     	const glm::fvec3 point_dir = normalized_point / point_norm;
     	const glm::fvec3 normal_dir = normalizeSafe(normal);
     	const float dot_pn = clampFloat(dotProduct(point_dir, normal_dir), -1.0f, 1.0f);
     	const float beta = std::acos(dot_pn);
     
     	const float pp = 2.0f * point_norm * std::sin(theta * 0.5f);
     	const float sin_beta = std::sin(beta);
     	const float cos_beta = std::cos(beta);
     
     	const float pq_positive = pp * std::cos(beta - theta * 0.5f);
     	const float pq_negative = pp * std::cos(-beta - theta * 0.5f);
     
     	const float denominator_positive = std::max(kEpsilon, point_norm - pq_positive * cos_beta);
     	const float denominator_negative = std::max(kEpsilon, point_norm - pq_negative * cos_beta);
     
    -	const float atan_positive = std::atan((pq_positive * sin_beta) / denominator_positive);
    -	const float atan_negative = std::atan((pq_negative * (-sin_beta)) / denominator_negative);
    +	const float atan_positive = std::atan2(pq_positive * sin_beta, denominator_positive);
    +	const float atan_negative = std::atan2(pq_negative * (-sin_beta), denominator_negative);
     
     	const float gamma_positive = theta - atan_positive;
     	const float gamma_negative = theta - atan_negative;
     
     	const float rr_positive = point_norm * gamma_positive / theta;
     	const float rr_negative = point_norm * gamma_negative / theta;
     	const float rotational_return = std::max(rr_positive, rr_negative);
     
     	if (!std::isfinite(rotational_return))
     	{
     		return 0.0f;
     	}
     	return std::max(0.0f, rotational_return);
     }
    Suggestion importance[1-10]: 7

    __

    Why: Using std::atan2 instead of std::atan(numerator/denominator) improves numerical stability and handles edge cases where the denominator is near zero more safely. While the current code uses std::max(kEpsilon, denominator), atan2 is the standard and more robust approach for computing angles from Cartesian coordinates.

    Medium
    General
    Add upper bound checks for bin counts

    The sphericalBucketIndex function does not validate that azimuth_bins and z_bins are
    within reasonable bounds, which could lead to integer overflow or excessive memory
    allocation if these values are extremely large. Adding an upper bound check would
    improve robustness.

    NSS.cpp [97-100]

     int sphericalBucketIndex(
     	const glm::fvec3 &raw_normal,
     	int azimuth_bins,
     	int z_bins,
     	bool fold_antipodal)
     {
    -	if (azimuth_bins <= 0 || z_bins <= 0)
    +	if (azimuth_bins <= 0 || azimuth_bins > 1024 || z_bins <= 0 || z_bins > 1024)
     	{
     		return -1;
     	}
     
     	glm::fvec3 normal = normalizeSafe(raw_normal);
     
     	if (fold_antipodal)
     	{
     		const bool flip =
     			normal.z < 0.0f ||
     			(normal.z == 0.0f && (normal.y < 0.0f || (normal.y == 0.0f && normal.x < 0.0f)));
     		if (flip)
     		{
     			normal = -normal;
     		}
     	}
     
     	float azimuth = std::atan2(normal.y, normal.x);
     	if (azimuth < 0.0f)
     	{
     		azimuth += kTwoPi;
     	}
     
     	const float z_min = fold_antipodal ? 0.0f : -1.0f;
     	const float z_normalized = clampFloat((normal.z - z_min) / (1.0f - z_min), 0.0f, 1.0f);
     
     	int azimuth_index = static_cast<int>(std::floor(azimuth / kTwoPi * static_cast<float>(azimuth_bins)));
     	int z_index = static_cast<int>(std::floor(z_normalized * static_cast<float>(z_bins)));
     
     	if (azimuth_index >= azimuth_bins)
     	{
     		azimuth_index = azimuth_bins - 1;
     	}
     	if (z_index >= z_bins)
     	{
     		z_index = z_bins - 1;
     	}
     
     	return azimuth_index * z_bins + z_index;
     }
    Suggestion importance[1-10]: 5

    __

    Why: Adding upper bounds (e.g., 1024) prevents potential integer overflow or excessive memory allocation from extremely large bin counts. However, since the PR already validates positive values and the context likely uses reasonable defaults, this is a moderate improvement rather than critical.

    Low

    Previous suggestions

    Suggestions up to commit d77ecfd
    CategorySuggestion                                                                                                                                    Impact
    General
    Use std::clamp for clarity and correctness

    The clampFloat function should use std::clamp for better readability and correctness
    in C++17.

    NSS.cpp [18-21]

     float clampFloat(float value, float min_value, float max_value)
     {
    -	return std::max(min_value, std::min(value, max_value));
    +	return std::clamp(value, min_value, max_value);
     }
    Suggestion importance[1-10]: 7

    __

    Why: Using std::clamp is more idiomatic and readable in C++17, but the existing code is functionally correct. This is a style/readability improvement rather than a correctness fix.

    Medium
    Rename function to match its purpose

    The function name computeRotationalReturnValue is misleading because it computes a
    rotational return value based on the original point, not a normalized one. Rename it
    to computeRotationalReturn and update the parameter name to point for clarity.

    NSS.cpp [51-58]

    -float computeRotationalReturnValue(const glm::fvec3 &normalized_point, const glm::fvec3 &normal, float theta)
    +float computeRotationalReturn(const glm::fvec3 &point, const glm::fvec3 &normal, float theta)
     {
    -	const float point_norm = vectorLength(normalized_point);
    +	const float point_norm = vectorLength(point);
     	if (!std::isfinite(point_norm) || point_norm <= kEpsilon)
     	{
     		return 0.0f;
     	}
    Suggestion importance[1-10]: 6

    __

    Why: Renaming improves clarity, but the suggestion only changes the function name and parameter name without updating all call sites in the diff. The PR code still uses the old name in multiple places, so the suggestion is incomplete and would break compilation.

    Low
    Possible issue
    Add validation for bucket dimensions

    The sphericalBucketIndex function should validate that azimuth_bins and z_bins are
    positive before proceeding to avoid undefined behavior in subsequent calculations.

    NSS.cpp [97-100]

    +int sphericalBucketIndex(
    +	const glm::fvec3 &raw_normal,
    +	int azimuth_bins,
    +	int z_bins,
    +	bool fold_antipodal)
    +{
    +	if (azimuth_bins <= 0 || z_bins <= 0)
    +	{
    +		return -1;
    +	}
     
    -
    Suggestion importance[1-10]: 5

    __

    Why: The validation for azimuth_bins <= 0 || z_bins <= 0 already exists in the code, so the suggestion does not introduce any new change. It merely asks to keep existing validation, which is good but offers minimal improvement.

    Low
    Add empty input validation

    The computeCenteredAndScaledVertices function should validate that the input
    vertices vector is not empty before attempting to compute the centroid and scale
    factor to avoid division by zero.

    NSS.cpp [143-148]

    +void computeCenteredAndScaledVertices(
    +	const std::vector<glm::fvec3> &vertices,
    +	std::vector<glm::fvec3> &normalized_vertices)
    +{
    +	normalized_vertices.clear();
    +	normalized_vertices.resize(vertices.size(), glm::fvec3(0.0f));
    +	if (vertices.empty())
    +	{
    +		return;
    +	}
     
    -
    Suggestion importance[1-10]: 5

    __

    Why: The function already checks for empty input and returns early, so the suggestion does not add any new validation. It's redundant with existing code.

    Low

    @eclipse0922 eclipse0922 closed this Feb 8, 2026
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    1 participant