Refactor NSS and DNSS implementations for improved readability and modularity #7

Closed

eclipse0922 wants to merge 2 commits into master from refactor_2

Conversation

@eclipse0922 (Owner) commented Feb 8, 2026

User description

  • Rewrote legacy code into clean C++17 style.
  • Added missing method implementations and input validation.
  • Introduced options struct for configurable parameters in NSS and DNSS.
  • Implemented CUDA support for DNSS rotational feature computation.
  • Added new CMake configuration for building with or without CUDA.
  • Updated README to reflect changes and provide build instructions.
  • Added .gitignore to exclude build directories.

PR Type

Enhancement, Bug fix, Documentation


Description

• C++17 refactoring: rewrote the legacy code into a modular, readable structure

• Added an options struct to NSS/DNSS and implemented CUDA support (rotational feature computation)

• Implemented input validation and the missing setInputCloud method

• Updated the README and CMakeLists.txt to strengthen the build and usage guide


Changes walkthrough 📝

Relevant files
Enhancement

NSS.cpp — rewrote the core NSS/DNSS algorithms and utilities (+545/-310)

  • Added helper functions in a namespace (clampFloat, dotProduct, crossProduct, etc.)
  • Implemented static utility functions such as sphericalBucketIndex and computeCenteredAndScaledVertices
  • Rewrote the computeRotationalReturnValue formula for clarity and improved its numerical stability
  • Replaced the legacy parallel code in normalSpaceSampling/dualNormalSpaceSampling with an efficient bucket-based strategy
  • Added the DNSSComputeRotationalFeaturesCuda declaration for invoking the CUDA kernel

NSS.h — improved the header structure and options interface (+162/-123)

  • Embedded a self-contained glm::fvec3 implementation to remove the external dependency
  • Added an Options struct for configurable parameters (azimuth/z bins, random_seed, deterministic, etc.)
  • Added CUDA and rotational-bucket initialization options to DNSS::Options
  • Cleaned up method signatures with const references and const methods

dnss_cuda.cu — DNSS CUDA-accelerated implementation (+217/-0)

  • Implemented CUDA kernels that compute rotational normals and returns
  • Included host-device memory copies and error handling
  • Returns bool so callers can fall back to the CPU path

Configuration changes

CMakeLists.txt — added CUDA build options and settings (+21/-0)

  • Added a DNSS_ENABLE_CUDA option for optional CUDA builds
  • Enabled the CUDA language and linked against cudart
  • Enforced the C++17 standard

.github/workflows/pr-agent.yml — adjusted PR review settings (+2/-2)

  • Increased the number of reviewer code suggestions from 3 to 5
  • Removed the PR_DESCRIPTION__PUBLISH_LABELS setting

Documentation

README.md — strengthened the documentation and build guide (+50/-10)

  • Described the project overview and key improvements in detail
  • Added build commands (CPU/CUDA) and usage examples
  • Organized the algorithm notes and references

    @chatgpt-codex-connector
    You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

    github-actions bot commented Feb 8, 2026

    PR Description updated to latest commit (b7adea4)

    github-actions bot commented Feb 8, 2026

    PR Reviewer Guide 🔍

    (Review updated until commit 4bb5106)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Memory leak in CUDA error paths

    In DNSSComputeRotationalFeaturesCuda, when CUDA memory allocation or data transfer fails, the function frees allocated device pointers before returning false. However, if cudaMalloc succeeds for some pointers but fails for later ones (e.g., d_rot_returns fails after d_vertices, d_normals, d_rot_normals are allocated), the function frees all pointers including potentially uninitialized ones, and returns false. This is correct, but the error handling logic is duplicated multiple times. More critically, if cudaMemcpy for host-to-device transfer fails, the function frees device memory and returns false, but does not clear the output vectors (rotational_normals and rotational_returns) before returning, potentially leaving them in an inconsistent state if the caller assumes they were initialized on failure.
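One way to sketch the second point — clearing the outputs before any failure return — is a small helper that every error path routes through. This is illustrative only; the helper name and shape are not from the PR:

```cpp
#include <vector>

// Illustrative helper: on any failure path, leave the output vectors
// empty rather than partially written, then signal failure to the caller.
template <typename T, typename U>
bool failWithClearedOutputs(std::vector<T> &rotational_normals,
                            std::vector<U> &rotational_returns)
{
    rotational_normals.clear();
    rotational_returns.clear();
    return false;  // usage: return failWithClearedOutputs(*normals, *returns);
}
```

Each duplicated error block in DNSSComputeRotationalFeaturesCuda could then end with a single `return failWithClearedOutputs(...)` instead of leaving the caller to guess the state of the outputs.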

    Potential division by zero in CUDA kernel

    In the CUDA kernel computeRotationalFeaturesKernel, when computing normal_norm, if it is zero or invalid, it is set to 1.0f. However, this normalization is applied only for the purpose of computing unit vectors ndx, ndy, ndz. If the original normal vector is zero or invalid, setting normal_norm = 1.0f arbitrarily may lead to incorrect rotational feature computation. The CPU implementation in NSS.cpp uses normalizeSafe which returns a default vector (0,0,1) for zero-length inputs, but the CUDA kernel does not handle this case consistently.
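A minimal CPU-side sketch of the consistent handling the review asks for (struct and function names here are illustrative; under CUDA the function could be marked `__host__ __device__` so the kernel and NSS.cpp share one definition):

```cpp
#include <cmath>

struct Vec3f { float x, y, z; };

// Mirrors the documented CPU behavior of normalizeSafe: a zero-length
// or non-finite normal falls back to the default direction (0, 0, 1)
// instead of being divided by an arbitrarily substituted norm of 1.0f.
inline Vec3f safeUnitNormal(const Vec3f &n, float epsilon = 1e-12f)
{
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    if (!std::isfinite(len) || len <= epsilon)
    {
        return Vec3f{0.0f, 0.0f, 1.0f};
    }
    return Vec3f{n.x / len, n.y / len, n.z / len};
}
```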

    Inconsistent bucket index handling

    In DNSS::dualNormalSpaceSampling, when computing bucket indices for translational and rotational features, if the computed bucket index is out of bounds, it is clamped to 0. However, this may cause points to be incorrectly assigned to bucket 0, potentially affecting the sampling distribution. The code should either throw an error or log a warning when bucket indices are out of bounds, as this indicates a potential bug in the bucketing logic or configuration.
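A sketch of the stricter handling suggested above — surfacing the out-of-range index instead of silently clamping it to 0 (names are illustrative; the PR's actual bucketing code may differ):

```cpp
#include <cstdio>

// Report an out-of-range bucket index and return a sentinel, so the
// caller can skip the point or treat it as a configuration bug,
// rather than silently funneling the point into bucket 0.
inline int checkedBucketIndex(int index, int bucket_count)
{
    if (index < 0 || index >= bucket_count)
    {
        std::fprintf(stderr,
                     "bucket index %d out of range [0, %d)\n",
                     index, bucket_count);
        return -1;
    }
    return index;
}
```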

    github-actions bot commented Feb 8, 2026

    PR Description updated to latest commit (4bb5106)

    github-actions bot commented Feb 8, 2026

    Persistent review updated to latest commit 4bb5106

@eclipse0922 (Owner, Author)

    /gemini review

    @github-actions github-actions bot removed the Other label Feb 8, 2026
    github-actions bot commented Feb 8, 2026

    PR Description updated to latest commit (4bb5106)

    github-actions bot commented Feb 8, 2026

    Persistent review updated to latest commit 4bb5106

    github-actions bot commented Feb 8, 2026

    PR Code Suggestions ✨

    Latest suggestions up to 4bb5106

    Possible issue
    Add reasonable upper bound checks

    The sphericalBucketIndex function should validate that azimuth_bins and z_bins are
    reasonable upper bounds (e.g., less than a large threshold) to prevent integer
    overflow when computing the bucket index later.

    NSS.cpp [91-100]

     int sphericalBucketIndex(
     	const glm::fvec3 &raw_normal,
     	int azimuth_bins,
     	int z_bins,
     	bool fold_antipodal)
     {
    -	if (azimuth_bins <= 0 || z_bins <= 0)
    +	if (azimuth_bins <= 0 || z_bins <= 0 || azimuth_bins > 10000 || z_bins > 10000)
     	{
     		return -1;
     	}
    Suggestion importance[1-10]: 8


    Why: Adding upper bound checks prevents potential integer overflow when computing azimuth_index * z_bins + z_index. This addresses a potential correctness issue and improves robustness.

    Medium
    General
    Replace manual clamp with std::clamp

    The clampFloat function should use std::clamp for better readability and
    correctness. std::clamp is available since C++17 and handles edge cases more
    reliably than manual min/max chaining.

    NSS.cpp [18-21]

     float clampFloat(float value, float min_value, float max_value)
     {
    -	return std::max(min_value, std::min(value, max_value));
    +	return std::clamp(value, min_value, max_value);
     }
    Suggestion importance[1-10]: 7


    Why: Using std::clamp improves readability and is more idiomatic in C++17. However, the existing implementation is functionally correct, so the improvement is moderate in impact.

    Medium
    Rename misleading parameter name

    The function name computeRotationalReturnValue is misleading because it receives
    normalized_point but does not actually normalize it. The parameter name should be
    point since the function expects the raw point vector, not a normalized one.

    NSS.cpp [51-57]

    -float computeRotationalReturnValue(const glm::fvec3 &normalized_point, const glm::fvec3 &normal, float theta)
    +float computeRotationalReturnValue(const glm::fvec3 &point, const glm::fvec3 &normal, float theta)
     {
    -	const float point_norm = vectorLength(normalized_point);
    +	const float point_norm = vectorLength(point);
     	if (!std::isfinite(point_norm) || point_norm <= kEpsilon)
     	{
     		return 0.0f;
     	}
    Suggestion importance[1-10]: 6


    Why: Renaming normalized_point to point improves clarity since the function computes the norm itself and does not assume normalization. This is a minor readability improvement without functional impact.

    Low
    Reserve capacity before resize

    The computeCenteredAndScaledVertices function should reserve capacity before pushing
    elements to avoid unnecessary reallocations when the size is known upfront.

    NSS.cpp [139-148]

     void computeCenteredAndScaledVertices(
     	const std::vector<glm::fvec3> &vertices,
     	std::vector<glm::fvec3> &normalized_vertices)
     {
     	normalized_vertices.clear();
    -	normalized_vertices.resize(vertices.size(), glm::fvec3(0.0f));
     	if (vertices.empty())
     	{
     		return;
     	}
    +	normalized_vertices.reserve(vertices.size());
    +	normalized_vertices.resize(vertices.size(), glm::fvec3(0.0f));
    Suggestion importance[1-10]: 5


    Why: While reserving capacity before resizing can avoid reallocations, calling resize immediately after reserve negates the benefit since resize will still construct elements. The change offers minimal performance gain and is not a significant improvement.

    Low

    Previous suggestions

    Suggestions up to commit 4bb5106
    General
    Replace nested min/max with std::clamp

    The clampFloat function should use std::clamp for better readability and correctness
    in C++17. This eliminates potential issues with nested std::min/std::max calls and
    improves maintainability.

    NSS.cpp [18-21]

     float clampFloat(float value, float min_value, float max_value)
     {
    -	return std::max(min_value, std::min(value, max_value));
    +	return std::clamp(value, min_value, max_value);
     }
    Suggestion importance[1-10]: 7


    Why: Using std::clamp is more idiomatic in C++17 and improves readability, but the current implementation is functionally correct. This is a style/maintainability improvement rather than a correctness fix.

    Medium
    Rename function for clarity

    The function name computeRotationalReturnValue is misleading because it computes a
    rotational return value based on both the point and normal, not just a normalized
    point. Rename it to computeRotationalReturn for clarity and consistency with the
    algorithm description in the README.

    NSS.cpp [51-89]

    -float computeRotationalReturnValue(const glm::fvec3 &normalized_point, const glm::fvec3 &normal, float theta)
    +float computeRotationalReturn(const glm::fvec3 &point, const glm::fvec3 &normal, float theta)
     {
    -	const float point_norm = vectorLength(normalized_point);
    +	const float point_norm = vectorLength(point);
     	if (!std::isfinite(point_norm) || point_norm <= kEpsilon)
     	{
     		return 0.0f;
     	}
    Suggestion importance[1-10]: 6


    Why: The suggestion correctly identifies that the function name is misleading, but the proposed change to rename it to computeRotationalReturn and change the first parameter from normalized_point to point is inconsistent with how the function is called elsewhere in the code (where it receives normalized_vertices[i]). The parameter name change would require updating all call sites, making this a more invasive change than described.

    Low
    Suggestions up to commit b7adea4
    Possible issue
    Fix clampFloat argument order

    The clampFloat function is implemented using std::max and std::min, but the order of
    arguments is incorrect. The correct order should be std::min(std::max(value,
    min_value), max_value) to ensure proper clamping.

    NSS.cpp [18-21]

     float clampFloat(float value, float min_value, float max_value)
     {
    -	return std::max(min_value, std::min(value, max_value));
    +	return std::min(std::max(value, min_value), max_value);
     }
    Suggestion importance[1-10]: 8


Why: Note that std::max(min_value, std::min(value, max_value)) and std::min(std::max(value, min_value), max_value) are mathematically equivalent whenever min_value <= max_value, so this reordering is stylistic rather than a correctness fix; the suggested form does, however, read more naturally as "raise to the floor, then cap at the ceiling."

    Medium
    General
    Rename misleading function and parameter

    The function name computeRotationalReturnValue and its parameter normalized_point
    are misleading because the point is not actually normalized. This can cause
    confusion and potential misuse. Rename the parameter to point and update the
    function name to reflect that it computes rotational return for an arbitrary point.

    NSS.cpp [51-57]

    -float computeRotationalReturnValue(const glm::fvec3 &normalized_point, const glm::fvec3 &normal, float theta)
    +float computeRotationalReturnValue(const glm::fvec3 &point, const glm::fvec3 &normal, float theta)
     {
    -	const float point_norm = vectorLength(normalized_point);
    +	const float point_norm = vectorLength(point);
     	if (!std::isfinite(point_norm) || point_norm <= kEpsilon)
     	{
     		return 0.0f;
     	}
    Suggestion importance[1-10]: 5


    Why: While renaming normalized_point to point improves clarity since the point is not actually normalized, the function name computeRotationalReturnValue is already descriptive and the parameter renaming alone does not address a critical bug. This is a minor readability improvement rather than a correctness fix.

    Low
    Reorder empty check for safety

    The function computeCenteredAndScaledVertices does not validate that
    normalized_vertices is not null or that vertices is not empty before proceeding. Add
    explicit validation to prevent undefined behavior.

    NSS.cpp [139-148]

     void computeCenteredAndScaledVertices(
     	const std::vector<glm::fvec3> &vertices,
     	std::vector<glm::fvec3> &normalized_vertices)
     {
    +	if (vertices.empty())
    +	{
    +		normalized_vertices.clear();
    +		return;
    +	}
     	normalized_vertices.clear();
     	normalized_vertices.resize(vertices.size(), glm::fvec3(0.0f));
    -	if (vertices.empty())
    -	{
    -		return;
    -	}
    Suggestion importance[1-10]: 4


    Why: Moving the empty check before resize is a minor optimization but does not prevent undefined behavior since normalized_vertices is passed by reference and always valid. The current implementation is functionally correct, and the suggested change offers only marginal improvement in code style.

    Low
    Add bucket size validation

    The validation check for azimuth_bins and z_bins only returns -1, but doesn't
    provide any diagnostic information. Consider adding a check for fold_antipodal
    validity and potentially logging or throwing an exception for better debugging.

    NSS.cpp [91-100]

     int sphericalBucketIndex(
     	const glm::fvec3 &raw_normal,
     	int azimuth_bins,
     	int z_bins,
     	bool fold_antipodal)
     {
     	if (azimuth_bins <= 0 || z_bins <= 0)
     	{
     		return -1;
     	}
    +	if (azimuth_bins > 1024 || z_bins > 1024)
    +	{
    +		return -1;
    +	}
    Suggestion importance[1-10]: 3


    Why: Adding upper bound checks for azimuth_bins and z_bins (e.g., >1024) provides minimal safety improvement since such large values are unlikely in practice. The original validation only checks for positive values, which is sufficient for the algorithm's requirements. This is a low-impact enhancement.

    Low

@gemini-code-assist gemini-code-assist bot left a comment

    Code Review

This PR refactors the NSS and DNSS implementations into modern C++17 style, greatly improving readability, modularity, and configurability. The introduction of an Options struct, the addition of CUDA support, and the use of more efficient algorithms (e.g., lazy removal) are impressive. Overall, this is an excellent improvement.

I have left two suggestions that could further improve reusability and maintainability: one is to provide the math helper functions for the custom fvec3 in NSS.h as inline functions, and the other is to make CUDA memory management in dnss_cuda.cu safer by using RAII.

    Comment on lines +9 to +109
namespace glm
{
struct fvec3
{
    float x;
    float y;
    float z;

    constexpr fvec3()
        : x(0.0f), y(0.0f), z(0.0f)
    {
    }

    constexpr explicit fvec3(float value)
        : x(value), y(value), z(value)
    {
    }

    constexpr fvec3(float x_value, float y_value, float z_value)
        : x(x_value), y(y_value), z(z_value)
    {
    }

    float &operator[](int index)
    {
        return (&x)[index];
    }

    const float &operator[](int index) const
    {
        return (&x)[index];
    }

    fvec3 &operator+=(const fvec3 &rhs)
    {
        x += rhs.x;
        y += rhs.y;
        z += rhs.z;
        return *this;
    }

    fvec3 &operator-=(const fvec3 &rhs)
    {
        x -= rhs.x;
        y -= rhs.y;
        z -= rhs.z;
        return *this;
    }

    fvec3 &operator*=(float scalar)
    {
        x *= scalar;
        y *= scalar;
        z *= scalar;
        return *this;
    }

    fvec3 &operator/=(float scalar)
    {
        x /= scalar;
        y /= scalar;
        z /= scalar;
        return *this;
    }
};

inline fvec3 operator+(fvec3 lhs, const fvec3 &rhs)
{
    lhs += rhs;
    return lhs;
}

inline fvec3 operator-(fvec3 lhs, const fvec3 &rhs)
{
    lhs -= rhs;
    return lhs;
}

inline fvec3 operator-(const fvec3 &value)
{
    return fvec3(-value.x, -value.y, -value.z);
}

inline fvec3 operator*(fvec3 lhs, float scalar)
{
    lhs *= scalar;
    return lhs;
}

inline fvec3 operator*(float scalar, fvec3 rhs)
{
    rhs *= scalar;
    return rhs;
}

inline fvec3 operator/(fvec3 lhs, float scalar)
{
    lhs /= scalar;
    return lhs;
}
} // namespace glm

    medium

Providing a self-contained fvec3 implementation to reduce the dependency on GLM is a good approach. However, the related math helper functions (dotProduct, crossProduct, vectorLength, normalizeSafe) are defined in an anonymous namespace in NSS.cpp, which makes them hard to reuse.

Consider defining these functions as inline functions in the NSS.h header alongside the fvec3 struct. Other files that use fvec3 could then reuse them, reducing duplication and improving modularity. Defining them inside the glm namespace would also keep them consistent with the real GLM library.
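A sketch of what such inline helpers could look like in NSS.h (a minimal fvec3 stand-in is included so the snippet is self-contained; in the actual header they would sit next to the full fvec3 definition above, and normalizeSafe would join them in the same way):

```cpp
#include <cmath>

namespace glm
{
struct fvec3 { float x, y, z; };  // stand-in for the full struct above

inline float dotProduct(const fvec3 &a, const fvec3 &b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

inline fvec3 crossProduct(const fvec3 &a, const fvec3 &b)
{
    return fvec3{a.y * b.z - a.z * b.y,
                 a.z * b.x - a.x * b.z,
                 a.x * b.y - a.y * b.x};
}

inline float vectorLength(const fvec3 &v)
{
    return std::sqrt(dotProduct(v, v));
}
} // namespace glm
```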

    Comment on lines +103 to +217
bool DNSSComputeRotationalFeaturesCuda(
    const std::vector<glm::fvec3> &normalized_vertices,
    const std::vector<glm::fvec3> &normals,
    float theta_for_return_radians,
    std::vector<glm::fvec3> *rotational_normals,
    std::vector<float> *rotational_returns)
{
    if (rotational_normals == nullptr || rotational_returns == nullptr)
    {
        return false;
    }
    if (normalized_vertices.size() != normals.size())
    {
        return false;
    }

    const int point_count = static_cast<int>(normalized_vertices.size());
    rotational_normals->assign(normalized_vertices.size(), glm::fvec3(0.0f));
    rotational_returns->assign(normalized_vertices.size(), 0.0f);

    if (point_count == 0)
    {
        return true;
    }

    std::vector<float> vertices_flat(static_cast<std::size_t>(point_count) * 3u);
    std::vector<float> normals_flat(static_cast<std::size_t>(point_count) * 3u);
    for (int i = 0; i < point_count; ++i)
    {
        vertices_flat[3 * i + 0] = normalized_vertices[static_cast<std::size_t>(i)].x;
        vertices_flat[3 * i + 1] = normalized_vertices[static_cast<std::size_t>(i)].y;
        vertices_flat[3 * i + 2] = normalized_vertices[static_cast<std::size_t>(i)].z;
        normals_flat[3 * i + 0] = normals[static_cast<std::size_t>(i)].x;
        normals_flat[3 * i + 1] = normals[static_cast<std::size_t>(i)].y;
        normals_flat[3 * i + 2] = normals[static_cast<std::size_t>(i)].z;
    }

    float *d_vertices = nullptr;
    float *d_normals = nullptr;
    float *d_rot_normals = nullptr;
    float *d_rot_returns = nullptr;

    const std::size_t vector_bytes = vertices_flat.size() * sizeof(float);
    const std::size_t return_bytes = rotational_returns->size() * sizeof(float);

    if (!isCudaSuccess(cudaMalloc(&d_vertices, vector_bytes)) ||
        !isCudaSuccess(cudaMalloc(&d_normals, vector_bytes)) ||
        !isCudaSuccess(cudaMalloc(&d_rot_normals, vector_bytes)) ||
        !isCudaSuccess(cudaMalloc(&d_rot_returns, return_bytes)))
    {
        cudaFree(d_vertices);
        cudaFree(d_normals);
        cudaFree(d_rot_normals);
        cudaFree(d_rot_returns);
        return false;
    }

    if (!isCudaSuccess(cudaMemcpy(d_vertices, vertices_flat.data(), vector_bytes, cudaMemcpyHostToDevice)) ||
        !isCudaSuccess(cudaMemcpy(d_normals, normals_flat.data(), vector_bytes, cudaMemcpyHostToDevice)))
    {
        cudaFree(d_vertices);
        cudaFree(d_normals);
        cudaFree(d_rot_normals);
        cudaFree(d_rot_returns);
        return false;
    }

    const int threads = 256;
    const int blocks = (point_count + threads - 1) / threads;
    computeRotationalFeaturesKernel<<<blocks, threads>>>(
        d_vertices,
        d_normals,
        theta_for_return_radians,
        d_rot_normals,
        d_rot_returns,
        point_count);

    if (!isCudaSuccess(cudaGetLastError()) || !isCudaSuccess(cudaDeviceSynchronize()))
    {
        cudaFree(d_vertices);
        cudaFree(d_normals);
        cudaFree(d_rot_normals);
        cudaFree(d_rot_returns);
        return false;
    }

    std::vector<float> rot_normals_flat(vertices_flat.size());
    std::vector<float> rot_returns_flat(rotational_returns->size());

    if (!isCudaSuccess(cudaMemcpy(rot_normals_flat.data(), d_rot_normals, vector_bytes, cudaMemcpyDeviceToHost)) ||
        !isCudaSuccess(cudaMemcpy(rot_returns_flat.data(), d_rot_returns, return_bytes, cudaMemcpyDeviceToHost)))
    {
        cudaFree(d_vertices);
        cudaFree(d_normals);
        cudaFree(d_rot_normals);
        cudaFree(d_rot_returns);
        return false;
    }

    cudaFree(d_vertices);
    cudaFree(d_normals);
    cudaFree(d_rot_normals);
    cudaFree(d_rot_returns);

    for (int i = 0; i < point_count; ++i)
    {
        (*rotational_normals)[static_cast<std::size_t>(i)] = glm::fvec3(
            rot_normals_flat[3 * i + 0],
            rot_normals_flat[3 * i + 1],
            rot_normals_flat[3 * i + 2]);
        (*rotational_returns)[static_cast<std::size_t>(i)] = rot_returns_flat[static_cast<std::size_t>(i)];
    }

    return true;
}

    medium

CUDA memory is currently managed by calling cudaFree manually. This works, but cudaFree has to be repeated on every error path, which makes the code long and error-prone. The RAII (Resource Acquisition Is Initialization) pattern makes the code safer and more concise; for example, std::unique_ptr with a custom deleter automates the deallocation.

With this approach, memory is released automatically however the function exits (early return or exception), improving robustness and maintainability.

#include <memory>

struct CudaDeleter {
    void operator()(void* p) const {
        cudaFree(p);
    }
};

// ...

float *d_vertices_raw = nullptr;
if (!isCudaSuccess(cudaMalloc((void**)&d_vertices_raw, vector_bytes))) { return false; }
std::unique_ptr<float, CudaDeleter> d_vertices(d_vertices_raw);

// ... apply the same pattern to the other pointers ...
    @eclipse0922 eclipse0922 closed this Feb 8, 2026