Skip to content

Conversation

@mdavis36
Copy link
Collaborator

@mdavis36 mdavis36 commented Dec 26, 2025

Summary

This PR aims to aid new users in setting up and installing nvFuser successfully. This is done by providing user a comprehensive python report (based on #5609) as early as possible in the build process:

  • Clear nvFuser library dependencies & constraints
  • Minimum version (when applicable)
  • Optional vs Required dependencies
  • Actionable output to user on why a requirement is enforced and how to rectify it.

Differences (This PR vs #5609)

The outcome of the report is determined by CMake's evaluation of the constraints we place on requirements. The report has no effect on the ability to build nvFuser. All failure logic is define by the CMake system.

The report scripts are used to aid in formatting and printing pertinent information to the user. This is done by directly referencing CMake variables in python and allowing python to handle complicated string manipulation and formatting (which CMake is really bad at...).

The contents of the help messages largely remains the same as #5609. Giving user guidance based on their build platform.

CMake Changes

  • cmake/DependencyRequirements.cmake is the single source of truth for version requirements, components and the state of OPTIONAL for each dependency.
  • Option NVFUSER_ENABLE_DEPENDENCY_REPORT is by default ON. If this is set OFF then dependencies will be evaluated as "normal" in CMake and the build configuration will exit of the first failure.
  • Each requirements logic is defined in it's own cmake/deps/handle_<name>.cmake file for some organization/clarity.

Success Case

  • CMake dependency evaluation happens silently and is written to buffer.
  • Python report is generated as early as possible.
    • On first run: CMake will always look for compilers for the LANGUAGES the project is built for first - this cant be skipped AFAIK.
    • On subsequent runs: the python report is displayed immediately (compiler information is cached).
  • CMake output is dumped to the user for detailed reporting (this is the same as when running with NVFUSER_ENABLE_DEPENDECY_REPORT=Off)
image

Failure Case (example : pybind11 version too low)

Report fails with installation instructions for users.

  • Does not FATAL_ERROR when pybind11 mismatches.
  • CMake still dumps the output evaluating ALL dependencies
  • CMake exits after reporting detailed output.
image

@github-actions
Copy link

github-actions bot commented Dec 26, 2025

Review updated until commit 55bce4e

Description

  • Refactored CMake dependency management system with comprehensive Python reporting

  • Added individual dependency handler files (handle_*.cmake) for better organization

  • Implemented JSON-based dependency status reporting with actionable user guidance

  • Created Python reporting scripts for formatting and displaying dependency information

  • Removed legacy dependency files (Dependencies.cmake, FlatBuffers.cmake) and consolidated logic

Changes walkthrough

Relevant files

PR Reviewer Guide

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 Security concerns

No major security concerns identified. The JSON export functionality should be reviewed to ensure only safe variables are exported.

⚡ Recommended focus areas for review
JSON Export Security

The export_dependency_json function exports ALL CMake variables to JSON without filtering. This could potentially expose sensitive information like paths, compiler flags, or system details. Consider implementing a whitelist of safe variables to export.

function(export_dependency_json output_file)
  # Get all CMake variables
  get_cmake_property(all_vars VARIABLES)

  # Write JSON file with flat variable dict
  file(WRITE "${output_file}" "{\n")
  file(APPEND "${output_file}" "  \"cmake_vars\": {\n")

  # Export all variables (sorted for consistency)
  list(SORT all_vars)
  list(LENGTH all_vars var_count)
  set(var_index 0)
  foreach(var ${all_vars})
    set(value "${${var}}")
    # Escape for JSON strings
    string(REPLACE "\\" "\\\\" value "${value}")
    string(REPLACE "\"" "\\\"" value "${value}")
    string(REPLACE "\n" "\\n" value "${value}")
    string(REPLACE "\t" "\\t" value "${value}")
    string(REPLACE "\r" "\\r" value "${value}")

    # Add comma if not last item
    math(EXPR var_index "${var_index} + 1")
    if(var_index LESS var_count)
      file(APPEND "${output_file}" "    \"${var}\": \"${value}\",\n")
    else()
      file(APPEND "${output_file}" "    \"${var}\": \"${value}\"\n")
    endif()
  endforeach()

  file(APPEND "${output_file}" "  }\n")
  file(APPEND "${output_file}" "}\n")
endfunction()
Error Handling Robustness

The dependency reporter should handle cases where CMake variables are missing or malformed more gracefully. Currently, accessing cmake_vars.get() could return None, but the requirement classes don't consistently handle this case.

    cmake_vars = self._load_cmake_vars(deps_path)

    self.colors = Colors()
    self.platform_info = detect_platform()

    # Create requirement objects - each class defines its own name and variable names
    self.requirements = []
    self.requirements.append(NinjaRequirement(cmake_vars))
    self.requirements.append(GitSubmodulesRequirement(cmake_vars))
    self.requirements.append(CompilerRequirement(cmake_vars))
    self.requirements.append(PythonRequirement(cmake_vars))
    self.requirements.append(CUDAToolkitRequirement(cmake_vars))
    self.requirements.append(TorchRequirement(cmake_vars))
    self.requirements.append(Pybind11Requirement(cmake_vars))
    self.requirements.append(LLVMRequirement(cmake_vars))
    self.requirements.append(NVMMHRequirement(cmake_vars))

def _load_cmake_vars(self, deps_path: Path) -> Dict:
    """Load CMake variables from JSON file"""
    try:
        with open(deps_path, "r") as f:
            data = json.load(f)
            return data.get("cmake_vars", {})
    except FileNotFoundError:
        print(f"Error: {deps_path} not found", file=sys.stderr)
        sys.exit(1)
    except Exception as e:
        print(f"Error loading dependencies: {e}", file=sys.stderr)
        sys.exit(1)
CUDA Version Parsing

The regex pattern for parsing Torch CUDA version assumes a specific format. Consider adding validation for edge cases where the version string might not match the expected pattern, as this could cause parsing failures.

string(REGEX MATCH "^([0-9]+\\.[0-9]+)" torch_cuda_major_minor "${torch_cuda_version}")

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Dec 26, 2025

Greptile Summary

This PR introduces a comprehensive CMake dependency validation and reporting system that helps users diagnose and resolve build setup issues early. The implementation adds a Python-based reporting layer on top of CMake's dependency detection to provide clear, actionable error messages with installation instructions.

Core Changes:

  • Log capture system (LogCapture.cmake): Overrides CMake's message() function to buffer output during dependency checking, allowing Python report to display first
  • Dependency handlers (cmake/deps/handle_*.cmake): Modular handlers for each dependency (Compiler, Python, CUDA, Torch, LLVM, etc.) that perform validation and set status variables
  • JSON export (DependencyUtilities.cmake): Exports all CMake variables to JSON for Python consumption
  • Python reporting (check_dependencies.py + requirement classes): OOP-based system that formats status, detects failures, and generates platform-specific help text
  • Optional reporting mode: NVFUSER_ENABLE_DEPENDENCY_REPORT option (default ON) enables the enhanced reporting; when OFF, falls back to standard CMake errors

Key Features:

  • Early validation: Shows comprehensive report before cryptic linker errors
  • CUDA constraint validation: Validates Torch CUDA version matches system CUDA Toolkit (critical for runtime correctness)
  • Colored output with clear status indicators (✓/✗/○ for success/failure/optional)
  • Platform-aware installation instructions
  • Deferred failure: Collects all dependency issues before failing, rather than stopping at first error

Technical Quality:
The implementation is generally well-structured with good separation of concerns between CMake validation logic and Python presentation. The OOP requirement classes are clean and extensible. However, there are some edge cases in error handling and delimiter parsing that warrant attention.

Confidence Score: 4/5

  • This PR is safe to merge with minor considerations
  • The implementation is well-structured with good separation of concerns. Core validation logic is sound and edge cases are mostly handled. Previous review concerns about delimiter parsing, JSON escaping, and status handling have been addressed or acknowledged by developers as acceptable trade-offs. The CUDA constraint validation is a valuable addition that prevents runtime errors. Main considerations: (1) delimiter edge case in LogCapture is acceptable per developer, (2) JSON escaping handles common cases adequately, (3) some error paths could be more defensive but won't cause build failures in practice
  • cmake/LogCapture.cmake and cmake/DependencyUtilities.cmake deserve extra scrutiny during testing to ensure log parsing and JSON export work correctly across different platforms and edge cases

Important Files Changed

Filename Overview
cmake/LogCapture.cmake Implements message buffering via function override. Validation logic on line 77 correctly checks for missing delimiter before substring operations. Edge case: delimiter could theoretically appear in message content, but developer noted this is acceptable risk.
cmake/DependencyUtilities.cmake Core validation and JSON export logic. JSON escaping handles common cases (backslash, quotes, newlines) adequately for the use case. Version comparison could fail on malformed versions but inputs are controlled by find_package.
cmake/deps/handle_torch.cmake Validates Torch version and CUDA constraint. Lines 70-73 handle regex match failure correctly. Line 92 handles case when CUDAToolkit not found. The CUDA version mismatch detection (line 74-80) is critical for preventing runtime errors.
python/tools/check_dependencies.py Clean orchestrator that loads JSON and delegates to requirement classes. Error handling at lines 70-74 is adequate. Exit code logic (line 179) is correct - Python doesn't determine build failure, only formats output.
python/tools/prereqs/requirements/base.py Well-designed base classes with clear abstraction. _to_bool method (lines 65-71) correctly handles CMake boolean strings. Version formatting logic is consistent across SUCCESS/NOT_FOUND/INCOMPATIBLE states.
python/tools/prereqs/requirements/torch.py Implements Torch-specific validation including CUDA constraint checking. The is_failure() override (lines 111-121) correctly checks both base version and CUDA constraint. Help text provides clear installation guidance.
CMakeLists.txt Integrates dependency validation system. Lines 77-104 implement the validation flow with proper log capture and reporting. enable_language(CUDA) on line 7 happens before log capture, but developer confirmed this is acceptable as CUDA is always required.
cmake/DependencyRequirements.cmake Single source of truth for version requirements. Clear structure with version minimums and optional flags. Well-documented with comments explaining each field's purpose.

Sequence Diagram

sequenceDiagram
    participant User
    participant CMake
    participant LogCapture
    participant DepHandlers as Dependency Handlers
    participant DepUtils as DependencyUtilities
    participant Python as check_dependencies.py
    participant ReqClasses as Requirement Classes

    User->>CMake: Run cmake
    CMake->>CMake: enable_language(CUDA)
    CMake->>CMake: include DependencyRequirements.cmake
    CMake->>CMake: include DependencyUtilities.cmake
    CMake->>LogCapture: include LogCapture.cmake
    
    alt NVFUSER_ENABLE_DEPENDENCY_REPORT=ON
        CMake->>LogCapture: start_capture()
        LogCapture->>LogCapture: Override message() function
        LogCapture->>LogCapture: Set LOG_CAPTURE_MODE=TRUE
        
        loop For each dependency
            CMake->>DepHandlers: handle_<dependency>()
            DepHandlers->>DepHandlers: find_package() or custom validation
            DepHandlers->>DepUtils: set_dependency_report_status()
            DepUtils->>DepUtils: Check version compatibility
            DepUtils->>DepUtils: Set NVFUSER_REQUIREMENT_<name>_STATUS
            DepUtils->>DepUtils: Update NVFUSER_DEPENDENCIES_OK if failure
            DepHandlers-->>LogCapture: message() calls captured to buffer
        end
        
        CMake->>LogCapture: stop_capture(DEP_LOGS)
        LogCapture->>LogCapture: Set LOG_CAPTURE_MODE=FALSE
        LogCapture-->>CMake: Return captured logs
        
        CMake->>DepUtils: report_dependencies()
        DepUtils->>DepUtils: export_dependency_json(nvfuser_dependencies.json)
        DepUtils->>DepUtils: Write all CMake variables to JSON
        DepUtils->>Python: execute_process(check_dependencies.py)
        
        Python->>Python: Load JSON file
        Python->>ReqClasses: Create requirement objects for each dependency
        ReqClasses->>ReqClasses: Parse CMake variables (status, version, found, optional)
        Python->>Python: generate_report()
        Python->>ReqClasses: format_status_line() for each requirement
        ReqClasses-->>Python: Return colored status lines
        Python->>User: Print formatted dependency report
        
        alt Any failures detected
            Python->>ReqClasses: generate_help() for failed requirements
            ReqClasses-->>Python: Return installation instructions
            Python->>User: Print help section with instructions
        end
        
        Python-->>DepUtils: Exit with status code
        DepUtils->>User: Display Python output
        
        CMake->>LogCapture: dump_captured_logs(DEP_LOGS)
        LogCapture->>LogCapture: Parse buffer entries (TYPE<<<DELIM>>>CONTENT)
        LogCapture->>User: Print detailed CMake output
        
        alt NVFUSER_DEPENDENCIES_OK=FALSE
            CMake->>User: FATAL_ERROR with failure message
        end
    else NVFUSER_ENABLE_DEPENDENCY_REPORT=OFF
        loop For each dependency
            CMake->>DepHandlers: handle_<dependency>()
            DepHandlers->>DepHandlers: find_package(REQUIRED) - fails immediately
        end
    end
Loading

@mdavis36
Copy link
Collaborator Author

!test

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

35 files reviewed, 5 comments

Edit Code Review Agent Settings | Greptile

Copy link
Collaborator

@wujingyue wujingyue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

INCOMPATIBLE = "INCOMPATIBLE"


class Requirement(ABC):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's ABC used here for? I remember using it a lot in Python 2.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abstract Base Class module for python. The details are unfortunately lost on me, it looks like there is something called Protocols now in python but I'm more familiar with this pattern

python/utils.py Outdated
"-B",
cmake_build_dir,
]
if config.cutlass_max_jobs > 0:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 453 is the reason I prefer always -D to conditional -D.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR now builds CUTLASS by default, so once this hits main everyone will be building the CUTLASS lib as a part of their workflow (unless they explicitly disable it). I set the default from 0 to 2 in CMake (per the recommendation in the comments) as I started running out of memory. I don't think everyone want to set the max job flag for cutlass on every build. --- We can go back to ALWAYS passing this option here if we'd like, but we then need to maintain the default in 2 places.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to how the report interacts with CMake for this PR, I am of the belief that CMake should generally be the source of truth for builds (since CMake is responsible for orchestrating compilation) and that the python setup process should just be an interface to CMake from python for overriding defaults. Logic that is performed by the python setup process should focus more on python build/installation related tasks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I set the default from 0 to 2 in CMake (per the recommendation in the comments) as I started running out of memory.

SGTM!

I am of the belief that CMake should generally be the source of truth for builds (since CMake is responsible for orchestrating compilation)

Makes sense.

we then need to maintain the default in 2 places.

Yes, that's bad.

With this PR, I'm seeing two different default values already: config.cutlass_max_jobs=0 and CUTLASS_MAX_JOBS=2. This is confusing.

What do you think about the following contract? All fields in BuildConfig are None by default. This function (def cmake) only feed configs that are not None to the cmake command. Other configs will stay as they are in CMakeCache.txt.

(Doesn't have to be in this PR of course.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'd like to come back to this specifically at some point and enforce the defaults purely in CMake where we can. I think there needs to be some more standardization/ consistency between environment and build option naming too. (tagging w/ an issue)

The default here of 0 is a little clunky, I'll update just this option to use a None check for this PR.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

35 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Copy link
Collaborator

@xwang233 xwang233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Once the internal CI builds pass, we're ready to go. Thanks for working on this.

@mdavis36
Copy link
Collaborator Author

!test

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

35 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

mdavis36 and others added 3 commits January 14, 2026 14:43
Co-authored-by: Jingyue Wu <wujingyue@gmail.com>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

35 files reviewed, 4 comments

Edit Code Review Agent Settings | Greptile

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

36 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

36 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@mdavis36
Copy link
Collaborator Author

!build

@mdavis36
Copy link
Collaborator Author

!build

@mdavis36 mdavis36 merged commit f086edc into main Jan 15, 2026
14 of 15 checks passed
@mdavis36 mdavis36 deleted the md/cmake-deps-5609-integration branch January 15, 2026 17:39

if os_type == "Linux":
if platform_info.get("ubuntu_based"):
print(" # Ubuntu/Debian (LLVM APT repository):")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you not trust apt install clang-<version>? cc @mdavis36

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was brought over from 5609. I typically do, this could probably be expanded to mention at least trying it. However, this is the recommended way from llvm and would additionally allow someone on an older debian version to install a very modern llvm version without having to mess with updating repositories.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants