Extra details added for SMART monitoring setup #2210

Open
oneswig wants to merge 1 commit into stackhpc/2025.1 from smart-docs

@oneswig oneswig commented Mar 12, 2026

Based on recent experience getting NVMe and SSD monitoring set up.

@oneswig oneswig requested a review from technowhizz March 12, 2026 21:51
@oneswig oneswig requested a review from a team as a code owner March 12, 2026 21:51

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds valuable documentation for SMART monitoring setup, including details on handling custom Prometheus Node Exporter parameters and a new feature for monitoring Drive Writes Per Day (DWPD). The changes are clear and enhance the user's understanding. I've found one point of inconsistency in the new documentation regarding which file is updated for DWPD ratings, for which I've left a suggestion.
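For readers unfamiliar with the metric, DWPD is conventionally the volume of data written per day divided by the drive's capacity. The following is a minimal sketch of that arithmetic only; the function name, the example figures, and the rated value are hypothetical and are not taken from the pull request or its playbooks.

```python
def drive_writes_per_day(bytes_written: int, capacity_bytes: int, days: float) -> float:
    """Average number of full-drive writes per day over an observation window."""
    if capacity_bytes <= 0 or days <= 0:
        raise ValueError("capacity_bytes and days must be positive")
    return bytes_written / (capacity_bytes * days)


# Hypothetical drive: 1 TB capacity, 3 TB written over a 30-day window.
observed = drive_writes_per_day(3 * 10**12, 10**12, 30)

# Hypothetical vendor rating, used only to show the comparison.
rated_dwpd = 1.0
print(f"observed DWPD {observed:.2f}, {observed / rated_dwpd:.0%} of rated")
```

Comparing the observed figure with a per-model rated figure is presumably why the ratings need to be recorded somewhere in the configuration, which is the point the review comment below addresses.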

Comment on lines +103 to +104
This flag scans for NVME/SSD devices in the system and creates a new
file, ``dwpd-ratings.yml``, in the directory of the current environment.

medium

The documentation states that running the playbook with create_dwpd_ratings=true creates a new dwpd-ratings.yml file. However, the release note releasenotes/notes/rated-dwpd-40526e85e24ef7ea.yaml suggests that it populates a new section in the stackhpc-monitoring.yml file. This is a contradiction that could confuse users. Assuming the release note is correct, the documentation should be updated to reflect that stackhpc-monitoring.yml is modified.

Suggested change
- This flag scans for NVME/SSD devices in the system and creates a new
- file, ``dwpd-ratings.yml``, in the directory of the current environment.
+ This flag scans for NVME/SSD devices in the system and populates a new
+ section in the ``stackhpc-monitoring.yml`` file with the discovered drive models.
