Extra details added for SMART monitoring setup#2210
Extra details added for SMART monitoring setup#2210oneswig wants to merge 1 commit intostackhpc/2025.1from
Conversation
Based on recent experiences with getting NVME and SSD monitoring setup.
There was a problem hiding this comment.
Code Review
This pull request adds valuable documentation for SMART monitoring setup, including details on handling custom Prometheus Node Exporter parameters and a new feature for monitoring Drive Writes Per Day (DWPD). The changes are clear and enhance the user's understanding. I've found one point of inconsistency in the new documentation regarding which file is updated for DWPD ratings, for which I've left a suggestion.
| This flag scans for NVME/SSD devices in the system and creates a new | ||
| file, ``dwpd-ratings.yml``, in the directory of the current environment. |
There was a problem hiding this comment.
The documentation states that running the playbook with create_dwpd_ratings=true creates a new dwpd-ratings.yml file. However, the release note releasenotes/notes/rated-dwpd-40526e85e24ef7ea.yaml suggests that it populates a new section in the stackhpc-monitoring.yml file. This is a contradiction that could confuse users. Assuming the release note is correct, the documentation should be updated to reflect that stackhpc-monitoring.yml is modified.
| This flag scans for NVME/SSD devices in the system and creates a new | |
| file, ``dwpd-ratings.yml``, in the directory of the current environment. | |
| This flag scans for NVME/SSD devices in the system and populates a new | |
| section in the ``stackhpc-monitoring.yml`` file with the discovered drive models. |
Based on recent experiences with getting NVME and SSD monitoring setup.