@alexandraBara
Collaborator

@alexandraBara alexandraBara commented Jul 17, 2025

Generates a summary csv file by combining the result csv files from previous node-scraper runs.
Sample runs (with and without --output-path):

 node-scraper summary --search-path /home/alexbara/node-scraper --output-path /home/alexbara
  node-scraper summary --search-path /home/alexbara/node-scraper
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Log path: ./scraper_logs_therac54_2025_07_17-11_58_25_AM
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-09_05_00_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-09_41_40_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac54_2025_07_17-10_53_11_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-09_10_28_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-09_22_09_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-09_04_32_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-09_22_31_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac54_2025_07_17-10_52_49_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-09_03_13_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-09_03_41_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-09_05_30_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-09_19_19_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac54_2025_07_17-10_38_39_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/configs/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Reading: /home/alexbara/node-scraper/scraper_logs_therac55_2025_07_17-08_48_51_AM/errorscraper.csv
  2025-07-17 11:58:25 CDT       INFO               nodescraper | Data written to csv file: /home/alexbara/node-scraper/summary.csv

This will generate a new file, /home/alexbara/node-scraper/summary.csv.
Sample summary.csv file:

nodename,plugin,status,timestamp,message
therac55,StoragePlugin,OK,2025_07_17-09_05_00_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-09_41_40_AM,Plugin tasks completed successfully
therac54,BiosPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,CmdlinePlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,DimmPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,DkmsPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,DmesgPlugin,ERROR,2025_07_17-10_53_11_AM,Analysis error: task detected errors (22 warnings|25 errors)
therac54,KernelPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,MemoryPlugin,ERROR,2025_07_17-10_53_11_AM,Analysis error: Memory usage is more than the maximum allowed used memory! (1 errors)
therac54,OsPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,RocmPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,StoragePlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,UptimePlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-09_10_28_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-09_04_32_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-09_22_31_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-09_03_13_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-09_03_41_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-09_05_30_AM,Plugin tasks completed successfully
therac54,BiosPlugin,OK,2025_07_17-10_38_39_AM,Plugin tasks completed successfully
therac54,CmdlinePlugin,OK,2025_07_17-10_38_39_AM,Plugin tasks completed successfully
therac54,DimmPlugin,OK,2025_07_17-10_38_39_AM,Plugin tasks completed successfully
therac54,DkmsPlugin,OK,2025_07_17-10_38_39_AM,Plugin tasks completed successfully
therac54,DmesgPlugin,ERROR,2025_07_17-10_38_39_AM,Analysis error: task detected errors (22 warnings|25 errors)
therac54,KernelPlugin,OK,2025_07_17-10_38_39_AM,Plugin tasks completed successfully
therac54,MemoryPlugin,ERROR,2025_07_17-10_38_39_AM,Analysis error: Memory usage is more than the maximum allowed used memory! (1 errors)
therac54,OsPlugin,OK,2025_07_17-10_38_39_AM,Plugin tasks completed successfully
therac54,RocmPlugin,OK,2025_07_17-10_38_39_AM,Plugin tasks completed successfully
therac54,StoragePlugin,OK,2025_07_17-10_38_39_AM,Plugin tasks completed successfully
therac54,UptimePlugin,OK,2025_07_17-10_38_39_AM,Plugin tasks completed successfully
therac55,KernelPlugin,OK,2025_07_17-08_47_00_AM,Plugin tasks completed successfully
therac55,BiosPlugin,OK,2025_07_17-08_47_00_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-08_48_51_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-08_58_49_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-08_59_07_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-09_02_21_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-09_03_26_AM,Plugin tasks completed successfully
therac54,BiosPlugin,OK,2025_07_17-10_41_10_AM,Plugin tasks completed successfully
therac54,CmdlinePlugin,OK,2025_07_17-10_41_10_AM,Plugin tasks completed successfully
therac54,DimmPlugin,OK,2025_07_17-10_41_10_AM,Plugin tasks completed successfully
therac54,DkmsPlugin,OK,2025_07_17-10_41_10_AM,Plugin tasks completed successfully
therac54,DmesgPlugin,ERROR,2025_07_17-10_41_10_AM,Analysis error: task detected errors (22 warnings|25 errors)
therac54,KernelPlugin,OK,2025_07_17-10_41_10_AM,Plugin tasks completed successfully
therac54,MemoryPlugin,ERROR,2025_07_17-10_41_10_AM,Analysis error: Memory usage is more than the maximum allowed used memory! (1 errors)
therac54,OsPlugin,OK,2025_07_17-10_41_10_AM,Plugin tasks completed successfully
therac54,RocmPlugin,OK,2025_07_17-10_41_10_AM,Plugin tasks completed successfully
therac54,StoragePlugin,OK,2025_07_17-10_41_10_AM,Plugin tasks completed successfully
therac54,UptimePlugin,OK,2025_07_17-10_41_10_AM,Plugin tasks completed successfully
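
To illustrate the column layout above, here is a small sketch that tallies statuses per node using Python's standard csv module. The three data rows are copied from the sample; the `sample` and `counts` names are only illustrative and are not part of node-scraper.

```python
import csv
import io
from collections import Counter

# A few rows copied from the sample summary.csv above.
sample = """nodename,plugin,status,timestamp,message
therac54,DmesgPlugin,ERROR,2025_07_17-10_53_11_AM,Analysis error: task detected errors (22 warnings|25 errors)
therac54,KernelPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac55,StoragePlugin,OK,2025_07_17-09_05_00_AM,Plugin tasks completed successfully
"""

# Count how many rows each (node, status) pair has.
counts = Counter()
for row in csv.DictReader(io.StringIO(sample)):
    counts[(row["nodename"], row["status"])] += 1

for (node, status), n in sorted(counts.items()):
    print(node, status, n)
```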

Collaborator

@landrews-amd landrews-amd left a comment

lgtm, one small suggestion

Comment on lines 164 to 169
summary_parser.add_argument(
    "--summary_path",
    dest="summary_path",
    type=log_path_arg,
    help="Path to node-scraper results. Generates summary csv file in summary.csv.",
)
Collaborator

I think it would be good to have separate args for the search path and the output path.

  • Search path would be the location of the result files to process
  • Output path would be where the summary is written to (defaulting to cwd)
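
A minimal sketch of the split suggested above, assuming plain argparse. The `--search-path` and `--output-path` flag names match the sample run in the PR description; `build_parser`, `required=True`, and the way the cwd default is wired are assumptions for illustration, and the `type=log_path_arg` validator from the reviewed snippet is left out to keep the sketch self-contained. This is not the code that was merged.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser with separate search and output paths."""
    parser = argparse.ArgumentParser(prog="node-scraper")
    subparsers = parser.add_subparsers(dest="subcmd")

    summary_parser = subparsers.add_parser(
        "summary", help="Combine result csv files into summary.csv"
    )
    # Where to look for result files from previous runs.
    summary_parser.add_argument(
        "--search-path",
        dest="search_path",
        required=True,  # assumption: the search path must be given
        help="Location of the node-scraper result files to process.",
    )
    # Where to write the summary; falls back to the current working directory.
    summary_parser.add_argument(
        "--output-path",
        dest="output_path",
        default=os.getcwd(),
        help="Directory the summary csv is written to (default: cwd).",
    )
    return parser


args = build_parser().parse_args(["summary", "--search-path", "/tmp/results"])
print(args.search_path, args.output_path)
```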

Collaborator

@alexandraBara what do you think of this suggested approach?

Collaborator Author

I think this is good, I'll update.

@bargajda-amd
Collaborator

@alexandraBara, I might have seen it already, but please remind me what the errorscraper.csv files look like. Are they created automatically?

@bargajda-amd
Collaborator

@alexandraBara, does status=OK mean that the output of a plugin is equal to the corresponding config file entry?

@alexandraBara
Collaborator Author

@alexandraBara, I might have seen it already, but please remind me what the errorscraper.csv files look like. Are they created automatically?

It would be the results of one run of node-scraper. In the case of the summary file above, one of those files is this:

therac54,CmdlinePlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,DimmPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,DkmsPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,DmesgPlugin,ERROR,2025_07_17-10_53_11_AM,Analysis error: task detected errors (22 warnings|25 errors)
therac54,KernelPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,MemoryPlugin,ERROR,2025_07_17-10_53_11_AM,Analysis error: Memory usage is more than the maximum allowed used memory! (1 errors)
therac54,OsPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,RocmPlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,StoragePlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully
therac54,UptimePlugin,OK,2025_07_17-10_53_11_AM,Plugin tasks completed successfully

Notice it's the same node and the same timestamp. Also, I am going to change the name from errorscraper.csv to nodescraper.csv.

@alexandraBara
Collaborator Author

@alexandraBara, does status=OK mean that the output of a plugin is equal to the corresponding config file entry?

Yes. If you ran this command:

node-scraper --plugin-config myconfig.json

and the result says OK, then the data collected matches the data expected from myconfig.json.
Similarly, if you ran it like this:

node-scraper run-plugins Plugin1 Plugin2

a result of OK means the collected data passed the analysis phase. So really, the meaning of "OK" depends on how you ran the tool.

fieldnames = ["nodename", "plugin", "status", "timestamp", "message"]
all_rows = []

pattern = os.path.join(base_path, "**", "errorscraper.csv")
Collaborator

Should be nodescraper.csv
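
For context, a rough sketch of how the renamed result files might be gathered, assuming the standard glob and csv modules. `collect_rows` and `FIELDNAMES` are illustrative names, and the assumption that every result file starts with a header row is mine; the merged implementation may differ.

```python
import csv
import glob
import os
from typing import Dict, List

FIELDNAMES = ["nodename", "plugin", "status", "timestamp", "message"]


def collect_rows(base_path: str, filename: str = "nodescraper.csv") -> List[Dict[str, str]]:
    """Gather rows from every file named `filename` under base_path."""
    # "**" only matches nested directories when recursive=True is passed.
    pattern = os.path.join(base_path, "**", filename)
    all_rows: List[Dict[str, str]] = []
    for path in sorted(glob.glob(pattern, recursive=True)):
        with open(path, newline="") as f:
            # Assumption: each result file has a header row with these columns.
            for row in csv.DictReader(f):
                all_rows.append({name: row.get(name) or "" for name in FIELDNAMES})
    return all_rows
```

Keeping the filename as a parameter means a rename like errorscraper.csv to nodescraper.csv touches one default value instead of a hard-coded pattern.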

Collaborator

@landrews-amd landrews-amd left a comment

One last small suggestion, otherwise LGTM

logger.error("No data rows found in matched CSV files.")
return

if not output_path:
Collaborator

If output_path can be None, its type hint should be updated accordingly to Optional[...] (for example, Optional[str] if it is currently annotated as str).
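
A small sketch of what that could look like, assuming the path is passed as a string and that a missing value falls back to the current working directory as suggested earlier in the review. `resolve_summary_file` is a hypothetical helper name, not a function from this PR.

```python
import os
from typing import Optional


def resolve_summary_file(output_path: Optional[str] = None) -> str:
    """Return the summary.csv path, defaulting to the current directory."""
    if not output_path:
        # None (or an empty string) falls back to the cwd.
        output_path = os.getcwd()
    return os.path.join(output_path, "summary.csv")


print(resolve_summary_file("/home/alexbara"))
```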

@alexandraBara alexandraBara merged commit 0589a6d into development Jul 30, 2025
5 checks passed
@alexandraBara alexandraBara deleted the alex_summary branch July 30, 2025 15:27

4 participants