Description
As part of running a Stan model, we generate a range of output files: by default, -stdout.txt and .csv files, which contain the process output and the inference output, respectively. Optionally, we can also write diagnostic and profile files, which track latent dynamics and profiling information, if present.
The current logic will produce a set of output files (under the default 4 chains, one process per chain) like this:
-rw-r--r-- 1 amas amas 2367 Nov 5 15:28 bernoulli-20251105152812_0-stdout.txt
-rw-r--r-- 1 amas amas 88052 Nov 5 15:28 bernoulli-20251105152812_1.csv
-rw-r--r-- 1 amas amas 2357 Nov 5 15:28 bernoulli-20251105152812_1-stdout.txt
-rw-r--r-- 1 amas amas 88050 Nov 5 15:28 bernoulli-20251105152812_2.csv
-rw-r--r-- 1 amas amas 2358 Nov 5 15:28 bernoulli-20251105152812_2-stdout.txt
-rw-r--r-- 1 amas amas 88090 Nov 5 15:28 bernoulli-20251105152812_3.csv
-rw-r--r-- 1 amas amas 2358 Nov 5 15:28 bernoulli-20251105152812_3-stdout.txt
-rw-r--r-- 1 amas amas 86919 Nov 5 15:28 bernoulli-20251105152812_4.csv
-rw-r--r-- 1 amas amas 150457 Nov 5 15:28 bernoulli-20251105152812-diagnostic_1.csv
-rw-r--r-- 1 amas amas 150444 Nov 5 15:28 bernoulli-20251105152812-diagnostic_2.csv
-rw-r--r-- 1 amas amas 150808 Nov 5 15:28 bernoulli-20251105152812-diagnostic_3.csv
-rw-r--r-- 1 amas amas 149214 Nov 5 15:28 bernoulli-20251105152812-diagnostic_4.csv
-rw-r--r-- 1 amas amas 190 Nov 5 15:28 bernoulli-20251105152812-profile_1.csv
-rw-r--r-- 1 amas amas 189 Nov 5 15:28 bernoulli-20251105152812-profile_2.csv
-rw-r--r-- 1 amas amas 190 Nov 5 15:28 bernoulli-20251105152812-profile_3.csv
-rw-r--r-- 1 amas amas 190 Nov 5 15:28 bernoulli-20251105152812-profile_4.csv
For every file except the stdout files, we use the chain_id to index the file itself, so we get a mismatch: the stdout files are indexed 0-3 and the rest are indexed 1-4. This can cause confusion, because someone could reasonably think that the file suffixed by _1-stdout.txt corresponds to the output in _1.csv, when the matching file is actually _0-stdout.txt.
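To make the off-by-one concrete, here is a minimal sketch that reproduces the file names from the listing above. The function name current_output_names and its parameters are hypothetical and only mirror the observed names; this is not the actual implementation.

```python
def current_output_names(base: str, chains: int = 4) -> dict:
    """Reproduce the observed naming: stdout uses a 0-based index,
    every other file uses the 1-based chain_id.

    Hypothetical helper for illustration only.
    """
    names = {}
    for idx in range(chains):
        chain_id = idx + 1  # chain ids start at 1
        names[chain_id] = {
            # stdout is indexed by the 0-based process index
            "stdout": f"{base}_{idx}-stdout.txt",
            # all other files are indexed by chain_id
            "csv": f"{base}_{chain_id}.csv",
            "diagnostic": f"{base}-diagnostic_{chain_id}.csv",
            "profile": f"{base}-profile_{chain_id}.csv",
        }
    return names


names = current_output_names("bernoulli-20251105152812")
print(names[1]["csv"])     # bernoulli-20251105152812_1.csv
print(names[1]["stdout"])  # bernoulli-20251105152812_0-stdout.txt
```

Chain 1's draws end up in _1.csv while its console output ends up in _0-stdout.txt, which is the mismatch described above.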
I think we should change the stdout naming so that the indexes align; this should be an easy fix.
The second point I want to raise is that we are inconsistent about where the index goes relative to the extra label in these file names. For stdout, we use _{idx}-stdout.txt, so the index comes first, but for profile and diagnostic, we use -(diagnostic|profile)_{idx}.csv, so the index comes second. I think we should align these so that all of our output files follow one naming pattern.
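As a starting point for discussion, here is a sketch of one possible aligned scheme, where every file uses the 1-based chain_id and the index always comes before the label. The function aligned_output_name, the kind argument, and the index-first ordering are all assumptions made for illustration; putting the label first for every file would be equally consistent.

```python
def aligned_output_name(base: str, chain_id: int, kind: str = "csv") -> str:
    """Build an output file name with the chain_id always in the same place.

    Hypothetical scheme: {base}_{chain_id}{suffix}, index first, label second.
    """
    suffixes = {
        "csv": ".csv",
        "stdout": "-stdout.txt",
        "diagnostic": "-diagnostic.csv",
        "profile": "-profile.csv",
    }
    if kind not in suffixes:
        raise ValueError(f"unknown output kind: {kind!r}")
    if chain_id < 1:
        raise ValueError("chain_id is 1-based")
    return f"{base}_{chain_id}{suffixes[kind]}"


base = "bernoulli-20251105152812"
for kind in ("csv", "stdout", "diagnostic", "profile"):
    print(aligned_output_name(base, 1, kind))
```

With a scheme like this, all files for a chain share the prefix {base}_{chain_id}, so they sort together and a glob such as bernoulli-20251105152812_1* picks up everything for chain 1.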
I welcome any thoughts from others. If we reach a consensus, I'll go ahead and implement it.