63 changes: 35 additions & 28 deletions cmdstanpy/model.py
Original file line number Diff line number Diff line change
Expand Up @@ -299,12 +299,10 @@ def optimize(
or to a temporary directory which is deleted upon session exit.

Output files are either written to a temporary directory or to the
specified output directory. Output filenames correspond to the template
'<model_name>-<YYYYMMDDHHMM>-<chain_id>' plus the file suffix which is
either '.csv' for the CmdStan output or '.txt' for
the console messages, e.g. 'bernoulli-201912081451-1.csv'.
Output files written to the temporary directory contain an additional
8-character random string, e.g. 'bernoulli-201912081451-1-5nm6as7u.csv'.
specified output directory. Optimize output filenames correspond to
the template '<model_name>-<YYYYMMDDHHMM>' plus the file suffix which is
either '.csv' for the CmdStan output or '_stdout.txt' for
the console messages, e.g. 'bernoulli-20251107142835.csv'.

:param data: Values for all data variables in the model, specified
either as a dictionary with entries matching the data variables,
Expand Down Expand Up @@ -339,7 +337,7 @@ def optimize(

:param save_profile: Whether or not to profile auto-diff operations in
labelled blocks of code. If ``True``, CSV outputs are written to
file '<model_name>-<YYYYMMDDHHMM>-profile-<chain_id>'.
file '<model_name>-<YYYYMMDDHHMM>_profile.csv'.
Introduced in CmdStan-2.26.

:param algorithm: Algorithm to use. One of: 'BFGS', 'LBFGS', 'Newton'
Expand Down Expand Up @@ -514,11 +512,15 @@ def sample(

Output files are either written to a temporary directory or to the
specified output directory. Output filenames correspond to the template
'<model_name>-<YYYYMMDDHHMM>-<chain_id>' plus the file suffix which is
either '.csv' for the CmdStan output or '.txt' for
the console messages, e.g. 'bernoulli-201912081451-1.csv'.
Output files written to the temporary directory contain an additional
8-character random string, e.g. 'bernoulli-201912081451-1-5nm6as7u.csv'.
'<model_name>-<YYYYMMDDHHMM>' plus additional bits to identify which
output file it corresponds to. CmdStan output files are suffixed with
'_<chain_id>.csv' if there is more than one chain, and simply '.csv'
in the single-chain case, e.g. 'bernoulli-20251107144515_1.csv'.
Console message output is written to a text file suffixed
`_stdout_<chain_id>.txt` if each chain executes in a separate process
(default behavior) or simply `_stdout.txt` if all chains run in a single
process, such as when STAN_THREADS is enabled and you are sampling
more than one chain.

:param data: Values for all data variables in the model, specified
either as a dictionary with entries matching the data variables,
Expand Down Expand Up @@ -651,14 +653,17 @@ def sample(
:param save_latent_dynamics: Whether or not to output the position and
momentum information for the model parameters (unconstrained).
If ``True``, CSV outputs are written to an output file
'<model_name>-<YYYYMMDDHHMM>-diagnostic-<chain_id>',
e.g. 'bernoulli-201912081451-diagnostic-1.csv', see
'<model_name>-<YYYYMMDDHHMM>_diagnostic_<chain_id>',
e.g. 'bernoulli-201912081451_diagnostic_1.csv', see
https://mc-stan.org/docs/cmdstan-guide/stan_csv.html,
section "Diagnostic CSV output file" for details.

:param save_profile: Whether or not to profile auto-diff operations in
labelled blocks of code. If ``True``, CSV outputs are written to
file '<model_name>-<YYYYMMDDHHMM>-profile-<chain_id>'.
file '<model_name>-<YYYYMMDDHHMM>_profile_<chain_id>.csv' if each
chain runs in its own process, otherwise
'<model_name>-<YYYYMMDDHHMM>_profile.csv' if all chains run in a
single process.
Introduced in CmdStan-2.26, see
https://mc-stan.org/docs/cmdstan-guide/stan_csv.html,
section "Profiling CSV output file" for details.
Expand Down Expand Up @@ -983,12 +988,16 @@ def generate_quantities(
or to a temporary directory which is deleted upon session exit.

Output files are either written to a temporary directory or to the
specified output directory. Output filenames correspond to the template
'<model_name>-<YYYYMMDDHHMM>-<chain_id>' plus the file suffix which is
either '.csv' for the CmdStan output or '.txt' for
the console messages, e.g. 'bernoulli-201912081451-1.csv'.
Output files written to the temporary directory contain an additional
8-character random string, e.g. 'bernoulli-201912081451-1-5nm6as7u.csv'.
specified output directory. Output filenames correspond to the template
'<model_name>-<YYYYMMDDHHMM>' plus additional bits to identify which
output file it corresponds to. CmdStan output files are suffixed with
'_<chain_id>.csv' if there is more than one chain, and simply '.csv'
in the single-chain case, e.g. 'bernoulli-20251107144515_1.csv'.
Console message output is written to a text file suffixed
`_stdout_<chain_id>.txt` if each chain executes in a separate process
(default behavior) or simply `_stdout.txt` if all chains run in a single
process, such as when STAN_THREADS is enabled and you are sampling
more than one chain.

:param data: Values for all data variables in the model, specified
either as a dictionary with entries matching the data variables,
Expand Down Expand Up @@ -1175,11 +1184,9 @@ def variational(

Output files are either written to a temporary directory or to the
specified output directory. Output filenames correspond to the template
'<model_name>-<YYYYMMDDHHMM>-<chain_id>' plus the file suffix which is
either '.csv' for the CmdStan output or '.txt' for
the console messages, e.g. 'bernoulli-201912081451-1.csv'.
Output files written to the temporary directory contain an additional
8-character random string, e.g. 'bernoulli-201912081451-1-5nm6as7u.csv'.
'<model_name>-<YYYYMMDDHHMM>' plus the file suffix which is
either '.csv' for the CmdStan output or '_stdout.txt' for
the console messages, e.g. 'bernoulli-201912081451.csv'.

:param data: Values for all data variables in the model, specified
either as a dictionary with entries matching the data variables,
Expand Down Expand Up @@ -1458,7 +1465,7 @@ def pathfinder(

:param save_profile: Whether or not to profile auto-diff operations in
labelled blocks of code. If ``True``, CSV outputs are written to
file '<model_name>-<YYYYMMDDHHMM>-profile-<path_id>'.
file '<model_name>-<YYYYMMDDHHMM>_profile.csv'.
Introduced in CmdStan-2.26, see
https://mc-stan.org/docs/cmdstan-guide/stan_csv.html,
section "Profiling CSV output file" for details.
Expand Down Expand Up @@ -1706,7 +1713,7 @@ def laplace_sample(

:param save_profile: Whether or not to profile auto-diff operations in
labelled blocks of code. If ``True``, CSV outputs are written to
file '<model_name>-<YYYYMMDDHHMM>-profile-<path_id>'.
file '<model_name>-<YYYYMMDDHHMM>_profile.csv'.
Introduced in CmdStan-2.26, see
https://mc-stan.org/docs/cmdstan-guide/stan_csv.html,
section "Profiling CSV output file" for details.
Expand Down
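Reviewer note: the suffix rules described in the updated `sample` and `generate_quantities` docstrings can be summarized in a few lines. The following is only a rough sketch, not code from this PR: `sample_output_names` is a hypothetical helper, and it assumes the single-chain case always uses the un-numbered `_stdout.txt` name, which is how the `RunSet` change in `runset.py` appears to behave.

```python
def sample_output_names(base, chain_ids, one_process_per_chain=True):
    """Return (csv_names, stdout_names) for a run, file names only.

    Hypothetical helper for illustration; `base` stands for
    '<model_name>-<timestamp>'.
    """
    multi = len(chain_ids) > 1

    # CmdStan CSV output: one file per chain; the chain id is only
    # appended when there is more than one chain.
    if multi:
        csv_names = [f"{base}_{cid}.csv" for cid in chain_ids]
    else:
        csv_names = [f"{base}.csv"]

    # Console output: one file per process, so the chain id is only
    # appended when each of several chains runs in its own process.
    if one_process_per_chain and multi:
        stdout_names = [f"{base}_stdout_{cid}.txt" for cid in chain_ids]
    else:
        stdout_names = [f"{base}_stdout.txt"]

    return csv_names, stdout_names


base = "bernoulli-20251107144515"
multi_csv, multi_out = sample_output_names(base, [1, 2])
single_csv, single_out = sample_output_names(base, [1])
thread_csv, thread_out = sample_output_names(
    base, [1, 2], one_process_per_chain=False
)
```

With `one_process_per_chain=False` (for example a STAN_THREADS build), the CSV files are still per chain while the console output collapses to a single file.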
5 changes: 1 addition & 4 deletions cmdstanpy/stanfit/gq.py
Original file line number Diff line number Diff line change
Expand Up @@ -705,10 +705,7 @@ def _previous_draws_pd(

def save_csvfiles(self, dir: str | None = None) -> None:
"""
Move output CSV files to specified directory. If files were
written to the temporary session directory, clean filename.
E.g., save 'bernoulli-201912081451-1-5nm6as7u.csv' as
'bernoulli-201912081451-1.csv'.
Move output CSV files to specified directory.

:param dir: directory path

Expand Down
5 changes: 1 addition & 4 deletions cmdstanpy/stanfit/laplace.py
Original file line number Diff line number Diff line change
Expand Up @@ -310,10 +310,7 @@ def column_names(self) -> tuple[str, ...]:

def save_csvfiles(self, dir: str | None = None) -> None:
"""
Move output CSV files to specified directory. If files were
written to the temporary session directory, clean filename.
E.g., save 'bernoulli-201912081451-1-5nm6as7u.csv' as
'bernoulli-201912081451-1.csv'.
Move output CSV files to specified directory.

:param dir: directory path

Expand Down
5 changes: 1 addition & 4 deletions cmdstanpy/stanfit/mcmc.py
Original file line number Diff line number Diff line change
Expand Up @@ -824,10 +824,7 @@ def method_variables(self) -> dict[str, np.ndarray]:

def save_csvfiles(self, dir: str | None = None) -> None:
"""
Move output CSV files to specified directory. If files were
written to the temporary session directory, clean filename.
E.g., save 'bernoulli-201912081451-1-5nm6as7u.csv' as
'bernoulli-201912081451-1.csv'.
Move output CSV files to specified directory.

:param dir: directory path

Expand Down
5 changes: 1 addition & 4 deletions cmdstanpy/stanfit/mle.py
Original file line number Diff line number Diff line change
Expand Up @@ -295,10 +295,7 @@ def stan_variables(

def save_csvfiles(self, dir: str | None = None) -> None:
"""
Move output CSV files to specified directory. If files were
written to the temporary session directory, clean filename.
E.g., save 'bernoulli-201912081451-1-5nm6as7u.csv' as
'bernoulli-201912081451-1.csv'.
Move output CSV files to specified directory.

:param dir: directory path

Expand Down
5 changes: 1 addition & 4 deletions cmdstanpy/stanfit/pathfinder.py
Original file line number Diff line number Diff line change
Expand Up @@ -216,10 +216,7 @@ def is_resampled(self) -> bool:

def save_csvfiles(self, dir: str | None = None) -> None:
"""
Move output CSV files to specified directory. If files were
written to the temporary session directory, clean filename.
E.g., save 'bernoulli-201912081451-1-5nm6as7u.csv' as
'bernoulli-201912081451-1.csv'.
Move output CSV files to specified directory.

:param dir: directory path

Expand Down
97 changes: 46 additions & 51 deletions cmdstanpy/stanfit/runset.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,62 +38,59 @@ def __init__(
self._args = args
self._chains = chains
self._one_process_per_chain = one_process_per_chain
if one_process_per_chain:
self._num_procs = chains
else:
self._num_procs = 1
self._num_procs = chains if one_process_per_chain else 1
self._retcodes = [-1 for _ in range(self._num_procs)]
self._timeout_flags = [False for _ in range(self._num_procs)]
if chain_ids is None:
chain_ids = [i + 1 for i in range(chains)]
self._chain_ids = chain_ids

if args.output_dir is not None:
self._output_dir = args.output_dir
else:
# make a per-run subdirectory of our master temp directory
self._output_dir = tempfile.mkdtemp(
prefix=args.model_name, dir=_TMPDIR
)
self._outdir = args.output_dir
else: # make a per-run subdirectory of our master temp directory
self._outdir = tempfile.mkdtemp(prefix=args.model_name, dir=_TMPDIR)

# output files prefix: ``<model_name>-<YYYYMMDDHHMM>_<chain_id>``
self._base_outfile = (
f'{args.model_name}-{datetime.now().strftime(time_fmt)}'
)
# per-process outputs
self._stdout_files = [''] * self._num_procs
self._profile_files = [''] * self._num_procs # optional
if one_process_per_chain:
for i in range(chains):
self._stdout_files[i] = self.file_path("-stdout.txt", id=i)
if args.save_profile:
self._profile_files[i] = self.file_path(
".csv", extra="-profile", id=chain_ids[i]
)
self._stdout_files, self._profile_files = [], []
self._csv_files, self._diagnostic_files = [], []

# per-process output files
if one_process_per_chain and chains > 1:
self._stdout_files = [
self.gen_file_name(".txt", extra="stdout", id=id)
for id in self._chain_ids
]
if args.save_profile:
self._profile_files = [
self.gen_file_name(".csv", extra="profile", id=id)
for id in self._chain_ids
]
else:
self._stdout_files[0] = self.file_path("-stdout.txt")
self._stdout_files = [self.gen_file_name(".txt", extra="stdout")]
if args.save_profile:
self._profile_files[0] = self.file_path(
".csv", extra="-profile"
)
self._profile_files = [
self.gen_file_name(".csv", extra="profile")
]

# per-chain output files
self._csv_files: list[str] = [''] * chains
self._diagnostic_files = [''] * chains # optional

if chains == 1:
self._csv_files[0] = self.file_path(".csv")
self._csv_files = [self.gen_file_name(".csv")]
if args.save_latent_dynamics:
self._diagnostic_files[0] = self.file_path(
".csv", extra="-diagnostic"
)
self._diagnostic_files = [
self.gen_file_name(".csv", extra="diagnostic")
]
else:
for i in range(chains):
self._csv_files[i] = self.file_path(".csv", id=chain_ids[i])
if args.save_latent_dynamics:
self._diagnostic_files[i] = self.file_path(
".csv", extra="-diagnostic", id=chain_ids[i]
)
self._csv_files = [
self.gen_file_name(".csv", id=id) for id in self._chain_ids
]
if args.save_latent_dynamics:
self._diagnostic_files = [
self.gen_file_name(".csv", extra="diagnostic", id=id)
for id in self._chain_ids
]

def __repr__(self) -> str:
repr = 'RunSet: chains={}, chain_ids={}, num_processes={}'.format(
Expand Down Expand Up @@ -173,14 +170,14 @@ def cmd(self, idx: int) -> list[str]:
else:
return self._args.compose_command(
idx,
csv_file=self.file_path('.csv'),
csv_file=self.gen_file_name('.csv'),
diagnostic_file=(
self.file_path(".csv", extra="-diagnostic")
self.gen_file_name(".csv", extra="diagnostic")
if self._args.save_latent_dynamics
else None
),
profile_file=(
self.file_path(".csv", extra="-profile")
self.gen_file_name(".csv", extra="profile")
if self._args.save_profile
else None
),
Expand All @@ -201,10 +198,7 @@ def stdout_files(self) -> list[str]:

def _check_retcodes(self) -> bool:
"""Returns ``True`` when all chains have retcode 0."""
for code in self._retcodes:
if code != 0:
return False
return True
return all(retcode == 0 for retcode in self._retcodes)

@property
def diagnostic_files(self) -> list[str]:
Expand All @@ -216,16 +210,17 @@ def profile_files(self) -> list[str]:
"""List of paths to CmdStan profiler files."""
return self._profile_files

# pylint: disable=invalid-name
def file_path(
def gen_file_name(
self, suffix: str, *, extra: str = "", id: int | None = None
) -> str:
"""Generate a standard file name according to CmdStan output pattern"""
file = self._base_outfile
if extra:
file += f"_{extra}"
if id is not None:
suffix = f"_{id}{suffix}"
file = os.path.join(
self._output_dir, f"{self._base_outfile}{extra}{suffix}"
)
return file
file += f"_{id}"
file += suffix
return os.path.join(self._outdir, file)

def _retcode(self, idx: int) -> int:
"""Get retcode for process[idx]."""
Expand Down
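Reviewer note: to check the new naming scheme without constructing a `RunSet`, the body of `gen_file_name` can be mirrored as a free function. This is a sketch under assumptions: the parameters `outdir` and `base` stand in for `self._outdir` and `self._base_outfile`, and `chain_id` replaces the `id` keyword to avoid shadowing the builtin. It is not part of the PR.

```python
import os


def gen_file_name(outdir, base, suffix, *, extra="", chain_id=None):
    """Standalone mirror of RunSet.gen_file_name as shown in the diff above."""
    name = base
    if extra:
        # e.g. "stdout", "profile" or "diagnostic"
        name += f"_{extra}"
    if chain_id is not None:
        name += f"_{chain_id}"
    name += suffix
    return os.path.join(outdir, name)


base = "bernoulli-20251107144515"
names = [
    os.path.basename(p)
    for p in (
        gen_file_name("out", base, ".csv"),
        gen_file_name("out", base, ".csv", chain_id=1),
        gen_file_name("out", base, ".txt", extra="stdout", chain_id=2),
        gen_file_name("out", base, ".csv", extra="profile"),
    )
]
```

The order of the pieces is base, then `extra`, then the chain id, then the suffix, so every separator after the timestamp is an underscore.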
5 changes: 1 addition & 4 deletions cmdstanpy/stanfit/vb.py
Original file line number Diff line number Diff line change
Expand Up @@ -249,10 +249,7 @@ def variational_sample_pd(self) -> pd.DataFrame:

def save_csvfiles(self, dir: str | None = None) -> None:
"""
Move output CSV files to specified directory. If files were
written to the temporary session directory, clean filename.
E.g., save 'bernoulli-201912081451-1-5nm6as7u.csv' as
'bernoulli-201912081451-1.csv'.
Move output CSV files to specified directory.

:param dir: directory path

Expand Down
7 changes: 4 additions & 3 deletions cmdstanpy_tutorial.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,7 @@
"CmdStanPy will use the following optional packages, if installed:\n",
"\n",
"* `xarray`, an n-dimension labeled dataset package which can be used for outputs\n",
"* `polars`, a highly-optimized data manipulation library, which can speed up processing outputs of large Stan models\n",
"\n",
"To install CmdStanPy with all the optional packages:\n",
"\n",
Expand Down Expand Up @@ -408,7 +409,7 @@
"hash": "d31ce8e45781476cfd394e192e0962028add96ff436d4fd4e560a347d206b9cb"
},
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
Expand All @@ -422,9 +423,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
"version": "3.10.19"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}
6 changes: 3 additions & 3 deletions docsrc/users-guide/outputs.rst
Original file line number Diff line number Diff line change
Expand Up @@ -8,9 +8,9 @@ CSV File Outputs

Underlyingly, the CmdStan outputs are a set of per-chain
`Stan CSV files <https://mc-stan.org/docs/cmdstan-guide/stan_csv_apdx.html#mcmc-sampler-csv-output>`__.
The filenames follow the template '<model_name>-<YYYYMMDDHHMMSS>-<chain_id>'
plus the file suffix '.csv'.
CmdStanPy also captures the per-chain console and error messages.
The filenames follow the template '<model_name>-<YYYYMMDDHHMMSS>_<chain_id>'
plus the file suffix '.csv'. CmdStanPy also captures the per-chain console and
error messages.

.. ipython:: python

Expand Down
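Reviewer note: downstream scripts that parsed the old hyphen-separated names will need updating. Below is a rough sketch of a parser for the new template. The regular expression and the `parse_output_name` helper are hypothetical and not part of CmdStanPy. The sketch assumes the timestamp has 12 or 14 digits, since both forms appear in the examples in this PR.

```python
import re

# Hypothetical pattern for '<model_name>-<timestamp>[_<kind>][_<chain_id>].<ext>'
OUTPUT_NAME = re.compile(
    r"(?P<model>.+)-(?P<stamp>\d{12}(?:\d{2})?)"
    r"(?:_(?P<kind>stdout|profile|diagnostic))?"
    r"(?:_(?P<chain>\d+))?"
    r"\.(?P<ext>csv|txt)"
)


def parse_output_name(filename):
    """Split an output file name into its parts, or return None if no match."""
    match = OUTPUT_NAME.fullmatch(filename)
    if match is None:
        return None
    parts = match.groupdict()
    # Convert the chain id to an int when present.
    if parts["chain"] is not None:
        parts["chain"] = int(parts["chain"])
    return parts


csv_parts = parse_output_name("bernoulli-20251107144515_1.csv")
diag_parts = parse_output_name("bernoulli-201912081451_diagnostic_1.csv")
stdout_parts = parse_output_name("bernoulli-20251107142835_stdout.txt")
old_style = parse_output_name("bernoulli-201912081451-1.csv")
```

Names in the pre-PR style, such as 'bernoulli-201912081451-1.csv', do not match this pattern and yield `None`.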
2 changes: 1 addition & 1 deletion test/test_generate_quantities.py
Original file line number Diff line number Diff line change
Expand Up @@ -533,7 +533,7 @@ def test_serialization() -> None:
fit1 = model.generate_quantities(data=jdata, previous_fit=fit_sampling)

dumped = pickle.dumps(fit1)
shutil.rmtree(fit1.runset._output_dir)
shutil.rmtree(fit1.runset._outdir)
fit2: CmdStanGQ[CmdStanMCMC] = pickle.loads(dumped)
variables1 = fit1.stan_variables()
variables2 = fit2.stan_variables()
Expand Down