2 changes: 1 addition & 1 deletion docs/content/punpy_digital_effects_table.rst
Original file line number Diff line number Diff line change
@@ -110,7 +110,7 @@ or Law of Propagation of Uncertainties (LPU) method (see Section :ref:`LPUMethod

gl = IdealGasLaw(prop=prop)

If no argument is provided for prop, a MCPropagation(100,parallel_cores=0) object is used.
If no argument is provided for prop, a MCPropagation(100,parallel_cores=1) object is used.
The next arguments are for providing the input quantity names and the measurand name and measurand unit respectively::

gl = IdealGasLaw(prop=prop, xvariables=["pressure", "temperature", "n_moles"], yvariable="volume", yunit="m^3")
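For context, the measurement function wrapped by `IdealGasLaw` presumably computes the volume from the ideal gas law. A minimal numpy-only sketch of such a function (the function name and argument names here are illustrative, not the punpy API):

```python
import numpy as np

R = 8.314  # molar gas constant in J/(mol K)

def ideal_gas_law(pressure, temperature, n_moles):
    # measurand: volume in m^3, from V = n R T / p
    return n_moles * R * temperature / pressure

volume = ideal_gas_law(
    pressure=np.array([1.0e5]),      # Pa
    temperature=np.array([300.0]),   # K
    n_moles=np.array([40.0]),        # mol
)
```

The input quantity names and the measurand name/unit passed to the class above map onto the arguments and return value of a function like this one.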
24 changes: 22 additions & 2 deletions docs/content/punpy_memory_and_speed.rst
@@ -137,10 +137,30 @@ In some cases, when there are multiple measurands with different shapes, it is n
In such cases, the `refyvar` keyword should be set to the index of the measurand with the most dimensions and the repeat_dims indexes should correspond to this measurand.


Processing the MC samples simultaneously
########################################
For simple measurement functions (e.g. those that only perform arithmetic operations such as +, -, *, /) on numpy arrays, the most computationally efficient way to propagate uncertainties is to
pass all the MC samples simultaneously to the measurement function. The measurement function thus takes as arguments numpy arrays consisting of all MC steps for a given
input quantity, and returns a numpy array with all the MC samples of the measurand. In order to use this method, set the optional `parallel_cores` keyword to 0.
By using this method we can benefit from the optimised array operations within numpy
(which use all available CPUs, and rely on C code instead of Python), which is significantly faster than any available alternative.
This method is recommended for any measurement function that allows it (i.e. if the measurement function can deal with the additional dimension in the input quantities, such that through numpy operations the measurand simply ends up with the same additional dimension).

When parallel_cores is set to 0, all iterations will be processed simultaneously and there will be an additional dimension for the MC iterations.
Generally within punpy, the MC dimension in the samples is the first one (i.e. internally as well as when MC samples are returned).
However, when processing all iterations simultaneously, in most cases it is more practical to have the MC dimension as the last dimension.
This is because we use numpy arrays and these are compatible when the last dimensions match following the `numpy broadcasting rules <https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules>`_.
So by default, the shape of the input quantities when using parallel_cores=0 will have the MC iterations as its last dimension.
However, in some cases it is more helpful to have the MC iterations as the first dimension.
If this is the case, the MC iteration dimension can be made the first dimension by setting the `MCdimlast` keyword to False::

prop = punpy.MCPropagation(1000,parallel_cores=0,MCdimlast=False)
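To illustrate why a trailing MC dimension broadcasts cleanly, here is a minimal numpy-only sketch (not the punpy API): samples of two input quantities with different shapes combine directly when the MC dimension is last, because numpy aligns shapes from the trailing dimension.

```python
import numpy as np

rng = np.random.default_rng(42)
mc = 1000  # number of MC iterations

# x1 is an array quantity of shape (25,); its MC sample has shape (25, mc).
x1_sample = 10.0 + 0.5 * rng.standard_normal((25, mc))

# x2 is a scalar quantity; its MC sample has shape (mc,).
x2_sample = 2.0 + 0.1 * rng.standard_normal(mc)

# With the MC dimension last, the two samples broadcast directly:
# (25, mc) * (mc,) -> (25, mc).  With the MC dimension first,
# (mc, 25) * (mc,) would fail to broadcast.
y_sample = x1_sample * x2_sample
assert y_sample.shape == (25, mc)
```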


Processing the MC samples in parallel
######################################
At the start of this section we already saw that the optional `parallel_cores` keyword can be used to run the MC
samples one-by-one through the measurement function rather than all at once as in the standard case. It is also possible
In the previous section we already saw that the optional `parallel_cores` keyword can be used to run the MC
samples all at once through the measurement function rather than one-by-one as in the standard case. It is also possible
to use the same keyword to enable parallel processing. Here, only the processing of the input quantities through the measurement
function is done in parallel. Generating the samples and calculating the covariance matrix etc. is still done as normal.
Punpy uses the multiprocessing module, which comes standard with your Python distribution.
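The pattern punpy uses internally is a pool `starmap` over the per-iteration argument tuples. The sketch below illustrates this shape of computation; it substitutes a thread pool for the process pool punpy actually uses, purely so the example stays self-contained and picklable (the function and variable names are illustrative):

```python
import numpy as np
# Thread-based stand-in for illustration; punpy itself uses the
# process-based multiprocessing.Pool.
from multiprocessing.pool import ThreadPool

def measurement_function(x1, x2):
    return x1 + x2

rng = np.random.default_rng(0)

# One (x1, x2) tuple per MC iteration, as starmap expects.
mc_x = [(rng.standard_normal(10), rng.standard_normal(10)) for _ in range(100)]

with ThreadPool(4) as pool:
    # Each MC iteration is run through the measurement function separately.
    mc_y = pool.starmap(measurement_function, mc_x)

assert len(mc_y) == 100
```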
33 changes: 12 additions & 21 deletions docs/content/punpy_standalone.rst
@@ -62,9 +62,9 @@ first create a prop object (object of punpy MCPropagation of LPUPropagation clas
prop=punpy.MCPropagation(10000) # Here the number is how
# many MC samples will be generated

# Or if you have a measurement function that does not accept
# Or if you have a measurement function that accepts
# higher dimensional arrays as argument:
prop=punpy.MCPropagation(10000,parallel_cores=1)
prop=punpy.MCPropagation(10000,parallel_cores=0)

# Alternatively it is possible to use LPU methods to
# propagate uncertainties
@@ -82,15 +82,18 @@ GUM (Guide to the Expression of Uncertainty in Measurement) by calculating the J
For the MC method, the number of MC samples that is used is set as the first argument when creating the MCPropagation object (see example above).
Two approaches can be followed to propagate the MC samples of the input quantities to the measurand, depending on whether the measurement function can be applied to numpy arrays of arbitrary size.

The most computationally efficient way is to pass an array consisting of all MC steps of an
input quantity instead of the input quantities themselves. Each of the input quantities will thus get an additional dimension in the MC sample.
By default, MC samples are processed one-by-one. This is equivalent to setting the optional
`parallel_cores` keyword to 1. However, some measurement functions allow to process all the MC samples simultaneously.
When this is possible, it is the most computationally efficient way to propagate MC uncertainties.
In this case, a numpy array consisting of all MC steps of an input quantity is passed to the measurement function
instead of the input quantities themselves. Each of the input quantities will thus get an additional MC dimension and these higher
dimensional input quantities are used as arguments to the measurement function, which will in turn return a measurand with this additional MC dimension.
If the measurement function can deal with these higher dimensional arrays by just performing numpy operations, this gives the most computationally efficient MC propagation.
This method is the default in punpy (which corresponds to setting the optional `parallel_cores` keyword to 0).
In order to use this method, the optional `parallel_cores` keyword is set to 0.

However, this is not the case for every measurement function. If the inputs to the measurement
function are less flexible, and don't support additional dimensions, it is possible to instead run the MC samples one by one.
In order to pass each MC sample individually to the measurement function, it is possible to set the optional
`parallel_cores` keyword to 1. In :ref:`punpy_memory_and_speed`, we will show how the same keyword can be used to do parallel processing for such measurement functions.
If the inputs to the measurement function are less flexible, and don't support additional dimensions,
one can either use the default option (`parallel_cores=1`; run the MC samples one by one) or use parallel processing (`parallel_cores>1`).
In :ref:`punpy_memory_and_speed`, we provide some more detail on processing the MC samples simultaneously or in parallel.
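The two processing modes are numerically equivalent for measurement functions built from elementwise numpy operations. A numpy-only sketch (not the punpy API) comparing the one-by-one loop to the all-at-once vectorised call:

```python
import numpy as np

def measurement_function(x1, x2):
    return x1 * x2  # simple elementwise function

rng = np.random.default_rng(1)
mc = 200
mc_x1 = rng.standard_normal((5, mc))  # MC dimension last
mc_x2 = rng.standard_normal((5, mc))

# parallel_cores=1 style: one MC sample at a time.
y_loop = np.stack(
    [measurement_function(mc_x1[:, i], mc_x2[:, i]) for i in range(mc)],
    axis=-1,
)

# parallel_cores=0 style: all MC samples at once via broadcasting.
y_all = measurement_function(mc_x1, mc_x2)

assert np.allclose(y_loop, y_all)
```

The vectorised call avoids Python-level loop overhead, which is where the speed difference comes from.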


For the LPU methods, the numdifftools package (used within comet_maths) is used to calculate the Jacobian. This package automatically determines the stepsize in the numerical
@@ -341,15 +344,3 @@ However, it is also possible to ignore all MC samples where any of the values ar
This can be done by setting the `allow_some_nans` keyword to False.
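The two filters boil down to an `np.any`-versus-`np.all` finiteness test per MC sample, as in this numpy-only sketch (the list of samples here is illustrative):

```python
import numpy as np

mc_y = [
    np.array([1.0, 2.0]),        # fully finite
    np.array([np.nan, 3.0]),     # partly finite
    np.array([np.nan, np.nan]),  # no finite values
]

# allow_some_nans=True: keep samples with at least one finite value.
keep_some = [y for y in mc_y if np.any(np.isfinite(y))]

# allow_some_nans=False: keep only fully finite samples.
keep_all = [y for y in mc_y if np.all(np.isfinite(y))]

assert len(keep_some) == 2 and len(keep_all) == 1
```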


Shape of input quantities within the measurement function
##########################################################
When setting parallel_cores to 1 or more, the shape of the input quantities used for each iteration in the measurement function matches the shape of the input quantities themselves.
However, when parallel_cores is set to 0, all iterations will be processed simultaneously and there will be an additional dimension for the MC iterations.
Generally within punpy, the MC dimension in the samples is the first one (i.e. internally as well as when MC samples are returned).
However, when processing all iterations simultaneously, in most cases it is more practical to have the MC dimension as the last dimension.
This is because we use numpy arrays and these are compatible when the last dimensions match following the `numpy broadcasting rules <https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules>`_.
So by default, the shape of the input quantities when using parallel_cores=0 will have the MC iterations as its last dimension.
However, in some cases it is more helpful to have the MC iterations as the first dimension.
If this is the case, the MC iteration dimension can be made the first dimension by setting the `MCdimlast` keyword to False::

prop = punpy.MCPropagation(1000,MCdimlast=False)
Binary file added propagate_ds_example.nc
Binary file not shown.
2 changes: 1 addition & 1 deletion punpy/_version.py
@@ -1 +1 @@
__version__ = "1.0.7"
__version__ = "1.1.0"
2 changes: 1 addition & 1 deletion punpy/lpu/lpu_propagation.py
@@ -30,7 +30,7 @@ class LPUPropagation:
:type verbose: bool
"""

def __init__(self, parallel_cores=0, Jx_diag=False, step=None, verbose=False):
def __init__(self, parallel_cores=1, Jx_diag=False, step=None, verbose=False):
self.parallel_cores = parallel_cores
self.Jx_diag = Jx_diag
self.step = step
169 changes: 91 additions & 78 deletions punpy/mc/mc_propagation.py
@@ -34,7 +34,7 @@ class MCPropagation:
"""

def __init__(
self, steps, parallel_cores=0, dtype=None, verbose=False, MCdimlast=True
self, steps, parallel_cores=1, dtype=None, verbose=False, MCdimlast=True
):
self.MCsteps = steps
self.parallel_cores = parallel_cores
@@ -1443,6 +1443,7 @@ def run_samples(
start=None,
end=None,
sli=None,
return_objects=False,
allow_some_nans=True,
):
"""
@@ -1490,100 +1491,112 @@
MC_y = self.pool.starmap(func, MC_x2)

if output_vars == 1:
if allow_some_nans:
if return_objects:
MC_y_out = np.array(
[
MC_y[i]
for i in range(len(indices))
if np.any(np.isfinite(MC_y[i]))
],
dtype=self.dtype,
[MC_y[i] for i in range(len(indices))],
dtype=object,
)
else:
MC_y_out = np.array(
[
MC_y[i]
for i in range(len(indices))
if np.all(np.isfinite(MC_y[i]))
],
dtype=self.dtype,
)
else:
if allow_some_nans:
try:
if allow_some_nans:
MC_y_out = np.array(
[
MC_y[i]
for i in range(len(indices))
if np.all(
[
np.any(np.isfinite(MC_y[i][ivar]))
for ivar in range(output_vars)
if (
hasattr(MC_y[i][ivar], "__len__")
and (len(MC_y[i][ivar]) > 0)
)
]
)
if np.any(np.isfinite(MC_y[i]))
],
dtype=self.dtype,
)
except:
MC_y_out = np.array(
[
MC_y[i]
for i in range(len(indices))
if np.all(
[
np.any(np.isfinite(MC_y[i][ivar]))
for ivar in range(output_vars)
if (
hasattr(MC_y[i][ivar], "__len__")
and (len(MC_y[i][ivar]) > 0)
)
]
)
],
dtype=object,
)

else:
try:
else:
MC_y_out = np.array(
[
MC_y[i]
for i in range(len(indices))
if np.all(
[
np.all(np.isfinite(MC_y[i][ivar]))
for ivar in range(output_vars)
if (
hasattr(MC_y[i][ivar], "__len__")
and (len(MC_y[i][ivar]) > 0)
)
]
)
if np.all(np.isfinite(MC_y[i]))
],
dtype=self.dtype,
)
except:
MC_y_out = np.array(
[
MC_y[i]
for i in range(len(indices))
if np.all(
[
np.all(np.isfinite(MC_y[i][ivar]))
for ivar in range(output_vars)
if (
hasattr(MC_y[i][ivar], "__len__")
and (len(MC_y[i][ivar]) > 0)
)
]
)
],
dtype=object,
)
else:
if return_objects:
MC_y_out = np.array(
[MC_y[i] for i in range(len(indices))],
dtype=object,
)
else:
if allow_some_nans:
try:
MC_y_out = np.array(
[
MC_y[i]
for i in range(len(indices))
if np.all(
[
np.any(np.isfinite(MC_y[i][ivar]))
for ivar in range(output_vars)
if (
hasattr(MC_y[i][ivar], "__len__")
and (len(MC_y[i][ivar]) > 0)
)
]
)
],
dtype=self.dtype,
)
except:
MC_y_out = np.array(
[
MC_y[i]
for i in range(len(indices))
if np.all(
[
np.any(np.isfinite(MC_y[i][ivar]))
for ivar in range(output_vars)
if (
hasattr(MC_y[i][ivar], "__len__")
and (len(MC_y[i][ivar]) > 0)
)
]
)
],
dtype=object,
)

else:
try:
MC_y_out = np.array(
[
MC_y[i]
for i in range(len(indices))
if np.all(
[
np.all(np.isfinite(MC_y[i][ivar]))
for ivar in range(output_vars)
if (
hasattr(MC_y[i][ivar], "__len__")
and (len(MC_y[i][ivar]) > 0)
)
]
)
],
dtype=self.dtype,
)
except:
MC_y_out = np.array(
[
MC_y[i]
for i in range(len(indices))
if np.all(
[
np.all(np.isfinite(MC_y[i][ivar]))
for ivar in range(output_vars)
if (
hasattr(MC_y[i][ivar], "__len__")
and (len(MC_y[i][ivar]) > 0)
)
]
)
],
dtype=object,
)

if len(MC_y_out) < len(indices):
if allow_some_nans:
5 changes: 5 additions & 0 deletions punpy/mc/tests/test_mc_propagation.py
@@ -763,6 +763,11 @@ def test_separate_processing(self):
MC_x = prop.generate_MC_sample(xsd, xerrsd, corrd)
MC_y1 = prop.run_samples(functiond, MC_x, output_vars=2, start=0, end=10000)
MC_y2 = prop.run_samples(functiond, MC_x, output_vars=2, start=10000, end=20000)
MC_y3 = prop.run_samples(
functiond, MC_x, output_vars=2, start=0, end=10000, return_objects=True
)
assert MC_y3.dtype == object

MC_y = prop.combine_samples([MC_y1, MC_y2])

ufd, ucorrd, corr_out = prop.process_samples(