Add Wolfe line search to Laplace approximation #3250
Open: SteveBronder wants to merge 99 commits into develop from fix/wolfe-zoom1
+1,054,089 −2,016
Commits (99)
43c3eef  update the laplace line search to use a more advanced wolfe line sear… (SteveBronder)
74a92bc  Merge remote-tracking branch 'origin/develop' into fix/laplace-wolfe-… (SteveBronder)
2de7a4f  add data for roach data test (SteveBronder)
547288e  update tests (SteveBronder)
cb5e282  move wolfe to its own file (SteveBronder)
cdaf700  Merge remote-tracking branch 'origin' into fix/laplace-wolfe-line-search (SteveBronder)
6b22c85  update wolfe (SteveBronder)
a1d0906  Merge remote-tracking branch 'origin' into fix/laplace-wolfe-line-search (SteveBronder)
5b6ffff  update to use barzilai borwein step size as initial step size estimate (SteveBronder)
8eff766  seperate moto from other lpdf tests (SteveBronder)
f542cc5  update (SteveBronder)
c845944  add WolfeInfo (SteveBronder)
40f1243  use WolfeInfo for extra data (SteveBronder)
6e528d2  put everything for iterations in laplace into structs (SteveBronder)
d89eeb5  update poisson test (SteveBronder)
40d889f  add swap functions (SteveBronder)
59b7a2f  cleanup laplace_density_est to reduce repeated code (SteveBronder)
c73f5aa  update to search for a good initial alpha on a space (SteveBronder)
773d417  fix code for wolfe line search (SteveBronder)
b557dad  update tests for zoom (SteveBronder)
b18bf87  all tests pass for laplace with new wolfe (SteveBronder)
98df588  use log sum of diagonal of U matrix for solver 3 determinant (SteveBronder)
929dd47  move update_step to be a user passed function (SteveBronder)
2ebb01a  cleanup the laplace code to remove some passed by reference values to… (SteveBronder)
3bbcef3  cleanup the laplace code to remove some passed by reference values to… (SteveBronder)
66ffec9  update WolfeData with member accessors and use Eval within WolfeData (SteveBronder)
ff5bee4  update docs for wolfe (SteveBronder)
7a7415a  update logic in laplace_marginal_desntiy_est so that final updated va… (SteveBronder)
973144a  clang format (SteveBronder)
cc5d49a  change stepsize of finite difference to use 6th order instead of 2nd … (SteveBronder)
d759fdd  change moto test gradient relative error (SteveBronder)
7b4e3a1  update wolfe tests (SteveBronder)
dfba08b  allow user to set max number of line search iterations. (SteveBronder)
7df0ed1  remove extra copy is laplace_likelihood::theta_grad (SteveBronder)
0c92732  cleanup and doc (SteveBronder)
d19ee8b  clang format (SteveBronder)
82e43da  Merge commit '5e698970d52ee9cfe85630cca9794e43ec829cf2' into HEAD (yashikno)
22a2210  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
24e2e19  update finit diff back to original (SteveBronder)
7720c7a  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
63e1700  fix finite_diff_stepsize and lower tolerance for AD tests on inv_Phi,… (SteveBronder)
a9f17d4  Merge commit 'b82d68ced2e73c8188f3bbf287c1321033103986' into HEAD (yashikno)
28c44dd  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
fddf54f  cpplint fixes (SteveBronder)
113e2b1  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
88a8950  fix doxygen docs (SteveBronder)
c4fcba2  Merge remote-tracking branch 'refs/remotes/origin/fix/wolfe-zoom1' in… (SteveBronder)
521145f  update moto tests to not take the gradient with respect to y as some … (SteveBronder)
d648ee0  update laplace_latent_tol_bernoulli_logit_rng user options orderings.… (SteveBronder)
7778307  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
04b5b2e  handle NA values for obj and grad. Allow for zero line search. allow … (SteveBronder)
4117a31  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
affabfa  update step iter (SteveBronder)
a143355  cleanup zoom in wolfe line search (SteveBronder)
95c21d5  Merge commit '85c147ee6adbe58eb9ae1578c0478fcf3da9bf76' into HEAD (yashikno)
5ba7426  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
475c632  update poisson_log_lpmf test to use google test parameterization (SteveBronder)
307fb0c  add throw testing in neg_binomial_log_summary (SteveBronder)
5038198  breakout the laplace tests so they print nicely (SteveBronder)
7e9af37  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
863223e  fix cpplint (SteveBronder)
04197f2  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
c8a1613  address partial review comments (SteveBronder)
aeb1662  Merge commit 'a5f80224b857e06dd7ca753d826e5b292ee8e73c' into HEAD (yashikno)
ea9ffe0  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
7f7bbb2  update refactor for laplace_marginal_density_est (SteveBronder)
66e8470  update with initial shim (SteveBronder)
6acdd09  update with initial shim (SteveBronder)
9dc118e  update with initial shim (SteveBronder)
b9a493a  update with initial shim (SteveBronder)
7c886f5  update with initial shim (SteveBronder)
5eb664b  update with initial shim (SteveBronder)
1f7ff3c  update with initial shim (SteveBronder)
65914d3  update with initial shim (SteveBronder)
6b74be8  update with initial shim (SteveBronder)
e9cad2e  update (SteveBronder)
db4f677  Merge commit 'a6932f2a8ff8d1800c7020d29593b5934a20e40b' into HEAD (yashikno)
aa73b5d  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
cf151fa  remove extra comments (SteveBronder)
0022b41  Moves many of the functions built for laplace into seperate files for… (SteveBronder)
0e1b7f4  Merge commit '52f7e9244d5367b74e4b4274d2a1c0e9f15a23f9' into HEAD (yashikno)
d26626f  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
a2a0a99  add initializers to solvers for laplace (SteveBronder)
a4f0b7d  fix docs for laplace_marginal_density_est (SteveBronder)
4c884dd  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
e6b2d74  minor cleanup (SteveBronder)
27d2fc9  minor cleanup (SteveBronder)
f845a43  cleanup retry logic on NaN of Inf objective function values (SteveBronder)
200fd9a  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
d6859f1  adds fallthrough option in user opts. User ops are now a tuple the us… (SteveBronder)
d2de83d  Merge remote-tracking branch 'refs/remotes/origin/fix/wolfe-zoom1' in… (SteveBronder)
7f1a995  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
d86fa69  update docs (SteveBronder)
6abd9da  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
b30823d  remove .csv files for .hpp files (SteveBronder)
dd1e74e  Merge remote-tracking branch 'refs/remotes/origin/fix/wolfe-zoom1' in… (SteveBronder)
b292cae  [Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1 (stan-buildbot)
ed3ce3a  make non-static so it's memory is allocated dynamically. (SteveBronder)
d9319ad  remove rev/core/set_zero_adjoints.hpp (SteveBronder)
@@ -0,0 +1,124 @@
#ifndef STAN_MATH_MIX_FUNCTOR_BARZILAI_BORWEIN_STEP_SIZE_HPP
#define STAN_MATH_MIX_FUNCTOR_BARZILAI_BORWEIN_STEP_SIZE_HPP
#include <stan/math/prim/fun/Eigen.hpp>
#include <algorithm>
#include <numeric>
#include <cmath>

namespace stan::math::internal {
/**
 * @brief Curvature-aware Barzilai–Borwein (BB) step length with robust
 * safeguards.
 *
 * Given successive parameter displacements \f$s = x_k - x_{k-1}\f$ and
 * gradients \f$g_k\f$, \f$g_{k-1}\f$, this routine forms
 * \f$y = g_k - g_{k-1}\f$ and computes the two classical BB candidates
 *
 * \f{align*}{
 *   \alpha_{\text{BB1}} &= \frac{\langle s,s\rangle}{\langle s,y\rangle},\\
 *   \alpha_{\text{BB2}} &= \frac{\langle s,y\rangle}{\langle y,y\rangle},
 * \f}
 *
 * then chooses between them using the **spectral cosine**
 * \f$r = \cos^2\!\angle(s,y)
 *      = \dfrac{\langle s,y\rangle^2}{\langle s,s\rangle\,\langle y,y\rangle}
 *      \in[0,1]\f$:
 *
 * - if \f$r > 0.9\f$ (well-aligned curvature) and the previous line search
 *   did **≤ 1** backtrack, prefer the “long” step \f$\alpha_{\text{BB1}}\f$;
 * - if \f$0.1 \le r \le 0.9\f$, take the neutral geometric mean
 *   \f$\sqrt{\alpha_{\text{BB1}}\alpha_{\text{BB2}}}\f$;
 * - otherwise default to the “short” step \f$\alpha_{\text{BB2}}\f$.
 *
 * All candidates are clamped into \f$[\text{min\_alpha},\,\text{max\_alpha}]\f$
 * and must be finite and positive.
 * If the curvature scalars are ill-posed (non-finite or too small),
 * \f$\langle s,y\rangle \le \varepsilon\f$, or if `last_backtracks == 99`
 * (explicitly disabling BB for this iteration), the function falls back to a
 * **safe** step: use `prev_step` when finite and positive, otherwise
 * \f$1.0\f$, then clamp to \f$[\text{min\_alpha},\,\text{max\_alpha}]\f$.
 *
 * @param s Displacement between consecutive iterates
 *          (\f$s = x_k - x_{k-1}\f$).
 * @param g_curr Gradient at the current iterate \f$g_k\f$.
 * @param g_prev Gradient at the previous iterate \f$g_{k-1}\f$.
 * @param prev_step Previously accepted step length (used by the fallback).
 * @param last_backtracks
 *        Number of backtracking contractions performed by the most
 *        recent line search; set to 99 to force the safe fallback.
 * @param min_alpha Lower bound for the returned step length.
 * @param max_alpha Upper bound for the returned step length.
 *
 * @return A finite, positive BB-style step length \f$\alpha \in
 *         [\text{min\_alpha},\,\text{max\_alpha}]\f$ suitable for seeding a
 *         line search or as a spectral preconditioner scale.
 *
 * @note Uses \f$\varepsilon=10^{-16}\f$ to guard against division by very
 *       small curvature terms, and applies `std::abs` to BB ratios to avoid
 *       negative steps; descent is enforced by the line search.
 * @warning The vectors must have identical size. Non-finite inputs yield the
 *          safe fallback.
 */
inline double barzilai_borwein_step_size(const Eigen::VectorXd& s,
                                         const Eigen::VectorXd& g_curr,
                                         const Eigen::VectorXd& g_prev,
                                         double prev_step, int last_backtracks,
                                         double min_alpha, double max_alpha) {
  // Fallbacks
  auto safe_fallback = [&]() -> double {
    double a = std::clamp(
        prev_step > 0.0 && std::isfinite(prev_step) ? prev_step : 1.0,
        min_alpha, max_alpha);
    return a;
  };

  const Eigen::VectorXd y = g_curr - g_prev;
  const double sty = s.dot(y);
  const double sts = s.squaredNorm();
  const double yty = y.squaredNorm();

  // Basic validity checks
  constexpr double eps = 1e-16;
  if (!(std::isfinite(sty) && std::isfinite(sts) && std::isfinite(yty))
      || sts <= eps || yty <= eps || sty <= eps || last_backtracks == 99) {
    return safe_fallback();
  }

  // BB candidates
  double alpha_bb1 = std::clamp(std::abs(sts / sty), min_alpha, max_alpha);
  double alpha_bb2 = std::clamp(std::abs(sty / yty), min_alpha, max_alpha);

  // Safeguard candidates
  if (!std::isfinite(alpha_bb1) || !std::isfinite(alpha_bb2) || alpha_bb1 <= 0.0
      || alpha_bb2 <= 0.0) {
    return safe_fallback();
  }

  // Spectral cosine r = cos^2(angle(s, y)) in [0,1]
  const double r = (sty * sty) / (sts * yty);

  // Heuristic thresholds (robust defaults)
  constexpr double kLoose = 0.9;  // "nice" curvature
  constexpr double kTight = 0.1;  // "dodgy" curvature

  double alpha0 = alpha_bb2;  // default to short BB for robustness
  if (r > kLoose && last_backtracks <= 1) {
    // Spectrum looks friendly and line search was not harsh -> try long BB
    alpha0 = alpha_bb1;
  } else if (r >= kTight && r <= kLoose) {
    // Neither clearly friendly nor clearly dodgy -> neutral middle
    alpha0 = std::sqrt(alpha_bb1 * alpha_bb2);
  }  // else keep alpha_bb2

  // Clip to user bounds
  alpha0 = std::clamp(alpha0, min_alpha, max_alpha);

  if (!std::isfinite(alpha0) || alpha0 <= 0.0) {
    return safe_fallback();
  }
  return alpha0;
}

}  // namespace stan::math::internal
#endif
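The BB seed above is easiest to see on a tiny quadratic. The following stand-alone driver is only an illustrative sketch, not code from this PR: the objective, the call site, the step bounds, and the include path (inferred from the header's include guard) are all assumptions.

// Hypothetical usage sketch; not part of the PR. The include path below is
// inferred from the header's include guard and may differ in the actual tree.
#include <stan/math/mix/functor/barzilai_borwein_step_size.hpp>
#include <iostream>

int main() {
  // Quadratic objective f(x) = 0.5 * x' D x with D = diag(1, 10),
  // so grad f(x) = D x.
  Eigen::VectorXd d(2);
  d << 1.0, 10.0;
  Eigen::VectorXd x_prev(2), x_curr(2);
  x_prev << 1.0, 1.0;  // previous iterate
  x_curr << 0.9, 0.5;  // last accepted iterate
  const Eigen::VectorXd g_prev = d.cwiseProduct(x_prev);
  const Eigen::VectorXd g_curr = d.cwiseProduct(x_curr);
  const Eigen::VectorXd s = x_curr - x_prev;

  // Seed the next line search with a BB step clamped to [1e-4, 10]. Here s
  // and y are well aligned (r is about 0.97 > 0.9) and the previous search
  // did not backtrack, so the "long" BB1 step s's / s'y = 0.26 / 2.51, which
  // is roughly 0.104, is chosen.
  const double alpha0 = stan::math::internal::barzilai_borwein_step_size(
      s, g_curr, g_prev, /*prev_step=*/1.0, /*last_backtracks=*/0,
      /*min_alpha=*/1e-4, /*max_alpha=*/10.0);
  std::cout << "initial step length: " << alpha0 << "\n";
  return 0;
}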
@@ -0,0 +1,106 @@
#ifndef STAN_MATH_MIX_FUNCTOR_CONDITIONAL_COPY_AND_PROMOTE_HPP
#define STAN_MATH_MIX_FUNCTOR_CONDITIONAL_COPY_AND_PROMOTE_HPP

#include <stan/math/mix/functor/hessian_block_diag.hpp>
#include <stan/math/prim/functor.hpp>
#include <stan/math/prim/fun.hpp>

namespace stan::math::internal {

/**
 * Decide if object should be deep or shallow copied when
 * using @ref conditional_copy_and_promote .
 */
enum class COPY_TYPE : uint8_t { SHALLOW = 0, DEEP = 1 };

/**
 * Conditional copy and promote a type's scalar type to a `PromotedType`.
 * @tparam Filter type trait with a static constexpr bool member `value`
 *   that is true if the type should be promoted. Otherwise, the type is
 *   left unchanged.
 * @tparam PromotedType type to promote the scalar to.
 * @tparam CopyType type of copy to perform.
 * @tparam Args variadic arguments.
 * @param args variadic arguments to conditionally copy and promote.
 * @return a tuple where each element is either a reference to the original
 *   argument or a promoted copy of the argument.
 */
template <template <typename...> class Filter,
          typename PromotedType = stan::math::var,
          COPY_TYPE CopyType = COPY_TYPE::DEEP, typename... Args>
inline auto conditional_copy_and_promote(Args&&... args) {
  return map_if<Filter>(
      [](auto&& arg) {
        if constexpr (is_tuple_v<decltype(arg)>) {
          return stan::math::apply(
              [](auto&&... inner_args) {
                return make_holder_tuple(
                    conditional_copy_and_promote<Filter, PromotedType,
                                                 CopyType>(
                        std::forward<decltype(inner_args)>(inner_args))...);
              },
              std::forward<decltype(arg)>(arg));
        } else if constexpr (is_std_vector_v<decltype(arg)>) {
          std::vector<decltype(conditional_copy_and_promote<
                               Filter, PromotedType, CopyType>(arg[0]))>
              ret;
          for (std::size_t i = 0; i < arg.size(); ++i) {
            ret.push_back(
                conditional_copy_and_promote<Filter, PromotedType, CopyType>(
                    arg[i]));
          }
          return ret;
        } else {
          if constexpr (CopyType == COPY_TYPE::DEEP) {
            return stan::math::eval(promote_scalar<PromotedType>(
                value_of_rec(std::forward<decltype(arg)>(arg))));
          } else if (CopyType == COPY_TYPE::SHALLOW) {
            if constexpr (std::is_same_v<PromotedType,
                                         scalar_type_t<decltype(arg)>>) {
              return std::forward<decltype(arg)>(arg);
            } else {
              return stan::math::eval(promote_scalar<PromotedType>(
                  std::forward<decltype(arg)>(arg)));
            }
          }
        }
      },
      std::forward<Args>(args)...);
}

/**
 * Conditional deep copy types with a `var` scalar type to `PromotedType`.
 * @tparam PromotedType type to promote the scalar to.
 * @tparam Args variadic arguments.
 * @param args variadic arguments to conditionally copy and promote.
 * @return a tuple where each element is either a reference to the original
 *   argument or a promoted copy of the argument.
 */
template <typename PromotedType, typename... Args>
inline auto deep_copy_vargs(Args&&... args) {
  return conditional_copy_and_promote<is_any_var_scalar, PromotedType,
                                      COPY_TYPE::DEEP>(
      std::forward<Args>(args)...);
}

/**
 * Conditional shallow copy types with a `var` scalar type to `PromotedType`.
 * @note This function is useful whenever you are inside of nested autodiff
 *   and want to allow the input arguments from an outer autodiff to be used
 *   in an inner autodiff without making a hard copy of the input arguments.
 * @tparam PromotedType type to promote the scalar to.
 * @tparam Args variadic arguments.
 * @param args variadic arguments to conditionally copy and promote.
 * @return a tuple where each element is either a reference to the original
 *   argument or a promoted copy of the argument.
 */
template <typename PromotedType, typename... Args>
inline auto shallow_copy_vargs(Args&&... args) {
  return conditional_copy_and_promote<is_any_var_scalar, PromotedType,
                                      COPY_TYPE::SHALLOW>(
      std::forward<Args>(args)...);
}

}  // namespace stan::math::internal

#endif
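To make the deep/shallow distinction concrete, here is a minimal, hypothetical sketch of the nested-autodiff use case the @note above describes. It is not code from this PR: the objective is made up, and it assumes the new header is reachable through <stan/math/mix.hpp>.

// Hypothetical sketch, not from the PR: deep-copy outer var arguments onto an
// inner nested tape so the inner gradient sweep leaves the outer tape alone.
// Assumes the conditional_copy_and_promote header is pulled in by mix.hpp.
#include <stan/math/mix.hpp>
#include <iostream>

int main() {
  using stan::math::var;

  Eigen::Matrix<var, Eigen::Dynamic, 1> theta(2);  // outer-level parameters
  theta << 1.0, 2.0;
  Eigen::VectorXd x(2);  // plain data; the var filter leaves it untouched
  x << 0.5, -0.5;

  {
    stan::math::nested_rev_autodiff nested;  // start an inner tape
    // DEEP copy: theta's values are re-created as fresh vars on the inner
    // tape; x does not match is_any_var_scalar and passes through unchanged.
    auto args = stan::math::internal::deep_copy_vargs<var>(theta, x);
    var inner_obj = stan::math::apply(
        [](auto&& th, auto&& xd) { return stan::math::dot_product(th, xd); },
        args);
    inner_obj.grad();  // sweeps only the inner tape
  }  // inner tape is recovered here

  // The outer parameters were never part of the inner computation, so their
  // adjoints are still zero.
  std::cout << "theta(0) adjoint: " << theta(0).adj() << "\n";
  return 0;
}

A shallow copy (shallow_copy_vargs) would instead forward var arguments that already have the promoted scalar type, which avoids the extra allocation when the inner computation is allowed to touch the outer tape.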
q: should the `internal` namespace end here? deep/shallow copy look much more like normal functions than `conditional_copy_and_promote` does
It's only used in other internal functions, so I think it is better to have it in `internal`.