ad embedded expression tests to expect_ad testing framework#2837
ad embedded expression tests to expect_ad testing framework#2837SteveBronder wants to merge 83 commits intodevelopfrom
Conversation
…ev/math into feature/inbedded-expression-tests
…ndled by the ad testing framework
…rmal version is executed
…rmal version is executed
Yeah, I just wish if we were changing this time in increments of 2 hours it would be going the other way. I think a shorter feedback loop would be dramatically beneficial for contributors, even if it was still the kind of thing where you only got 2 tries in a work day (versus the current ~1 try). If we're willing to just throw up our hands and say that will never happen, then yes, two hours here or there doesn't matter that much until we start saturating the build machines
We could also have a flag (kind of like the current |
I think if we want this as a goal what we could do is have a "quick mode" that is the default when a PR is opened that reads the diff, looks at the name of the files changed, then greps and only runs the tests with the same name in the test folder. And turn off all upstream testing. Then after an initial review we fall over to "test everything mode" which runs overnight. Or alt we could have the tests and headers only include strictly what they need. Then we can kind of do the same thing as above but with a more broad scope of what tests need to be run. We used to have a setup like that but header errors were extremely annoying so we opted for the "include everything" scheme.
Is there a way to tell github / jenkins "after this PR is approved run the tests one more time with this new flag and do not allow a merge until all tests pass"? That would be a pretty easy / automated way to turn this on and off |
That is already in place, sort of. It just continues on to run everything else after those initial tests. It does make some subset of possible errors fail-fast, which is nice. Being much stricter about including what we use would let us run exactly the correct subset of tests, which would be nice. @syclik and I have talked about this a couple times, especially for the distribution tests which also take a huge chunk of time to run.
I know a lot of projects use a merge bot for things like this. When a approving review is submitted, rather than click merge themselves they leave a comment like |
|
Just catching up. Given what I've read, I think it's sensible to do two things:
How do we get this moving forward? The loss of two hours of wait time to prevent upstream failures is worth it, I think. |
I just wanted to make sure it was clear: these upstream failures are in PRs, not in anything user-facing. So it prevents some developer frustration and churn here in that situation. The same thing could be prevented if the user was told anywhere how to preemptively run the tests, which is what the discussion in #2736 was about |
|
Needing to compile all of The |
|
It would be possible, but I don't think that is an easy change in the current test infrastructure, since Care would need to be taken to make sure the output was still readable from this |
I was thinking of the steps:
That way the compilation is still parallelised, but each call is only building and immediately running one test. Whether or not this would be particularly 'clean', I'm not sure |
|
I think something of that flavor could work as long as anything shared between tests (like TBB or Sundials) was already built, otherwise you'd get filesystem races. Worth exploring things like that. |
…s' into feature/inbedded-expression-tests
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
|
@WardBrian do you have any idea what is going on with the benchmark tests failing? I'm not seeing an error in the logs? |
|
It looks like the failure is coming from Seems weird? |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
|
@WardBrian I think this is good to merge after the release is finished |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
Summary
This adds a simple version of the expression testing framework directly into the call to
expect_ad(). This helps fix #2736 by making sure any function that goes through the expression tests will also go check that it is safe to pass Eigen expressions to.The files for all of this are in
test/unit/math/expr_tests.hppwith the main function beingcheck_expr_test(...). This function takes in any set of arguments and for each type that is an Eigen matrix or vector it checks if the function is 'safe' to pass expressions to. The word 'safe' in this context means that, if an unary expression was passed as an argument to the function, is it only called as many times as there are individual elements in the underlying Eigen object? The original Eigen object is wrapped in an unaryFunction that calls thecounterOpfunctor from the original expression testing framework.Tests
Tests can be run with
The tests include examples for one, two, and three inputs to functions. They check that for each combination of inputs we fail the correct number of Eigen objects passed to each function. Each set of inputs also has an associated test that just runs a 'good' funtion which does not fail the tests.
Side Effects
This will increase compile times a bit
Release notes
Adds a simple expression tests framework to the autodiff test framework
Checklist
Math issue Document Stan Math expressions requirements #2736
Copyright holder: Steve Bronder
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
the basic tests are passing
./runTests.py test/unit)make test-headers)make test-math-dependencies)make doxygen)make cpplint)the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested