feat(autogram): Remove batched optimizations #470
Merged
ValerianRey merged 8 commits into dev-new-engine on Oct 23, 2025
Conversation
* Remove FunctionalJacobianComputer.
* Remove args and kwargs from the interfaces of JacobianComputer, GramianComputer and JacobianAccumulator, because they were only needed for the functional interface.
* Remove kwargs from the interface of Hook and stop registering it with with_kwargs=True (args are still mandatory, so rename them as _).
* Change JacobianComputer to compute generalized Jacobians (shape [m0, ..., mk, n]) and change GramianComputer to compute optional generalized Gramians (shape [m0, ..., mk, mk, ..., m0]).
* Change engine.compute_gramian to always do one vmap level per dimension of the output, without caring about the batch_dim (see the sketch after this list).
* Remove all reshapes and movedims in engine.compute_gramian: we don't need reshape anymore since the Gramian is directly a generalized Gramian, and we don't need movedim anymore since we vmap over all dimensions the same way, without having to put the non-batched dim in front. Merge compute_gramian and _compute_square_gramian.
* Use a DiagonalSparseTensor as the initial jac_output of compute_gramian.
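For illustration, here is a minimal sketch of the shape convention described above. It is not the engine's actual code: the `generalized_jacobian` helper, the toy `fn`, and the identity-like seed are assumptions, used only to show one vmap level per output dimension and the resulting generalized Jacobian and Gramian shapes.

```python
import math

import torch
from torch.func import vjp, vmap


def generalized_jacobian(fn, params, output_shape):
    """Toy sketch: generalized Jacobian of fn(params), with shape [*output_shape, n]."""
    # Identity-like seed over the flattened output, viewed with the output
    # dimensions duplicated, so each leading index selects one basis cotangent.
    numel = math.prod(output_shape)
    jac_output = torch.eye(numel).view(*output_shape, *output_shape)

    _, vjp_fn = vjp(fn, params)

    def pullback(cotangent):
        return vjp_fn(cotangent)[0]

    # One vmap level per dimension of the output, regardless of any batch_dim.
    for _ in output_shape:
        pullback = vmap(pullback)
    return pullback(jac_output)


# Example: output of shape [m0, m1] = [3, 4], n = 5 parameters.
params = torch.randn(5)
W = torch.randn(3, 4, 5)
jac = generalized_jacobian(lambda p: torch.einsum("ijk,k->ij", W, p), params, (3, 4))

# Generalized Gramian: contract the parameter dimension n,
# giving shape [m0, m1, m1, m0] = [3, 4, 4, 3].
gramian = torch.einsum("ijn,kln->ijlk", jac, jac)
assert jac.shape == (3, 4, 5) and gramian.shape == (3, 4, 4, 3)
```

With this convention the Gramian already comes out in generalized form, which is why no reshape or movedim is needed.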
Codecov Report ✅ All modified and coverable lines are covered by tests.
ValerianRey (Contributor, Author) commented on Oct 23, 2025:
This PR is a prerequisite for being able to use DiagonalSparseTensors. It greatly simplifies the engine, making all the necessary changes so that the optimization is now entirely about what type of tensor we give as jac_output. So in a future PR (after #466 is merged), we will be able to simply change `jac_output = _make_initial_jac_output(output)` to `jac_output = DiagonalSparseTensor(...)` and to remove

In fact, it even works if we cherry-pick this into #466 and use a DiagonalSparseTensor as jac_output, but it gets densified so quickly that it's not really using sparsity.
PierreQuinton approved these changes on Oct 23, 2025
PierreQuinton (Contributor) left a comment:
Well, bravo Nils, the code has disappeared, but that's fine. (LGTM after a few discussions)
The idea of this PR is to remove the batched optimization, because this optimization should instead be done internally by backpropagating a diagonal sparse Jacobian (see the sketch below).
Compared to main, this simplifies a lot of things. The batched optimization was done in FunctionalJacobianComputer, but it required a different usage than AutogradJacobianComputer, which forced the engine to special-case on the batch_dim, which in turn required the user to provide the batch_dim. I think all of this can be dropped.
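To make the "diagonal sparse Jacobian" point concrete, here is a small illustrative example (an assumption for illustration, not torchjd code): for a per-sample loss, the Jacobian with respect to a batched input is zero outside the batch diagonal, which is exactly the structure a diagonal sparse tensor can represent, instead of special-casing the batch_dim.

```python
import torch

# Toy per-sample loss: loss_i depends only on sample i, so d loss_i / d x_j
# vanishes for i != j and the Jacobian is diagonal across the batch dimension.
batch, feat = 4, 3
x = torch.randn(batch, feat)


def per_sample_loss(t):
    return (t ** 2).sum(dim=1)  # shape [batch]


jac = torch.autograd.functional.jacobian(per_sample_loss, x)  # [batch, batch, feat]

off_diagonal = jac[~torch.eye(batch, dtype=torch.bool)]  # all cross-sample blocks
assert torch.all(off_diagonal == 0)
```

Backpropagating this as a dense tensor carries all the zero blocks along; representing it sparsely is what should recover the batched speed-up without any batch_dim special-casing in the engine.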