Skip to content

Conversation

@lkdvos
Copy link
Member

@lkdvos lkdvos commented Nov 5, 2025

Attempt to address #89

I've updated the tolerances across all pullbacks to follow this logic:

  • degeneracy and rank are determined by default relative to the norm
  • gauge are determined relative to 1

Or, in other words, gauge tolerances are absolute while degeneracy and rank tolerances are relative.
I would have really liked to have a better relative criterion for the gauge tolerance as well, but can't really come up with a satisfactory solution.
I'm guessing that the idea is that the cotangent in the direction of the gauge should be small compared to the total norm of the cotangent (is that even true?), so possibly we could consider an approach like that, but already this should be an improvement.

@codecov
Copy link

codecov bot commented Nov 5, 2025

Codecov Report

❌ Patch coverage is 93.33333% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/common/defaults.jl 85.71% 1 Missing ⚠️
Files with missing lines Coverage Δ
src/pullbacks/eig.jl 95.94% <ø> (ø)
src/pullbacks/eigh.jl 91.04% <100.00%> (ø)
src/pullbacks/lq.jl 95.31% <100.00%> (ø)
src/pullbacks/qr.jl 95.23% <100.00%> (ø)
src/pullbacks/svd.jl 96.26% <ø> (ø)
src/common/defaults.jl 87.50% <85.71%> (-12.50%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Jutho
Copy link
Member

Jutho commented Nov 5, 2025

I do think it makes sense if gauge_tol would be relative to the norm of the associated pullback variable that can have gauge-dependent components, i.e. ΔQ in the QR/LQ, ΔV in eig/eigh and something like max(norm(ΔU, Inf), norm(ΔVᴴ)) for svd.

return eps(eltype(n))^(3 / 4) * max(n, one(n))
end
default_pullback_gauge_atol(A) = eps(norm(A, Inf))^(3 / 4)
default_pullback_gauge_atol(A, As...) = maximum(default_pullback_gauge_atol, (A, As...))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this multi-argument definition is necessary (I realised this yesterday after closing my computer and going to bed). Simply dong default_pullback_gauge_atol((ΔU, ΔVᴴ)) should be ok; norm((a,b), Inf) is automatically compiled to max(norm(a, Inf), norm(b, Inf)).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. If I am very nit-picky, I would say this leads to a type instability in the case of one of the two adjoints of the svd being zero, but I assume this is ok, as it will also not propagate to the output of the functions where this gauge_tol is being used, as it is just in a comparison.

julia> @code_warntype default_pullback_gauge_atol(ZeroTangent(), randn(5,5))
MethodInstance for default_pullback_gauge_atol(::ZeroTangent, ::Matrix{Float64})
  from default_pullback_gauge_atol(A, As...) @ Main REPL[5]:1
Arguments
  #self#::Core.Const(Main.default_pullback_gauge_atol)
  A::Core.Const(ZeroTangent())
  As::Tuple{Matrix{Float64}}
Body::Any
1 ─ %1 = Main.maximum::Core.Const(maximum)
│   %2 = Main.default_pullback_gauge_atol::Core.Const(Main.default_pullback_gauge_atol)
│   %3 = Core.tuple(A)::Core.Const((ZeroTangent(),))
│   %4 = Core._apply_iterate(Base.iterate, Core.tuple, %3, As)::Tuple{ZeroTangent, Matrix{Float64}}
│   %5 = (%1)(%2, %4)::Any
└──      return %5

@Jutho
Copy link
Member

Jutho commented Nov 6, 2025

I am wondering if this is tested where one of the adjoint variables is ZeroTangent(). We will probably get errors as norm(ZeroTangent(), Inf) does not seem to be defined. It shoudn't matter what the result is, because the gauge_atol is not used if that adjoint variable is ZeroTangent(), but it might give errors nonetheless.

@lkdvos
Copy link
Member Author

lkdvos commented Nov 6, 2025

I think this should now be handled, although it still has the multi-argument definition to make sure that we are actually intercepting the call to norm. It's not super clean, but I think this should be reasonable.

@lkdvos
Copy link
Member Author

lkdvos commented Nov 6, 2025

I tried again, I think this should now be type stable on the condition that iszerotangent constant folds for the given types, which I think should be fine to assume.

@Jutho
Copy link
Member

Jutho commented Nov 6, 2025

@code_warntype seems to be happy with the new version.

@lkdvos lkdvos merged commit c99084b into main Nov 6, 2025
10 checks passed
@lkdvos lkdvos deleted the ld-tol branch November 6, 2025 21:11
@lkdvos lkdvos mentioned this pull request Nov 14, 2025
lkdvos referenced this pull request Nov 14, 2025
* Bump v0.6

* rename `gaugefix` -> `fixgauge`

* reduce unnecessary warnings

* fix `copy_input` signatures in Mooncake tests

* Add changelog to docs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] SVD pullback does not properly take into account the norm of the initial matrix safe_inv should have a different tol from pullback_gaugetol

3 participants