
Use generic_trimatdiv!() in ldiv!() for QRSparse #697

Open

JamesWrigley wants to merge 1 commit into JuliaSparse:main from JamesWrigley:ldiv-perf

Conversation

@JamesWrigley
Member

This avoids having to go through UpperTriangular, which requires a square matrix and so would need either a view or a copy of our rectangular F.R matrix. Taking a view of F.R would hit a slow fallback for ldiv!(), and making a new square matrix would cause allocations.

Fixes #696. Written with help from Claude 🤖
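A rough sketch of the idea (a hypothetical simplification, not the actual PR diff): instead of wrapping the matrix in `UpperTriangular` and calling `ldiv!()`, call `generic_trimatdiv!()` directly with the flags the wrapper would have produced. Note that `generic_trimatdiv!` is internal to LinearAlgebra and its signature may change between Julia versions; `R` here is a square upper-triangular stand-in for the relevant block of `F.R`.

```julia
using LinearAlgebra, SparseArrays

n = 5
# A well-conditioned square upper-triangular sparse matrix as a stand-in for F.R.
R = sparse(UpperTriangular(rand(n, n) + n * I))
b = rand(n)

# Before: wrap in UpperTriangular and call ldiv!. For a *view* of F.R this
# route would dispatch to a slow generic fallback.
x = similar(b)
ldiv!(x, UpperTriangular(R), b)

# After (sketch): call the specialized kernel directly, passing the flags the
# wrapper would have supplied: uplo = 'U', non-unit diagonal = 'N', and no
# transpose wrapper (identity).
y = similar(b)
LinearAlgebra.generic_trimatdiv!(y, 'U', 'N', identity, R, b)

@assert x ≈ y
```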

Benchmark script adapted from BaseBenchmarks:

using SparseArrays, LinearAlgebra, BenchmarkTools

m = 10000
n = 9000
getA() = spdiagm(0 => fill(2.0, m),
                 -1 => fill(1.0, m - 1),
                 1 => fill(1.0, m - 1),
                 360 => fill(1.0, m - 360))[:, 1:n]
getB() = ones(m, 2)

@benchmark qr(A)\B setup=(A=$getA(); B=$getB())

# main
BenchmarkTools.Trial: 28 samples with 1 evaluation per sample.
 Range (min … max):  175.511 ms … 218.146 ms  ┊ GC (min … max): 0.41% … 21.76%
 Time  (median):     178.119 ms               ┊ GC (median):    0.85%
 Time  (mean ± σ):   181.623 ms ±  10.391 ms  ┊ GC (mean ± σ):  2.55% ±  5.57%

  ██▅  ▂     ▂                                                   
  ███▁▁██▁▅▅▅█▁▁▁▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▁▁▅ ▁
  176 ms           Histogram: frequency by time          218 ms <

 Memory estimate: 45.61 MiB, allocs estimate: 197.

# PR
BenchmarkTools.Trial: 174 samples with 1 evaluation per sample.
 Range (min … max):  18.428 ms … 70.734 ms  ┊ GC (min … max):  0.00% … 66.01%
 Time  (median):     23.961 ms              ┊ GC (median):     2.37%
 Time  (mean ± σ):   28.509 ms ± 13.099 ms  ┊ GC (mean ± σ):  20.07% ± 20.51%

   █    ▂                                                      
  ▆█▅▆▅██▅▅█▇▅▄▃▃▁▁▁▁▃▂▁▂▃▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▃▁▂▃▁▂▁▁▁▁▁▁▁▃▃▂▂▃ ▂
  18.4 ms         Histogram: frequency by time        68.4 ms <

 Memory estimate: 45.61 MiB, allocs estimate: 197.

@JamesWrigley JamesWrigley added the performance and backport 1.13 (Change should be backported to release-1.13) labels on Apr 12, 2026
@codecov

codecov bot commented Apr 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.33%. Comparing base (3ab784c) to head (ab3303f).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #697      +/-   ##
==========================================
- Coverage   84.35%   84.33%   -0.02%     
==========================================
  Files          13       13              
  Lines        9347     9348       +1     
==========================================
- Hits         7885     7884       -1     
- Misses       1462     1464       +2     

☔ View full report in Codecov by Sentry.

@dkarrasch
Member

Would a widened method signature somewhere help (also for other cases)?

@JamesWrigley
Member Author

Hmm, I'm not sure. Do you mean adding something like SparseMatrixCSCView, but for a UnitRange of rows and columns?

@dkarrasch
Member

It was just a wild guess: could we modify the call path by changing a method signature, so that the previously used method gets used again? In case you know: which ldiv! method directs dispatch away from the specialized generic_trimatdiv! to the slow fallback? Because that specialized generic one happily accepts SparseMatrixCSCViews.

@JamesWrigley
Member Author

JamesWrigley commented Apr 13, 2026

Ah I see what you mean 🤔 So according to Cthulhu these are the signatures for the copy and view respectively:

generic_trimatdiv!(C::Vector{Float64},
                   uplo_char(A::UpperTriangular{Float64, SparseMatrixCSC{Float64, Int64}})::Core.Const('U'),
                   isunit_char(A::UpperTriangular{Float64, SparseMatrixCSC{Float64, Int64}})::Core.Const('N'),
                   _wrapperop(parent(A))::Core.Const(identity),
                   _unwrap_at(parent(A))::SparseMatrixCSC{Float64, Int64},
                   B::Vector{Float64})

generic_trimatdiv!(C::Vector{Float64},
                   uplo_char(A::UpperTriangular{Float64, SubArray{Float64, 2, SparseMatrixCSC{Float64, Int64}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, false}})::Core.Const('U'),
                   isunit_char(A::UpperTriangular{Float64, SubArray{Float64, 2, SparseMatrixCSC{Float64, Int64}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, false}})::Core.Const('N'),
                   _wrapperop(parent(A))::Core.Const(identity),
                   _unwrap_at(parent(A))::SubArray{Float64, 2, SparseMatrixCSC{Float64, Int64}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, false},
                   B::Vector{Float64})

I think the issue is that this SubArray type has a Base.OneTo{Int} as its first index type parameter: SubArray{Float64, 2, SparseMatrixCSC{Float64, Int64}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}.

Whereas SparseMatrixCSCView requires it to be a Base.Slice{Base.OneTo{Int}}, which would be the equivalent of @view F.R[:, ...]. That makes it not match our sparse matrix method and fall back to the LinearAlgebra one. Does it make sense to add limited support for views with a Base.OneTo selection of rows? I guess at least in the case of generic_trimatdiv!() it's safe.
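To make the dispatch difference concrete, here is a minimal sketch (note that `SparseMatrixCSCView` is a non-exported internal alias of SparseArrays, so its exact definition may vary between versions): a row index of `Base.OneTo` does not match the alias, while `:` produces a `Base.Slice{Base.OneTo{Int}}` that does.

```julia
using SparseArrays

S = sprand(6, 4, 0.5)
n = 3

# Row range selected explicitly: first index type is Base.OneTo, not a Slice.
v1 = view(S, Base.OneTo(size(S, 1)), Base.OneTo(n))

# Whole columns selected with ':' — first index becomes Base.Slice{Base.OneTo{Int}}.
v2 = view(S, :, Base.OneTo(n))

# Only the whole-column view matches the alias the fast sparse methods dispatch on.
@assert !(v1 isa SparseArrays.SparseMatrixCSCView)
@assert v2 isa SparseArrays.SparseMatrixCSCView
```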

@dkarrasch
Member

I think that could be another PR. Being able to assume that the whole column is included in the view may be important, given how we store row numbers by column. So, for the time being this should be okay.
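The column-wise storage point above can be seen with public SparseArrays accessors (a minimal sketch): stored entries are laid out column by column, with `nzrange` giving each column's segment, so a view spanning whole columns can reuse these segments directly, whereas a row-restricted view would have to filter every segment.

```julia
using SparseArrays

# A small CSC matrix: nonzeros are stored column-major, and rowvals/nonzeros
# are segmented per column by nzrange.
S = sparse([1, 3, 2, 3], [1, 1, 2, 3], [10.0, 30.0, 20.0, 30.0], 3, 3)

entries = Tuple{Int,Int,Float64}[]  # (row, col, value)
for j in 1:size(S, 2), k in nzrange(S, j)
    push!(entries, (rowvals(S)[k], j, nonzeros(S)[k]))
end
```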

@dkarrasch dkarrasch requested a review from rayegun April 14, 2026 09:07
Member

@dkarrasch dkarrasch left a comment

LGTM, and I assume this is well covered by tests.


Development

Successfully merging this pull request may close these issues.

QR performance regression
