@ogauthe
Contributor

@ogauthe ogauthe commented Aug 15, 2025

This PR is a follow-up to #229 and #237. It replaces @autoopt with an explicit contraction scheme in the CTMRG partition function contractions. I assumed the optimal permutation was the same as for a wavefunction and reproduced the same order.

I do not really understand which constraints are imposed by planar non-braiding categories, so I may be doing illegal permutations. I do not know how to check for these, but I am happy to learn.

@codecov

codecov bot commented Aug 15, 2025

Codecov Report

❌ Patch coverage is 27.27273% with 32 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/algorithms/contractions/ctmrg_contractions.jl 27.27% 32 Missing ⚠️
Files with missing lines Coverage Δ
src/algorithms/contractions/ctmrg_contractions.jl 50.89% <27.27%> (-4.35%) ⬇️

... and 2 files with indirect coverage changes

lkdvos
lkdvos previously approved these changes Aug 16, 2025
Member

@lkdvos lkdvos left a comment

Looks good to me! Any chance you have some timing results too?

Regarding the planar and braiding questions: happy to explain, but this isn't really relevant here, because they require @planar anyway and we don't really support that in PEPSKit.jl right now. In principle we could do this for the partition functions, but until someone actually needs it I'm fine with just keeping the @tensor.

(Combining planarity and efficiency would actually be kind of a nightmare: for braided categories we could at least replace the permutations with braidings and their inverses, but for non-braided ones we cannot arbitrarily choose the intermediate permutations, so we're more limited in what can be done.)

@ogauthe ogauthe marked this pull request as draft August 19, 2025 03:24
@VictorVanthilt VictorVanthilt added documentation Improvements or additions to documentation and removed documentation Improvements or additions to documentation labels Aug 20, 2025
@ogauthe
Contributor Author

ogauthe commented Sep 11, 2025

I now specialize all renormalize edge contractions for partition functions. The motivation is that when the partition function tensor is actually a contracted double-layer quantum wavefunction, D ~ χ and permuting the site tensor A is not cheap. With the new functions, A is never permuted within renormalize_X_edge.

The new contraction scheme may not be optimal in the limit of very small D, very large χ and non-abelian symmetry, but I think this is a pretty uncommon case. In other cases, either the contraction will dominate or, if D is large, having D or χ as the first leg would be equivalent.
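
To make the scaling argument a bit more concrete, here is a rough sketch that counts dense multiply-adds for the three pairwise steps of renormalize_west_edge_explicit (P_bottom * E_west, then * A, then * P_top). It is only an estimate under stated assumptions: symmetry blocks and the actual permutation overhead are ignored, `d` stands for the leg dimension of the partition function tensor A (so d = D² for a double-layer wavefunction), and `west_edge_costs` is a hypothetical helper, not something that exists in PEPSKit.

```julia
# Dense multiply-add counts for the three pairwise contractions
#   PE   = P_bottom * E_west   (sums over one χ leg)
#   PEA  = PE * A              (sums over two d legs)
#   edge = PEA * P_top         (sums over one χ leg and one d leg)
# d: leg dimension of A, χ: environment dimension.
# Hypothetical helper for illustration only; symmetry blocks are ignored.
function west_edge_costs(d::Integer, χ::Integer)
    pe = χ^3 * d^2
    pea = χ^2 * d^4
    edge = χ^3 * d^2
    # number of entries that would have to move if A were permuted
    return (; pe, pea, edge, permute_A = d^4)
end

west_edge_costs(2, 50)     # χ-dominated regime (Ising-like)
west_edge_costs(121, 121)  # d ~ χ regime (double-layer-like)
```

With d = 2 and χ = 50 the two χ³d² steps dominate and A has only 16 entries, so permuting it costs essentially nothing. With d ~ χ all three steps scale the same way and A holds d⁴ entries, as many as the intermediate PE tensor, which is the regime where avoiding a permutation of A should matter.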

I also fixed variable names in some other methods: although the contraction schemes were correct, the names used for P_left/P_right were swapped.

Todo: some timing

@ogauthe ogauthe marked this pull request as ready for review September 11, 2025 15:28
@ogauthe
Contributor Author

ogauthe commented Sep 11, 2025

I did the benchmarks in 3 cases:

benchmark code
using TensorOperations: @tensor
using TensorKit
using TensorKit: ×
using PEPSKit
using PEPSKit: @autoopt, CTMRGCornerTensor, CTMRG_PF_EdgeTensor, EnlargedCorner, PFTensor, dtmap!!, eachcoordinate, leading_boundary, select_algorithm, simultaneous_projectors
using BenchmarkTools

# ====================  master  =======================================================
function enlarge_northwest_corner_autoopt(
    E_west::CTMRG_PF_EdgeTensor, C_northwest::CTMRGCornerTensor,
    E_north::CTMRG_PF_EdgeTensor, A::PFTensor,
)
    return @autoopt @tensor corner[χ_S D_S; χ_E D_E] :=
        E_west[χ_S D1; χ1] * C_northwest[χ1; χ2] * E_north[χ2 D2; χ_E] * A[D1 D_S; D2 D_E]
end

function enlarge_northeast_corner_autoopt(
    E_north::CTMRG_PF_EdgeTensor, C_northeast::CTMRGCornerTensor,
    E_east::CTMRG_PF_EdgeTensor, A::PFTensor,
)
    return @autoopt @tensor corner[χ_W D_W; χ_S D_S] :=
        E_north[χ_W D1; χ1] * C_northeast[χ1; χ2] * E_east[χ2 D2; χ_S] * A[D_W D_S; D1 D2]
end

function enlarge_southeast_corner_autoopt(
    E_east::CTMRG_PF_EdgeTensor, C_southeast::CTMRGCornerTensor,
    E_south::CTMRG_PF_EdgeTensor, A::PFTensor,
)
    return @autoopt @tensor corner[χ_N D_N; χ_W D_W] :=
        E_east[χ_N D1; χ1] * C_southeast[χ1; χ2] * E_south[χ2 D2; χ_W] * A[D_W D2; D_N D1]
end

function enlarge_southwest_corner_autoopt(
    E_south::CTMRG_PF_EdgeTensor, C_southwest::CTMRGCornerTensor,
    E_west::CTMRG_PF_EdgeTensor, A::PFTensor,
)
    return @autoopt @tensor corner[χ_E D_E; χ_N D_N] :=
        E_south[χ_E D1; χ1] * C_southwest[χ1; χ2] * E_west[χ2 D2; χ_N] * A[D2 D1; D_N D_E]
end



function renormalize_north_edge_rotate(E_north, P_right, P_left, A)
    A_west = PEPSKit._rotl90_localsandwich(A)
    return renormalize_west_edge_autoopt(E_north, P_right, P_left, A_west)
end

function renormalize_east_edge_rotate(E_east, P_bottom, P_top, A)
    A_west = PEPSKit._rot180_localsandwich(A)
    return renormalize_west_edge_autoopt(E_east, P_bottom, P_top, A_west)
end

function renormalize_south_edge_rotate(E_south, P_left, P_right, A)
    A_west = PEPSKit._rotr90_localsandwich(A)
    return renormalize_west_edge_autoopt(E_south, P_left, P_right, A_west)
end

function renormalize_west_edge_autoopt(E_west::CTMRG_PF_EdgeTensor, P_top, P_bottom, A::PFTensor)
    return @autoopt @tensor edge[χ_S D_E; χ_N] :=
        E_west[χ1 D1; χ2] * A[D1 D5; D3 D_E] * P_top[χ2 D3; χ_N] * P_bottom[χ_S; χ1 D5]
end

# mixed: rotate A, then call the explicit west-edge contraction
function renormalize_north_edge_rotate_explicit(E_north, P_right, P_left, A)
    A_west = PEPSKit._rotl90_localsandwich(A)
    return renormalize_west_edge_explicit(E_north, P_right, P_left, A_west)
end

function renormalize_east_edge_rotate_explicit(E_east, P_bottom, P_top, A)
    A_west = PEPSKit._rot180_localsandwich(A)
    return renormalize_west_edge_explicit(E_east, P_bottom, P_top, A_west)
end
function renormalize_south_edge_rotate_explicit(E_south, P_left, P_right, A)
    A_west = PEPSKit._rotr90_localsandwich(A)
    return renormalize_west_edge_explicit(E_south, P_left, P_right, A_west)
end

# ====================  explicit  =======================================================
function enlarge_northwest_corner_explicit(
    E_west::CTMRG_PF_EdgeTensor, C_northwest::CTMRGCornerTensor,
    E_north::CTMRG_PF_EdgeTensor, A::PFTensor,
)
    return @tensor begin
        EC[χ_S DW; χ2] := E_west[χ_S DW; χ1] * C_northwest[χ1; χ2]
        ECE[χ_S χ_E; DW DN] := EC[χ_S DW; χ2] * E_north[χ2 DN; χ_E]
        corner[χ_S D_S; χ_E D_E] := ECE[χ_S χ_E; DW DN] * A[DW D_S; DN D_E]
    end
end

function enlarge_northeast_corner_explicit(
    E_north::CTMRG_PF_EdgeTensor, C_northeast::CTMRGCornerTensor,
    E_east::CTMRG_PF_EdgeTensor, A::PFTensor,
)
    return @tensor begin
        EC[χ_W DN; χ2] := E_north[χ_W DN; χ1] * C_northeast[χ1; χ2]
        ECE[χ_W χ_S; DN DE] := EC[χ_W DN; χ2] * E_east[χ2 DE; χ_S]
        corner[χ_W D_W; χ_S D_S] := ECE[χ_W χ_S; DN DE] * A[D_W D_S; DN DE]
    end
end

function enlarge_northeast_corner_explicit_NE(
    E_north::CTMRG_PF_EdgeTensor, C_northeast::CTMRGCornerTensor,
    E_east::CTMRG_PF_EdgeTensor, A::PFTensor,
)
    return @tensor begin
        EC[DN χ_W; χ2] := E_north[χ_W DN; χ1] * C_northeast[χ1; χ2]
        ECE[DN DE; χ_S χ_W] := EC[DN χ_W; χ2] * E_east[χ2 DE; χ_S]
        corner[χ_W D_W; χ_S D_S] := A[D_W D_S; DN DE] * ECE[DN DE; χ_S χ_W]
    end
end

function enlarge_southeast_corner_explicit(
    E_east::CTMRG_PF_EdgeTensor, C_southeast::CTMRGCornerTensor,
    E_south::CTMRG_PF_EdgeTensor, A::PFTensor,
)
    return @tensor begin
        EC[χ_N D1; χ2] := E_east[χ_N D1; χ1] * C_southeast[χ1; χ2]
        ECE[χ_N χ_W; D1 D2] := EC[χ_N D1; χ2] * E_south[χ2 D2; χ_W]
        corner[χ_N D_N; χ_W D_W] := ECE[χ_N χ_W; D1 D2] * A[D_W D2; D_N D1]
    end
end

function enlarge_southwest_corner_explicit(
    E_south::CTMRG_PF_EdgeTensor, C_southwest::CTMRGCornerTensor,
    E_west::CTMRG_PF_EdgeTensor, A::PFTensor,
)
    return @tensor begin
        EC[χ_E D1; χ2] := E_south[χ_E D1; χ1] * C_southwest[χ1; χ2]
        ECE[χ_E χ_N; D2 D1] := EC[χ_E D1; χ2] * E_west[χ2 D2; χ_N]
        corner[χ_E D_E; χ_N D_N] := ECE[χ_E χ_N; D2 D1] * A[D2 D1; D_N D_E]
    end
end


function renormalize_north_edge_explicit(E_north::CTMRG_PF_EdgeTensor, P_right, P_left, A::PFTensor)
    return @tensor begin
        temp = permute(E_north, ((2, 1), (3,))) # impose D_N as 1st leg
        PE[D_N D_E; χNW χ_E] := temp[D_N χNW; χNE] * P_right[χNE D_E; χ_E]
        PEA[D_W χNW; D_S χ_E] := A[D_W D_S; D_N D_E] * PE[D_N D_E; χNW χ_E]
        P_leftp = permute(P_left, ((1,), (3, 2)))
        edge[χ_W D_S; χ_E] := P_leftp[χ_W; D_W χNW] * PEA[D_W χNW; D_S χ_E]
    end
end

function renormalize_east_edge_explicit(E_east::CTMRG_PF_EdgeTensor, P_bottom, P_top, A::PFTensor)
    return @tensor begin
        temp = permute(P_top, ((3, 1), (2,)))  # impose D_N as 1st leg
        PE[D_N D_E; χN χSE] := temp[D_N χN; χNE] * E_east[χNE D_E; χSE]
        PEA[D_W χN; χSE D_S] := A[D_W D_S; D_N D_E] * PE[D_N D_E; χN χSE]
        edge[χ_N D_W; χ_S] := PEA[D_W χ_N; χSE D_S] * P_bottom[χSE D_S; χ_S]
    end
end

function renormalize_south_edge_explicit(E_south::CTMRG_PF_EdgeTensor, P_left, P_right, A::PFTensor)
    # specialize to avoid extra permute on A when calling renormalize_west_edge
    return @tensor begin
        P_leftp = permute(P_left, ((3, 2), (1,)))  # impose χ_W as 1st leg
        PE[χ_W χSE; D_W D_S] := P_leftp[χ_W D_W; χSW] * E_south[χSE D_S; χSW]
        PEA[χ_W D_N; χSE D_E] := PE[χ_W χSE; D_W D_S] * A[D_W D_S; D_N D_E]
        edge[χ_E D_N; χ_W] := PEA[χ_W D_N; χSE D_E] * P_right[χ_E; χSE D_E]
    end
end

function renormalize_west_edge_explicit(E_west::CTMRG_PF_EdgeTensor, P_top, P_bottom, A::PFTensor)
    return @tensor begin
        PE[χ_S χNW; D_W D_S] := P_bottom[χ_S; χSW D_S] * E_west[χSW D_W; χNW]
        PEA[χ_S D_E; χNW D_N] := PE[χ_S χNW; D_W D_S] * A[D_W D_S; D_N D_E]
        edge[χ_S D_E; χ_N] := PEA[χ_S D_E; χNW D_N] * P_top[χNW D_N; χ_N]
    end
end


# ============================================================================================


function get_projectors(env, Z)
    alg = select_algorithm(leading_boundary, env; projector_alg=:fullinfinite)
    network = InfiniteSquareNetwork(Z)
    coordinates = eachcoordinate(network, 1:4)
    T_corners = Base.promote_op(
        TensorMap ∘ EnlargedCorner, typeof(network), typeof(env), eltype(coordinates)
    )
    enlarged_corners′ = similar(coordinates, T_corners)
    enlarged_corners::typeof(enlarged_corners′) =
        dtmap!!(enlarged_corners′, eachcoordinate(network, 1:4)) do idx
            return TensorMap(EnlargedCorner(network, env, idx))
        end  # expand environment
    projectors, info = simultaneous_projectors(enlarged_corners, env, alg.projector_alg)  # compute projectors on all coordinates
    return projectors
end

Ising partition function, Trivial sector, D=2, χ=50.

  • enlarge_XXX_corner: explicit contraction scheme is always better than @autoopt
  • enlarge_northeast_corner: permuting A is faster than not permuting it
  • renormalize_west_edge_explicit is slightly faster than @autoopt
  • the other directions are slightly slower than calling rotate + renormalize_west
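
For a quick overview, the medians reported in the Ising benchmark output can be collapsed into speed-up ratios. This is only a convenience sketch: the numbers are copied by hand from the output in this comment, the baseline is @autoopt for the corners and the west edge and rotate + renormalize_west_edge_autoopt for the other edges, and `medians_μs` and `speedups` are names made up for the illustration.

```julia
# Median timings in μs, copied from the Ising (D=2, χ=50) benchmark output:
# (name, baseline on master, explicit scheme).
medians_μs = [
    ("enlarge_northwest_corner", 234.376, 213.049),
    ("enlarge_northeast_corner", 236.588, 216.515),
    ("enlarge_southeast_corner", 246.882, 213.889),
    ("enlarge_southwest_corner", 245.846, 208.119),
    ("renormalize_north_edge", 267.623, 351.892),
    ("renormalize_east_edge", 269.076, 348.757),
    ("renormalize_south_edge", 273.659, 323.327),
    ("renormalize_west_edge", 267.580, 257.874),
]

# A ratio above 1 means the explicit scheme is faster than the baseline.
speedups = [(name, baseline / explicit) for (name, baseline, explicit) in medians_μs]

for (name, r) in speedups
    println(rpad(name, 28), round(r; digits = 2))
end
```

The ratios come out above 1 for the four corners and the west edge, and below 1 for the north, east and south edges, which is the pattern summarized in the bullets above.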
Ising specific code
A_ising, _, _ = classical_ising(; beta=0.6)
Z_ising = InfinitePartitionFunction(A_ising)


χenv = ℂ^50
env0 = CTMRGEnv(Z_ising, χenv)
env_ising, = leading_boundary(env0, Z_ising; alg=:simultaneous, maxiter=20, projector_alg=:fullinfinite)
projectors_ising = get_projectors(env_ising, Z_ising)


E_north_ising, E_east_ising, E_south_ising, E_west_ising = env_ising.edges[:, 1, 1]
C_northwest_ising, C_northeast_ising, C_southeast_ising, C_southwest_ising = env_ising.corners[:, 1, 1]
benchmark Ising D=2
"""
julia> @benchmark enlarge_northwest_corner_autoopt(E_west_ising, C_northwest_ising, E_north_ising, A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  209.346 μs …   8.101 ms  ┊ GC (min … max):  0.00% … 96.25%
 Time  (median):     234.376 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   269.505 μs ± 301.047 μs  ┊ GC (mean ± σ):  11.44% ±  9.63%

  █▂    ▁                                                       ▁
  ██▄▇▄▁█▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▆ █
  209 μs        Histogram: log(frequency) by time       2.77 ms <

 Memory estimate: 705.90 KiB, allocs estimate: 61.

julia> @benchmark enlarge_northwest_corner_explicit(E_west_ising, C_northwest_ising, E_north_ising, A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  193.406 μs …   7.852 ms  ┊ GC (min … max):  0.00% … 96.01%
 Time  (median):     213.049 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   247.898 μs ± 300.844 μs  ┊ GC (mean ± σ):  12.10% ±  9.49%

  █▁    ▂                                                       ▁
  ██▇▇▃▄█▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆ █
  193 μs        Histogram: log(frequency) by time       2.77 ms <

 Memory estimate: 706.21 KiB, allocs estimate: 65.

julia> @benchmark enlarge_northeast_corner_autoopt(E_north_ising, C_northeast_ising, E_east_ising, A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  220.104 μs …   5.634 ms  ┊ GC (min … max): 0.00% … 93.90%
 Time  (median):     236.588 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   264.014 μs ± 272.544 μs  ┊ GC (mean ± σ):  9.65% ±  8.62%

           ▂ ▆ █ ▆
  ▂▂▂▂▃▃▄▅▄█▅█▆█▅█▆▇▆▄▅▃▄▂▃▂▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂▁▂▂▁▁▁▁▂▁▂▂ ▃
  220 μs           Histogram: frequency by time          294 μs <

 Memory estimate: 705.46 KiB, allocs estimate: 56.

julia> @benchmark enlarge_northeast_corner_explicit(E_north_ising, C_northeast_ising, E_east_ising, A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  202.621 μs …   5.957 ms  ┊ GC (min … max):  0.00% … 94.61%
 Time  (median):     216.515 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   244.139 μs ± 275.941 μs  ┊ GC (mean ± σ):  10.39% ±  8.60%

      ▁▆█▅▁
  ▂▃▄▆██████▆▅▄▃▃▃▃▃▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▁▂▁▂▂▁▁▂▂▂▁▂▂▂▁▁▂▁▂▂▂▁▁▁▁▁▂▂ ▃
  203 μs           Histogram: frequency by time          324 μs <

 Memory estimate: 705.90 KiB, allocs estimate: 61.

julia> @benchmark enlarge_northeast_corner_explicit_NE(E_north_ising, C_northeast_ising, E_east_ising, A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  257.989 μs …   5.827 ms  ┊ GC (min … max):  0.00% … 93.48%
 Time  (median):     279.117 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   312.305 μs ± 305.405 μs  ┊ GC (mean ± σ):  10.04% ±  9.35%

  █▂                                                            ▁
  ██▅▆▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆ █
  258 μs        Histogram: log(frequency) by time          3 ms <

 Memory estimate: 784.05 KiB, allocs estimate: 67.

julia> @benchmark enlarge_southeast_corner_autoopt(E_east_ising, C_southeast_ising, E_south_ising, A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  226.324 μs …   8.514 ms  ┊ GC (min … max):  0.00% … 96.13%
 Time  (median):     246.882 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   282.713 μs ± 304.425 μs  ┊ GC (mean ± σ):  11.19% ±  9.71%

  █▁    ▂                                                       ▁
  ██▅▃▃▁█▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▆ █
  226 μs        Histogram: log(frequency) by time       2.79 ms <

 Memory estimate: 705.90 KiB, allocs estimate: 61.

julia> @benchmark enlarge_southeast_corner_explicit(E_east_ising, C_southeast_ising, E_south_ising, A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  189.667 μs …   8.140 ms  ┊ GC (min … max):  0.00% … 96.29%
 Time  (median):     213.889 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   248.873 μs ± 305.923 μs  ┊ GC (mean ± σ):  12.24% ±  9.54%

  █▂    ▂                                                       ▁
  ██▄▃▄▁█▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆ █
  190 μs        Histogram: log(frequency) by time       2.76 ms <

 Memory estimate: 706.52 KiB, allocs estimate: 69.

julia> @benchmark enlarge_southwest_corner_autoopt(E_south_ising, C_southwest_ising, E_west_ising, A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  225.358 μs …   8.261 ms  ┊ GC (min … max):  0.00% … 96.26%
 Time  (median):     245.846 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   281.233 μs ± 304.465 μs  ┊ GC (mean ± σ):  11.21% ±  9.69%

  █▁    ▂                                                       ▁
  ██▃▃▃▃█▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▆ █
  225 μs        Histogram: log(frequency) by time        2.8 ms <

 Memory estimate: 705.90 KiB, allocs estimate: 61.

julia> @benchmark enlarge_southwest_corner_explicit(E_south_ising, C_southwest_ising, E_west_ising, A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  186.922 μs …   6.817 ms  ┊ GC (min … max):  0.00% … 95.75%
 Time  (median):     208.119 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   242.452 μs ± 297.941 μs  ┊ GC (mean ± σ):  12.30% ±  9.50%

  █▁    ▂                                                       ▁
  ██▃▃▃▃█▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆ █
  187 μs        Histogram: log(frequency) by time       2.76 ms <

 Memory estimate: 705.46 KiB, allocs estimate: 56.

#
# renormalize_west_edge_explicit is slightly faster than @autoopt
# the others are slower than calling rotate + renormalize_west
#

julia> @benchmark renormalize_north_edge_rotate(E_north_ising, projectors_ising[1][1, 1, 1], projectors_ising[2][1, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  247.698 μs …   6.997 ms  ┊ GC (min … max):  0.00% … 94.64%
 Time  (median):     267.623 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   309.301 μs ± 340.862 μs  ┊ GC (mean ± σ):  11.38% ±  9.58%

  █▂   ▂                                                        ▁
  ██▁▄▁█▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▆ █
  248 μs        Histogram: log(frequency) by time       3.07 ms <

 Memory estimate: 706.45 KiB, allocs estimate: 68.

julia> @benchmark renormalize_north_edge_explicit(E_north_ising, projectors_ising[1][1, 1, 1], projectors_ising[2][1, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  334.574 μs …   7.253 ms  ┊ GC (min … max):  0.00% … 93.79%
 Time  (median):     351.892 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   405.273 μs ± 383.753 μs  ┊ GC (mean ± σ):  11.36% ± 10.72%

  █▁    ▂                                                       ▁
  ██▃▁▄▁█▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▆▆▅▆ █
  335 μs        Histogram: log(frequency) by time       3.33 ms <

 Memory estimate: 862.21 KiB, allocs estimate: 70.


julia> @benchmark renormalize_north_edge_rotate_explicit(E_north_ising, projectors_ising[1][1, 1, 1], projectors_ising[2][1, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  244.771 μs …   6.452 ms  ┊ GC (min … max):  0.00% … 94.21%
 Time  (median):     260.623 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   305.785 μs ± 352.318 μs  ┊ GC (mean ± σ):  12.50% ± 10.03%

  █    ▁                                                        ▁
  ██▄▃▄██▅▅▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄▆▆ █
  245 μs        Histogram: log(frequency) by time       3.18 ms <

 Memory estimate: 784.64 KiB, allocs estimate: 73.



julia> @benchmark renormalize_east_edge_rotate(E_east_ising, projectors_ising[1][2, 1, 1], projectors_ising[2][2, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  250.680 μs …   7.480 ms  ┊ GC (min … max):  0.00% … 95.14%
 Time  (median):     269.076 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   310.202 μs ± 337.006 μs  ┊ GC (mean ± σ):  11.22% ±  9.57%

  █▂   ▂                                                        ▁
  ██▃▄▁█▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▆ █
  251 μs        Histogram: log(frequency) by time       3.06 ms <

 Memory estimate: 706.45 KiB, allocs estimate: 68.

julia> @benchmark renormalize_east_edge_explicit(E_east_ising, projectors_ising[1][2, 1, 1], projectors_ising[2][2, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  332.842 μs …   7.848 ms  ┊ GC (min … max):  0.00% … 94.07%
 Time  (median):     348.757 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   403.554 μs ± 386.715 μs  ┊ GC (mean ± σ):  11.45% ± 10.72%

  █▂    ▁                                                       ▁
  ██▄▃▃▃█▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▆▅▆▆▆ █
  333 μs        Histogram: log(frequency) by time       3.34 ms <

 Memory estimate: 862.48 KiB, allocs estimate: 75.

julia> @benchmark renormalize_east_edge_rotate_explicit(E_east_ising, projectors_ising[1][2, 1, 1], projectors_ising[2][2, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  240.570 μs …   6.813 ms  ┊ GC (min … max):  0.00% … 93.73%
 Time  (median):     263.562 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   308.746 μs ± 358.399 μs  ┊ GC (mean ± σ):  12.52% ±  9.98%

  █▃   ▁                                                        ▁
  ██▄▃▃██▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▄▅▆ █
  241 μs        Histogram: log(frequency) by time       3.21 ms <

 Memory estimate: 784.33 KiB, allocs estimate: 69.

julia> @benchmark renormalize_south_edge_rotate(E_south_ising, projectors_ising[1][3, 1, 1], projectors_ising[2][3, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  256.781 μs …   8.372 ms  ┊ GC (min … max):  0.00% … 95.19%
 Time  (median):     273.659 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   316.367 μs ± 358.941 μs  ┊ GC (mean ± σ):  11.62% ±  9.53%

  █   ▂                                                         ▁
  █▇▁▃█▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▄▄▅ █
  257 μs        Histogram: log(frequency) by time       3.34 ms <

 Memory estimate: 706.45 KiB, allocs estimate: 68.

julia> @benchmark renormalize_south_edge_explicit(E_south_ising, projectors_ising[1][3, 1, 1], projectors_ising[2][3, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  298.483 μs …   6.602 ms  ┊ GC (min … max):  0.00% … 93.83%
 Time  (median):     323.327 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   382.574 μs ± 410.074 μs  ┊ GC (mean ± σ):  13.60% ± 11.42%

  █▄                                                            ▁
  ██▁▃▁▃██▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▅▆▆▇▆ █
  298 μs        Histogram: log(frequency) by time       3.34 ms <

 Memory estimate: 1019.37 KiB, allocs estimate: 89.

julia> @benchmark renormalize_south_edge_rotate_explicit(E_south_ising, projectors_ising[1][3, 1, 1], projectors_ising[2][3, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  247.671 μs …   6.634 ms  ┊ GC (min … max):  0.00% … 94.38%
 Time  (median):     268.044 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   313.979 μs ± 367.780 μs  ┊ GC (mean ± σ):  12.70% ± 10.01%

  █    ▂                                                        ▁
  ██▁▃▃█▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▃▄▃▄▄▄▄▄▄ █
  248 μs        Histogram: log(frequency) by time       3.33 ms <

 Memory estimate: 784.64 KiB, allocs estimate: 73.

julia> @benchmark renormalize_west_edge_autoopt(E_west_ising, projectors_ising[1][4, 1, 1], projectors_ising[2][4, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  248.072 μs …   8.750 ms  ┊ GC (min … max):  0.00% … 95.84%
 Time  (median):     267.580 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   308.741 μs ± 340.750 μs  ┊ GC (mean ± σ):  11.37% ±  9.58%

  █▁   ▂                                                        ▁
  ██▄▄▁█▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▆ █
  248 μs        Histogram: log(frequency) by time       3.09 ms <

 Memory estimate: 706.01 KiB, allocs estimate: 63.

julia> @benchmark renormalize_west_edge_explicit(E_west_ising, projectors_ising[1][4, 1, 1], projectors_ising[2][4, 1, 1], A_ising)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  241.214 μs …   7.053 ms  ┊ GC (min … max):  0.00% … 95.11%
 Time  (median):     257.874 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   302.953 μs ± 354.252 μs  ┊ GC (mean ± σ):  12.62% ± 10.01%

  █▁   ▁                                                        ▁
  ██▄▃▃██▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▅▆ █
  241 μs        Histogram: log(frequency) by time       3.14 ms <

 Memory estimate: 783.89 KiB, allocs estimate: 64.

Bilayer quantum tensor from finite temperature, with D²=121, χ=121

spaceD = Rep[ℤ₂ × SU₂]((0, 0)=>9, (1, 0)=>4, (0, 1)=>9, (1, 1)=>12, (0, 2)=>6, (1, 2)=>3)'
spaceχ = Rep[ℤ₂ × SU₂]((0, 0)=>8, (1, 0)=>4, (0, 1)=>9, (1, 1)=>11, (0, 2)=>4, (1, 2)=>4, (1, 3)=>1)
  • enlarge_XXX_corner: @autoopt and the explicit scheme are similar
  • except for northeast, where @autoopt matches explicit_NE and both are noticeably faster than explicit
  • renormalize_XXX_edge: the explicit scheme is much faster
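
As a sanity check on the quoted dimensions, the total dimension of each graded space can be recomputed from the listed degeneracies. This is a plain-Julia sketch that does not call TensorKit; it assumes the second label of each sector is the SU(2) spin j (so each multiplet has dimension 2j + 1) and that the ℤ₂ irreps are one-dimensional. `spaceD_sectors`, `spaceχ_sectors` and `total_dim` are hypothetical names used only here.

```julia
# Degeneracies copied from spaceD and spaceχ above:
# (ℤ₂ charge, SU(2) spin) => multiplicity.
# Assumption: the second label is the spin j, so each multiplet has
# dimension 2j + 1, and ℤ₂ irreps are one-dimensional.
spaceD_sectors = [(0, 0) => 9, (1, 0) => 4, (0, 1) => 9, (1, 1) => 12, (0, 2) => 6, (1, 2) => 3]
spaceχ_sectors = [(0, 0) => 8, (1, 0) => 4, (0, 1) => 9, (1, 1) => 11, (0, 2) => 4, (1, 2) => 4, (1, 3) => 1]

# Total vector-space dimension: sum of multiplicity × irrep dimension.
total_dim(sectors) = sum(mult * (2j + 1) for ((_, j), mult) in sectors)

total_dim(spaceD_sectors)
total_dim(spaceχ_sectors)
```

Under that assumption spaceD sums to 121, matching the quoted D², while the listed spaceχ sums to 119, so the quoted χ=121 is presumably a nominal value.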
D=11 specific code
# D = 11
spaceD = Rep[ℤ₂ × SU₂]((0, 0)=>9, (1, 0)=>4, (0, 1)=>9, (1, 1)=>12, (0, 2)=>6, (1, 2)=>3)'
spaceχ = Rep[ℤ₂ × SU₂]((0, 0)=>8, (1, 0)=>4, (0, 1)=>9, (1, 1)=>11, (0, 2)=>4, (1, 2)=>4, (1, 3)=>1)

A_z2su2 = randn(spaceD ⊗ spaceD ← spaceD ⊗ spaceD)
B_z2su2 = randn(spaceD ⊗ spaceD ← spaceD ⊗ spaceD)
C_z2su2 = randn(spaceD ⊗ spaceD ← spaceD ⊗ spaceD)
D_z2su2 = randn(spaceD ⊗ spaceD ← spaceD ⊗ spaceD)
Z_z2su2 = InfinitePartitionFunction([A_z2su2 B_z2su2; C_z2su2 D_z2su2])

env0 = CTMRGEnv(Z_z2su2, spaceχ)
env_z2su2, = leading_boundary(env0, Z_z2su2; alg=:simultaneous, maxiter=2, projector_alg=:fullinfinite)
projectors_z2su2 = get_projectors(env_z2su2, Z_z2su2)


E_north_z2su2, E_east_z2su2, E_south_z2su2, E_west_z2su2 = env_z2su2.edges[:, 1, 1]
C_northwest_z2su2, C_northeast_z2su2, C_southeast_z2su2, C_southwest_z2su2 = env_z2su2.corners[:, 1, 1]
benchmark D²=121
# Z2xSU2 D = 11
# enlarge_XXX_corner: @autoopt and explicit are similar
# except for NE, where @autoopt gives explicit_NE, which is noticeably faster
"""
julia> @benchmark enlarge_northwest_corner_autoopt(E_west_z2su2, C_northwest_z2su2, E_north_z2su2, A_z2su2)
BenchmarkTools.Trial: 77 samples with 1 evaluation per sample.
 Range (min … max):  61.427 ms … 69.770 ms  ┊ GC (min … max): 0.00% … 9.46%
 Time  (median):     65.769 ms              ┊ GC (median):    5.16%
 Time  (mean ± σ):   65.388 ms ±  2.206 ms  ┊ GC (mean ± σ):  5.45% ± 3.64%

   ▂▂▅                        ▂▅█ ▂ ▅  █ ▂▂▂
  ████▅█▁▁▅█▁▁▁▅▁▁▁▁▁▁▁▁▁▁▁▅▅█████████▁█████▅█▅▅▁▁▅▅▁▁█▁▁▅▁▁█ ▁
  61.4 ms         Histogram: frequency by time        69.4 ms <

 Memory estimate: 74.45 MiB, allocs estimate: 197.

julia> @benchmark enlarge_northwest_corner_explicit(E_west_z2su2, C_northwest_z2su2, E_north_z2su2, A_z2su2)
BenchmarkTools.Trial: 76 samples with 1 evaluation per sample.
 Range (min … max):  62.003 ms … 70.532 ms  ┊ GC (min … max): 0.00% … 9.28%
 Time  (median):     66.585 ms              ┊ GC (median):    5.04%
 Time  (mean ± σ):   66.116 ms ±  2.037 ms  ┊ GC (mean ± σ):  5.17% ± 3.68%

                                █▂▂     ▅
  ▄▁▁▅▇▇▄▁▇▁▄▇▁▄▁▅▁▁▄▁▁▁▁▁▁▁▄▅▄▁███▇▅▇▄▅█▇█▄▄▁▁▄▁▁▄▁▁▁▁▁▁▄▁▁▄ ▁
  62 ms           Histogram: frequency by time        70.5 ms <

 Memory estimate: 74.45 MiB, allocs estimate: 211.

julia> @benchmark enlarge_northeast_corner_autoopt(E_north_z2su2, C_northeast_z2su2, E_east_z2su2, A_z2su2)
BenchmarkTools.Trial: 94 samples with 1 evaluation per sample.
 Range (min … max):  50.309 ms … 57.861 ms  ┊ GC (min … max): 0.00% … 12.24%
 Time  (median):     54.436 ms              ┊ GC (median):    6.51%
 Time  (mean ± σ):   53.579 ms ±  1.902 ms  ┊ GC (mean ± σ):  4.53% ±  3.17%

            ▂                        ▄ ▂ ▆▂▄ █  ▄
  ██▆█▄▄▆██▁█▄▆▆▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆█▆█▆████████▆▆▄▆▁▁▄▁▆▁▁▄ ▁
  50.3 ms         Histogram: frequency by time        56.5 ms <

 Memory estimate: 57.95 MiB, allocs estimate: 194.

julia> @benchmark enlarge_northeast_corner_explicit(E_north_z2su2, C_northeast_z2su2, E_east_z2su2, A_z2su2)
BenchmarkTools.Trial: 78 samples with 1 evaluation per sample.
 Range (min … max):  60.268 ms … 68.362 ms  ┊ GC (min … max): 0.00% … 5.30%
 Time  (median):     64.368 ms              ┊ GC (median):    5.53%
 Time  (mean ± σ):   64.101 ms ±  2.271 ms  ┊ GC (mean ± σ):  5.06% ± 3.72%

  ▆   █                       ▆ █▁▁▃▃             ▆▁▃
  █▄▇▁█▇▄▁▄▁▁▁▁▄▄▄▁▁▄▁▁▁▁▁▄▁▄▇█▇█████▇▇▁▄▄▄▁▁▁▁▁▄▁███▁▄▁▁▁▇▁▄ ▁
  60.3 ms         Histogram: frequency by time        68.1 ms <

 Memory estimate: 72.75 MiB, allocs estimate: 211.

julia> @benchmark enlarge_northeast_corner_explicit_NE(E_north_z2su2, C_northeast_z2su2, E_east_z2su2, A_z2su2)
BenchmarkTools.Trial: 94 samples with 1 evaluation per sample.
 Range (min … max):  50.275 ms … 57.916 ms  ┊ GC (min … max): 0.00% … 11.75%
 Time  (median):     53.789 ms              ┊ GC (median):    6.60%
 Time  (mean ± σ):   53.543 ms ±  2.025 ms  ┊ GC (mean ± σ):  5.70% ±  4.33%

                               ██▃                    ▂
  ▇▇▃▃▅▅▇▁▃▇▁▃▃▇▃▁▁▁▁▁▁▁▁▁▁▁▁▁▆███▅▆▅▃▆▃▃▁▁▁▁▁▃▁▁▁▁▁▁▇█▃▇▁▁▃▃ ▁
  50.3 ms         Histogram: frequency by time        57.1 ms <

 Memory estimate: 58.14 MiB, allocs estimate: 242.

julia> @benchmark enlarge_southeast_corner_autoopt(E_east_z2su2, C_southeast_z2su2, E_south_z2su2, A_z2su2)
BenchmarkTools.Trial: 75 samples with 1 evaluation per sample.
 Range (min … max):  62.922 ms … 73.666 ms  ┊ GC (min … max): 0.00% … 9.28%
 Time  (median):     67.283 ms              ┊ GC (median):    5.07%
 Time  (mean ± σ):   66.977 ms ±  2.342 ms  ┊ GC (mean ± σ):  5.11% ± 3.65%

   ▂  ▂  ▄                █▂▂   ▆  ▆
  ▆█▆▁█▁▁█▁▆▄▁▁▁▁▁▁▄▁▁▄▄▆▆███▄█▆█▆▄█▆▁▄█▁▁▁▁▄▁▄▁▄▁▄▁▁▁▁▁▁▁▁▁▄ ▁
  62.9 ms         Histogram: frequency by time        72.9 ms <

 Memory estimate: 74.45 MiB, allocs estimate: 207.

julia> @benchmark enlarge_southeast_corner_explicit(E_east_z2su2, C_southeast_z2su2, E_south_z2su2, A_z2su2)
BenchmarkTools.Trial: 75 samples with 1 evaluation per sample.
 Range (min … max):  62.347 ms … 74.354 ms  ┊ GC (min … max): 0.00% … 9.26%
 Time  (median):     66.947 ms              ┊ GC (median):    5.07%
 Time  (mean ± σ):   66.710 ms ±  2.318 ms  ┊ GC (mean ± σ):  5.11% ± 3.65%

                              ▅       █
  ▄▅▅▇▇▁▁▁▁▄▄▁▅▅▁▄▁▁▁▄▁▁▁▁▅▅▄██▇▅█▄▇▄▄█▇▅▇▄▁▁▄▁▄▁▁▁▄▁▄▁▁▁▁▄▁▄ ▁
  62.3 ms         Histogram: frequency by time        71.5 ms <

 Memory estimate: 74.45 MiB, allocs estimate: 211.

julia> @benchmark enlarge_southwest_corner_autoopt(E_south_z2su2, C_southwest_z2su2, E_west_z2su2, A_z2su2)
BenchmarkTools.Trial: 79 samples with 1 evaluation per sample.
 Range (min … max):  60.290 ms … 69.281 ms  ┊ GC (min … max): 0.00% … 9.89%
 Time  (median):     63.984 ms              ┊ GC (median):    5.32%
 Time  (mean ± σ):   63.831 ms ±  2.274 ms  ┊ GC (mean ± σ):  4.85% ± 3.46%

  ▂                         █▂
  ██▅▆▁▁▅▁▃▁▃▁▁▃▃▁▁▁▁▁▁▁▁▅▃▇███▆▃▆▅▃▃▁▁▁▃▁▃▁▃▃▅▃▁▃▃▁▅▃▁▁▃▅▁▁▃ ▁
  60.3 ms         Histogram: frequency by time        68.4 ms <

 Memory estimate: 72.75 MiB, allocs estimate: 205.

julia> @benchmark enlarge_southwest_corner_explicit(E_south_z2su2, C_southwest_z2su2, E_west_z2su2, A_z2su2)
BenchmarkTools.Trial: 93 samples with 1 evaluation per sample.
 Range (min … max):  50.399 ms … 57.117 ms  ┊ GC (min … max): 0.00% … 11.42%
 Time  (median):     54.141 ms              ┊ GC (median):    6.26%
 Time  (mean ± σ):   53.853 ms ±  1.989 ms  ┊ GC (mean ± σ):  5.45% ±  4.08%

                                  ▆█                      ▄
  ▄▅▄▃▃▅▃▅▄▁▁▅▃▃▃▃▁▄▃▃▁▁▁▁▁▁▁▁▁▃▃▄████▁▁▃▃▄▃▃▁▁▁▁▁▁▁▁▁▁▁▁▅█▅▄ ▁
  50.4 ms         Histogram: frequency by time          57 ms <

 Memory estimate: 57.95 MiB, allocs estimate: 194.
#
# Z2xSU2 D = 11
# renormalize_XXX_edge: explicit scheme much faster
#
julia> @benchmark renormalize_north_edge_rotate(E_north_z2su2, projectors_z2su2[1][1, 1, 1], projectors_z2su2[2][1, 1, 1], A_z2su2)
BenchmarkTools.Trial: 64 samples with 1 evaluation per sample.
 Range (min … max):  74.734 ms … 82.965 ms  ┊ GC (min … max): 0.00% … 8.10%
 Time  (median):     78.901 ms              ┊ GC (median):    4.27%
 Time  (mean ± σ):   79.152 ms ±  2.290 ms  ┊ GC (mean ± σ):  4.67% ± 3.16%

                               █                   ▂▂
  ▄█▄▁▄▁▅▁▁▁▁▅▁▁▄▁▁▄▁▄▁▄▁▄▁▁▇▁▇██▄▅▁▁█▁▁▁▁▁▁▁▁▁▁▅▄███▄▄▁▁▁▁▄▄ ▁
  74.7 ms         Histogram: frequency by time        82.9 ms <

 Memory estimate: 91.13 MiB, allocs estimate: 264.

julia> @benchmark renormalize_north_edge_explicit(E_north_z2su2, projectors_z2su2[1][1, 1, 1], projectors_z2su2[2][1, 1, 1], A_z2su2)
BenchmarkTools.Trial: 91 samples with 1 evaluation per sample.
 Range (min … max):  52.184 ms … 60.411 ms  ┊ GC (min … max): 0.00% … 15.43%
 Time  (median):     55.829 ms              ┊ GC (median):    5.95%
 Time  (mean ± σ):   55.424 ms ±  1.818 ms  ┊ GC (mean ± σ):  5.23% ±  4.06%

                                  █   ▄
  ▃▆▇▅▃▃▃▆▁▅▃▆▅▁▁▅▁▅▃▁▁▁▁▁▁▁▁▁▁▆▇▅█▆▇▆█▅▅▅▁▁▁▁▁▃▅▇▆▇▅▁▁▁▁▁▁▁▃ ▁
  52.2 ms         Histogram: frequency by time        58.8 ms <

 Memory estimate: 58.52 MiB, allocs estimate: 282.

julia> @benchmark renormalize_north_edge_rotate_explicit(E_north_z2su2, projectors_z2su2[1][1, 1, 1], projectors_z2su2[2][1, 1, 1], A_z2su2)
BenchmarkTools.Trial: 75 samples with 1 evaluation per sample.
 Range (min … max):  63.539 ms … 71.630 ms  ┊ GC (min … max): 0.00% … 9.00%
 Time  (median):     67.409 ms              ┊ GC (median):    4.96%
 Time  (mean ± σ):   67.373 ms ±  2.160 ms  ┊ GC (mean ± σ):  4.76% ± 3.44%

    ▂   ▂                     ▂▅█▂▅ ▅                    ▅
  ▅▁█▅█▅██▅▁██▁▁▁▁█▅▁▁▁▁▁▁▁▁█▅█████▅██▅▅▅▁▁▅██▅▅▁▁▅▁▁▁██▅█▁██ ▁
  63.5 ms         Histogram: frequency by time        70.9 ms <

 Memory estimate: 74.64 MiB, allocs estimate: 220.

julia> @benchmark renormalize_east_edge_rotate(E_east_z2su2, projectors_z2su2[1][2, 1, 1], projectors_z2su2[2][2, 1, 1], A_z2su2)
BenchmarkTools.Trial: 62 samples with 1 evaluation per sample.
 Range (min … max):  76.963 ms … 86.096 ms  ┊ GC (min … max): 0.00% … 4.07%
 Time  (median):     83.121 ms              ┊ GC (median):    7.93%
 Time  (mean ± σ):   81.617 ms ±  2.504 ms  ┊ GC (mean ± σ):  5.45% ± 3.43%

                                                    ██▄ ▃
  ▇▄▄▄▆▄▁▁▁▆▄▆▁▁▁▁▄▁▁▁▁▁▁▁▄▆▆▁▄▄▄▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▄▁▆███▇█▆▁▄▄ ▁
  77 ms           Histogram: frequency by time        84.2 ms <

 Memory estimate: 91.13 MiB, allocs estimate: 264.

julia> @benchmark renormalize_east_edge_explicit(E_east_z2su2, projectors_z2su2[1][2, 1, 1], projectors_z2su2[2][2, 1, 1], A_z2su2)
BenchmarkTools.Trial: 91 samples with 1 evaluation per sample.
 Range (min … max):  51.763 ms … 58.892 ms  ┊ GC (min … max): 0.00% … 11.11%
 Time  (median):     55.416 ms              ┊ GC (median):    6.04%
 Time  (mean ± σ):   55.046 ms ±  1.774 ms  ┊ GC (mean ± σ):  5.23% ±  4.02%

    ▆     ▄                        ▆  ▆ █▆            ▄ ▆
  ▄▄█▄▄▄▄▁█▁▄▆▆▁▆▁▄▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁█████▆██▄██▄▁▁▁▄▄▄▁▄███▄▁▁▄ ▁
  51.8 ms         Histogram: frequency by time        57.7 ms <

 Memory estimate: 58.33 MiB, allocs estimate: 265.

julia> @benchmark renormalize_east_edge_rotate_explicit(E_east_z2su2, projectors_z2su2[1][2, 1, 1], projectors_z2su2[2][2, 1, 1], A_z2su2)
BenchmarkTools.Trial: 72 samples with 1 evaluation per sample.
 Range (min … max):  66.029 ms … 74.751 ms  ┊ GC (min … max): 0.00% … 8.92%
 Time  (median):     70.200 ms              ┊ GC (median):    4.78%
 Time  (mean ± σ):   70.195 ms ±  2.180 ms  ┊ GC (mean ± σ):  4.55% ± 3.29%

     ▁    ▄  ▁▄            █▁▁█▄▁  ▄ █▄  ▁ ▁ ▁▁ ▁
  ▆▆▁█▁▆▆▆█▆▆██▆▁▁▆▆▁▁▁▁▆▆▆██████▁▆█▆██▁▁█▁█▁██▆█▆▆▆▆▆▆▁▁▆▁▁▆ ▁
  66 ms           Histogram: frequency by time        74.5 ms <

 Memory estimate: 74.64 MiB, allocs estimate: 218.

julia> @benchmark renormalize_south_edge_rotate(E_south_z2su2, projectors_z2su2[1][3, 1, 1], projectors_z2su2[2][3, 1, 1], A_z2su2)
BenchmarkTools.Trial: 63 samples with 1 evaluation per sample.
 Range (min … max):  75.586 ms … 83.626 ms  ┊ GC (min … max): 0.00% … 7.91%
 Time  (median):     81.926 ms              ┊ GC (median):    8.08%
 Time  (mean ± σ):   80.228 ms ±  2.580 ms  ┊ GC (mean ± σ):  5.59% ± 3.49%

                                                ▃█▃
  ▅▃▆▁▃▁▅▄▁▃▁▁▃▁▁▁▁▁▁▁▁▁▃▃▁▁▁▅▃▃▃▁▁▁▃▁▁▁▁▁▁▁▁▁▃▃███▆▃▁▁▁▁▁▁▁▃ ▁
  75.6 ms         Histogram: frequency by time        83.6 ms <

 Memory estimate: 91.13 MiB, allocs estimate: 260.

julia> @benchmark renormalize_south_edge_explicit(E_south_z2su2, projectors_z2su2[1][3, 1, 1], projectors_z2su2[2][3, 1, 1], A_z2su2)
BenchmarkTools.Trial: 90 samples with 1 evaluation per sample.
 Range (min … max):  52.635 ms … 59.056 ms  ┊ GC (min … max): 0.00% … 11.50%
 Time  (median):     56.449 ms              ┊ GC (median):    5.89%
 Time  (mean ± σ):   56.162 ms ±  1.731 ms  ┊ GC (mean ± σ):  5.21% ±  3.97%

          ▁▁                     █  ▃▁▁▃ ▃▁  ▁     ▃   ▃
  ▄▄▄▁▇▁▄▁██▇▇▄▇▁▄▄▇▄▁▁▄▁▄▄▁▁▁▁▄▄█▇▇████▄██▄▇█▄▇▁▁▄█▇▇▁█▄▇▄▁▇ ▁
  52.6 ms         Histogram: frequency by time        58.9 ms <

 Memory estimate: 58.52 MiB, allocs estimate: 282.

julia> @benchmark renormalize_south_edge_rotate_explicit(E_south_z2su2, projectors_z2su2[1][3, 1, 1], projectors_z2su2[2][3, 1, 1], A_z2su2)
BenchmarkTools.Trial: 74 samples with 1 evaluation per sample.
 Range (min … max):  64.614 ms … 71.895 ms  ┊ GC (min … max): 0.00% … 9.09%
 Time  (median):     68.119 ms              ┊ GC (median):    4.93%
 Time  (mean ± σ):   68.277 ms ±  2.003 ms  ┊ GC (mean ± σ):  4.70% ± 3.43%

       ▂                   █ ▂ ▅  ▂▅   ▂               ▂ ▂
  ▅▁▅▅▅█▁██▁█▅▁█▁▅▁▅▁▅▁▁▅▅████▅█▅███▅▁▁█▁▁▅▁▁█▁▅█▁▁▁▁▅████▁██ ▁
  64.6 ms         Histogram: frequency by time        71.6 ms <

 Memory estimate: 74.64 MiB, allocs estimate: 226.

julia> @benchmark renormalize_west_edge_autoopt(E_west_z2su2, projectors_z2su2[1][4, 1, 1], projectors_z2su2[2][4, 1, 1], A_z2su2)
BenchmarkTools.Trial: 75 samples with 1 evaluation per sample.
 Range (min … max):  63.522 ms … 71.659 ms  ┊ GC (min … max): 0.00% … 9.22%
 Time  (median):     67.321 ms              ┊ GC (median):    5.04%
 Time  (mean ± σ):   67.263 ms ±  2.247 ms  ┊ GC (mean ± σ):  5.13% ± 3.65%

                           ▄▆               █▄
  ███▆▄▄▄▁▄▄▆▆▁▄▁▁▁▁▁▁▁▄▁▁▁███▆▆█▄▆█▄▁▄▁▁▁▁▄███▁▄▁▁▁▄▁▁▄▄▁▆▁▄ ▁
  63.5 ms         Histogram: frequency by time        71.4 ms <

 Memory estimate: 74.64 MiB, allocs estimate: 259.

julia> @benchmark renormalize_west_edge_explicit(E_west_z2su2, projectors_z2su2[1][4, 1, 1], projectors_z2su2[2][4, 1, 1], A_z2su2)
BenchmarkTools.Trial: 89 samples with 1 evaluation per sample.
 Range (min … max):  52.455 ms … 61.024 ms  ┊ GC (min … max): 0.00% … 10.52%
 Time  (median):     56.303 ms              ┊ GC (median):    5.95%
 Time  (mean ± σ):   56.222 ms ±  1.950 ms  ┊ GC (mean ± σ):  5.09% ±  3.98%

       ▄                  █▂ ▄▂▂ ▂       ▂
  ▄▆▁▄▆█▆▁▄▁▄▆▆▄▁▁▄▄▄▁▁▄▆▆██▆███▆█▆▆▄▄▄█▄█▄█▄█▁▁▁▄▄▁▁▁▁▁▄▄▁▁▄ ▁
  52.5 ms         Histogram: frequency by time        60.7 ms <

 Memory estimate: 58.14 MiB, allocs estimate: 213.

bilayer quantum tensor from finite temperature, with D2=256,χ=256

spaceD = Rep[ℤ₂ × SU₂]((0, 0)=>10, (1, 0)=>4, (0, 1)=>12, (1, 1)=>16, (0, 2)=>12, (1, 2)=>8, (0, 3)=>3, (1, 3)=>4, (0, 4)=>1)
spaceχ = Rep[ℤ₂ × SU₂]((0, 0)=>10, (1, 0)=>4, (0, 1)=>12, (1, 1)=>15, (0, 2)=>13, (1, 2)=>9, (0, 3)=>2, (1, 3)=>4, (0, 4)=>1)
  • same story as D=121
  • enlarge_XXX_corner: @autoopt and explicit scheme are similar
  • except for northeast, where @autoopt = explicit_NE is noticeably faster
  • renormalize_XXX_edge: explicit scheme much faster
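
As a sanity check on the "D2=256, χ=256" label, the total dimension of each graded space can be recomputed from the sector multiplicities listed above. This is a minimal sketch in plain Julia (no TensorKit), assuming the second label of each sector is the SU(2) spin j with dimension 2j + 1 and that the ℤ₂ irreps are one-dimensional. The names `sectors_D`, `sectors_χ` and `total_dim` are made up for illustration.

```julia
# (ℤ₂ charge, SU(2) spin j) => multiplicity, copied from spaceD and spaceχ above.
sectors_D = [(0, 0) => 10, (1, 0) => 4, (0, 1) => 12, (1, 1) => 16, (0, 2) => 12,
             (1, 2) => 8, (0, 3) => 3, (1, 3) => 4, (0, 4) => 1]
sectors_χ = [(0, 0) => 10, (1, 0) => 4, (0, 1) => 12, (1, 1) => 15, (0, 2) => 13,
             (1, 2) => 9, (0, 3) => 2, (1, 3) => 4, (0, 4) => 1]

# Assumption: a sector with spin j contributes multiplicity * (2j + 1) to the total dimension.
total_dim(sectors) = sum(mult * (2j + 1) for ((parity, j), mult) in sectors)

println("dim(spaceD) = ", total_dim(sectors_D))
println("dim(spaceχ) = ", total_dim(sectors_χ))
```

Under that assumption both totals come out to 256, consistent with the label, even though the two spaces distribute the multiplicities differently.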
D2=256 specific code
spaceD = Rep[ℤ₂ × SU₂]((0, 0)=>10, (1, 0)=>4, (0, 1)=>12, (1, 1)=>16, (0, 2)=>12, (1, 2)=>8, (0, 3)=>3, (1, 3)=>4, (0, 4)=>1)
spaceχ = Rep[ℤ₂ × SU₂]((0, 0)=>10, (1, 0)=>4, (0, 1)=>12, (1, 1)=>15, (0, 2)=>13, (1, 2)=>9, (0, 3)=>2, (1, 3)=>4, (0, 4)=>1)

A_z2su2 = randn(spaceD ⊗ spaceD ← spaceD ⊗ spaceD)
B_z2su2 = randn(spaceD ⊗ spaceD ← spaceD ⊗ spaceD)
C_z2su2 = randn(spaceD ⊗ spaceD ← spaceD ⊗ spaceD)
D_z2su2 = randn(spaceD ⊗ spaceD ← spaceD ⊗ spaceD)
Z_z2su2 = InfinitePartitionFunction([A_z2su2 B_z2su2; C_z2su2 D_z2su2])

env0 = CTMRGEnv(Z_z2su2, spaceχ)
env_z2su2, = leading_boundary(env0, Z_z2su2; alg=:simultaneous, maxiter=2, projector_alg=:fullinfinite)
projectors_z2su2 = get_projectors(env_z2su2, Z_z2su2)


E_north_z2su2, E_east_z2su2, E_south_z2su2, E_west_z2su2 = env_z2su2.edges[:, 1, 1]
C_northwest_z2su2, C_northeast_z2su2, C_southeast_z2su2, C_southwest_z2su2 = env_z2su2.corners[:, 1, 1]
benchmark D2=256
julia> @benchmark enlarge_northwest_corner_autoopt(E_west_z2su2, C_northwest_z2su2, E_north_z2su2, A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.008 s …    1.713 s  ┊ GC (min … max):  0.00% … 41.47%
 Time  (median):     1.017 s               ┊ GC (median):     1.03%
 Time  (mean ± σ):   1.154 s ± 312.518 ms  ┊ GC (mean ± σ):  12.90% ± 18.18%

  █
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆ ▁
  1.01 s         Histogram: frequency by time         1.71 s <

 Memory estimate: 690.74 MiB, allocs estimate: 248.

julia> @benchmark enlarge_northwest_corner_explicit(E_west_z2su2, C_northwest_z2su2, E_north_z2su2, A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.011 s …    1.738 s  ┊ GC (min … max):  0.00% … 41.63%
 Time  (median):     1.022 s               ┊ GC (median):     1.04%
 Time  (mean ± σ):   1.163 s ± 321.145 ms  ┊ GC (mean ± σ):  12.98% ± 18.28%

  █
  █▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
  1.01 s         Histogram: frequency by time         1.74 s <

 Memory estimate: 690.74 MiB, allocs estimate: 254.

julia> @benchmark enlarge_northeast_corner_autoopt(E_north_z2su2, C_northeast_z2su2, E_east_z2su2, A_z2su2)
BenchmarkTools.Trial: 6 samples with 1 evaluation per sample.
 Range (min … max):  883.822 ms … 899.711 ms  ┊ GC (min … max): 0.00% … 1.70%
 Time  (median):     890.642 ms               ┊ GC (median):    0.70%
 Time  (mean ± σ):   891.422 ms ±   5.177 ms  ┊ GC (mean ± σ):  0.81% ± 0.56%

  █                       ███          █                      █
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁███▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  884 ms           Histogram: frequency by time          900 ms <

 Memory estimate: 552.81 MiB, allocs estimate: 243.

julia> @benchmark enlarge_northeast_corner_explicit(E_north_z2su2, C_northeast_z2su2, E_east_z2su2, A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.004 s …    1.760 s  ┊ GC (min … max):  0.00% … 42.29%
 Time  (median):     1.026 s               ┊ GC (median):     1.03%
 Time  (mean ± σ):   1.165 s ± 332.541 ms  ┊ GC (mean ± σ):  13.24% ± 18.62%

  ██                                                       ▁
  ██▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1 s            Histogram: frequency by time         1.76 s <

 Memory estimate: 690.74 MiB, allocs estimate: 254.

julia> @benchmark enlarge_northeast_corner_explicit_NE(E_north_z2su2, C_northeast_z2su2, E_east_z2su2, A_z2su2)
BenchmarkTools.Trial: 6 samples with 1 evaluation per sample.
 Range (min … max):  885.249 ms … 903.390 ms  ┊ GC (min … max): 0.00% … 1.71%
 Time  (median):     891.860 ms               ┊ GC (median):    0.71%
 Time  (mean ± σ):   893.366 ms ±   6.062 ms  ┊ GC (mean ± σ):  0.81% ± 0.57%

  ▁                   ▁ █              ▁                      ▁
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  885 ms           Histogram: frequency by time          903 ms <

 Memory estimate: 553.66 MiB, allocs estimate: 291.

julia> @benchmark enlarge_southeast_corner_autoopt(E_east_z2su2, C_southeast_z2su2, E_south_z2su2, A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  998.571 ms …    1.734 s  ┊ GC (min … max):  0.00% … 41.51%
 Time  (median):        1.015 s               ┊ GC (median):     1.02%
 Time  (mean ± σ):      1.159 s ± 322.092 ms  ┊ GC (mean ± σ):  13.05% ± 18.19%

  ▁█▁                                                         ▁
  ███▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  999 ms           Histogram: frequency by time          1.73 s <

 Memory estimate: 690.74 MiB, allocs estimate: 246.

julia> @benchmark enlarge_southeast_corner_explicit(E_east_z2su2, C_southeast_z2su2, E_south_z2su2, A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.006 s …    1.719 s  ┊ GC (min … max):  0.00% … 41.48%
 Time  (median):     1.017 s               ┊ GC (median):     0.98%
 Time  (mean ± σ):   1.155 s ± 315.221 ms  ┊ GC (mean ± σ):  12.81% ± 18.26%

  █
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆ ▁
  1.01 s         Histogram: frequency by time         1.72 s <

 Memory estimate: 690.74 MiB, allocs estimate: 250.

julia> @benchmark enlarge_southwest_corner_autoopt(E_south_z2su2, C_southwest_z2su2, E_west_z2su2, A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.002 s …    1.708 s  ┊ GC (min … max):  0.00% … 41.58%
 Time  (median):     1.008 s               ┊ GC (median):     0.99%
 Time  (mean ± σ):   1.148 s ± 313.217 ms  ┊ GC (mean ± σ):  12.84% ± 18.30%

  █
  █▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
  1 s            Histogram: frequency by time         1.71 s <

 Memory estimate: 690.74 MiB, allocs estimate: 238.

julia> @benchmark enlarge_southwest_corner_explicit(E_south_z2su2, C_southwest_z2su2, E_west_z2su2, A_z2su2)
BenchmarkTools.Trial: 6 samples with 1 evaluation per sample.
 Range (min … max):  889.816 ms … 909.898 ms  ┊ GC (min … max): 0.00% … 1.69%
 Time  (median):     900.272 ms               ┊ GC (median):    0.69%
 Time  (mean ± σ):   901.398 ms ±   7.574 ms  ┊ GC (mean ± σ):  0.79% ± 0.56%

  ▁                         ▁ ▁     ▁                         █
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁█▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  890 ms           Histogram: frequency by time          910 ms <

 Memory estimate: 552.81 MiB, allocs estimate: 237.
#
#
#
julia> @benchmark renormalize_north_edge_rotate(E_north_z2su2, projectors_z2su2[1][1, 1, 1], projectors_z2su2[2][1, 1, 1], A_z2su2)
BenchmarkTools.Trial: 4 samples with 1 evaluation per sample.
 Range (min … max):  1.146 s …    1.861 s  ┊ GC (min … max):  0.00% … 38.34%
 Time  (median):     1.158 s               ┊ GC (median):     0.93%
 Time  (mean ± σ):   1.331 s ± 353.742 ms  ┊ GC (mean ± σ):  13.81% ± 18.87%

  █▁                                                       ▁
  ██▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.15 s         Histogram: frequency by time         1.86 s <

 Memory estimate: 829.52 MiB, allocs estimate: 331.

julia> @benchmark renormalize_north_edge_explicit(E_north_z2su2, projectors_z2su2[1][1, 1, 1], projectors_z2su2[2][1, 1, 1], A_z2su2)
BenchmarkTools.Trial: 6 samples with 1 evaluation per sample.
 Range (min … max):  906.581 ms …   1.004 s  ┊ GC (min … max): 0.00% … 9.46%
 Time  (median):     915.537 ms              ┊ GC (median):    0.84%
 Time  (mean ± σ):   928.849 ms ± 37.239 ms  ┊ GC (mean ± σ):  2.25% ± 3.61%

  ▁  ▁ █ ▁                                                   ▁
  █▁▁█▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  907 ms          Histogram: frequency by time             1 s <

 Memory estimate: 555.38 MiB, allocs estimate: 343.

julia> @benchmark renormalize_north_edge_rotate_explicit(E_north_z2su2, projectors_z2su2[1][1, 1, 1], projectors_z2su2[2][1, 1, 1], A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.034 s …    1.744 s  ┊ GC (min … max):  0.00% … 40.93%
 Time  (median):     1.043 s               ┊ GC (median):     1.01%
 Time  (mean ± σ):   1.182 s ± 314.027 ms  ┊ GC (mean ± σ):  12.59% ± 17.99%

  █
  █▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
  1.03 s         Histogram: frequency by time         1.74 s <

 Memory estimate: 691.59 MiB, allocs estimate: 269.

julia> @benchmark renormalize_east_edge_rotate(E_east_z2su2, projectors_z2su2[1][2, 1, 1], projectors_z2su2[2][2, 1, 1], A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.162 s …    1.878 s  ┊ GC (min … max):  0.38% … 37.89%
 Time  (median):     1.172 s               ┊ GC (median):     0.83%
 Time  (mean ± σ):   1.310 s ± 317.677 ms  ┊ GC (mean ± σ):  11.29% ± 16.68%

  █
  █▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
  1.16 s         Histogram: frequency by time         1.88 s <

 Memory estimate: 829.52 MiB, allocs estimate: 323.

julia> @benchmark renormalize_east_edge_explicit(E_east_z2su2, projectors_z2su2[1][2, 1, 1], projectors_z2su2[2][2, 1, 1], A_z2su2)
BenchmarkTools.Trial: 6 samples with 1 evaluation per sample.
 Range (min … max):  904.080 ms … 995.931 ms  ┊ GC (min … max): 0.00% … 9.27%
 Time  (median):     911.708 ms               ┊ GC (median):    0.85%
 Time  (mean ± σ):   924.682 ms ±  35.068 ms  ┊ GC (mean ± σ):  2.21% ± 3.53%

  ▁   █▁▁                                                     ▁
  █▁▁▁███▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  904 ms           Histogram: frequency by time          996 ms <

 Memory estimate: 554.52 MiB, allocs estimate: 322.

julia> @benchmark renormalize_east_edge_rotate_explicit(E_east_z2su2, projectors_z2su2[1][2, 1, 1], projectors_z2su2[2][2, 1, 1], A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.050 s …   1.158 s  ┊ GC (min … max): 0.00% … 9.16%
 Time  (median):     1.063 s              ┊ GC (median):    0.94%
 Time  (mean ± σ):   1.079 s ± 44.247 ms  ┊ GC (mean ± σ):  2.46% ± 3.84%

  ▁    ▁█                                                 ▁
  █▁▁▁▁██▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.05 s         Histogram: frequency by time        1.16 s <

 Memory estimate: 691.59 MiB, allocs estimate: 267.

julia> @benchmark renormalize_south_edge_rotate(E_south_z2su2, projectors_z2su2[1][3, 1, 1], projectors_z2su2[2][3, 1, 1], A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.153 s …    1.859 s  ┊ GC (min … max):  0.00% … 38.13%
 Time  (median):     1.167 s               ┊ GC (median):     0.81%
 Time  (mean ± σ):   1.303 s ± 310.732 ms  ┊ GC (mean ± σ):  11.30% ± 16.80%

  ██                                                       ▁
  ██▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.15 s         Histogram: frequency by time         1.86 s <

 Memory estimate: 829.52 MiB, allocs estimate: 313.

julia> @benchmark renormalize_south_edge_explicit(E_south_z2su2, projectors_z2su2[1][3, 1, 1], projectors_z2su2[2][3, 1, 1], A_z2su2)
BenchmarkTools.Trial: 6 samples with 1 evaluation per sample.
 Range (min … max):  905.743 ms …   1.000 s  ┊ GC (min … max): 0.00% … 9.39%
 Time  (median):     913.875 ms              ┊ GC (median):    0.84%
 Time  (mean ± σ):   929.156 ms ± 35.679 ms  ┊ GC (mean ± σ):  2.22% ± 3.58%

  █  ███         █                                           █
  █▁▁███▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  906 ms          Histogram: frequency by time             1 s <

 Memory estimate: 555.38 MiB, allocs estimate: 339.

julia> @benchmark renormalize_south_edge_rotate_explicit(E_south_z2su2, projectors_z2su2[1][3, 1, 1], projectors_z2su2[2][3, 1, 1], A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.033 s …    1.782 s  ┊ GC (min … max):  0.00% … 40.58%
 Time  (median):     1.044 s               ┊ GC (median):     1.01%
 Time  (mean ± σ):   1.189 s ± 331.084 ms  ┊ GC (mean ± σ):  12.67% ± 17.83%

  █
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆ ▁
  1.03 s         Histogram: frequency by time         1.78 s <

 Memory estimate: 691.59 MiB, allocs estimate: 267.

julia> @benchmark renormalize_west_edge_autoopt(E_west_z2su2, projectors_z2su2[1][4, 1, 1], projectors_z2su2[2][4, 1, 1], A_z2su2)
BenchmarkTools.Trial: 5 samples with 1 evaluation per sample.
 Range (min … max):  1.027 s …  1.045 s  ┊ GC (min … max): 0.00% … 1.78%
 Time  (median):     1.039 s             ┊ GC (median):    0.98%
 Time  (mean ± σ):   1.036 s ± 7.299 ms  ┊ GC (mean ± σ):  0.91% ± 0.66%

  █        █                           █  █              █
  █▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.03 s        Histogram: frequency by time        1.04 s <

 Memory estimate: 691.59 MiB, allocs estimate: 314.

julia> @benchmark renormalize_west_edge_explicit(E_west_z2su2, projectors_z2su2[1][4, 1, 1], projectors_z2su2[2][4, 1, 1], A_z2su2)
BenchmarkTools.Trial: 6 samples with 1 evaluation per sample.
 Range (min … max):  904.715 ms … 997.710 ms  ┊ GC (min … max): 0.00% … 9.31%
 Time  (median):     912.063 ms               ┊ GC (median):    0.84%
 Time  (mean ± σ):   925.313 ms ±  35.627 ms  ┊ GC (mean ± σ):  2.21% ± 3.55%

  ▁  ▁█ ▁                                                     ▁
  █▁▁██▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  905 ms           Histogram: frequency by time          998 ms <

 Memory estimate: 553.66 MiB, allocs estimate: 256.

@lkdvos
Member

lkdvos commented Sep 15, 2025

Thanks a lot for the detailed benchmark! I do have to admit that the results are somewhat surprising to me. Am I reading this wrong, or does this change also introduce actual regressions? I think it is indeed expected that the speedup isn't uniform over all the directions, but I would have guessed that we should be able to get an improvement overall, and right now that seems to hold only sometimes. I also realize that the cases vary quite wildly, since D scales differently depending on whether it is a squashed quantum case or a partition function, but I expected both to lead to the same outcome.

In particular, what surprised me is some of the choices of putting a D leg first to keep a permutation contiguous, rather than a chi leg. While I agree that D might become as large as chi in your typical use cases, I think in all regimes we still expect D < chi, so that might be slightly suboptimal? Along the same lines, permuting A should be less costly than permuting two edges that are connected, so the focus might not be entirely fair. (Correct me if I'm wrong though; I would really have to measure and profile to see which steps actually take the time.)
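
To put rough numbers on the "permuting A vs. permuting connected edges" argument, here is a minimal sketch with plain dense arrays. It assumes a partition-function layout where A has legs (D, D, D, D), an edge has legs (χ, D, χ), and two edges joined through a corner form an intermediate with legs (χ, D, D, χ). These layouts, the sizes D=16, χ=100, and the chosen permutation are illustrative assumptions, not taken from the PR code, and symmetric tensors are block-sparse, so the dense counts are only upper bounds.

```julia
# Assumed dense leg layouts (illustrative, not taken from the PR):
#   A: (D, D, D, D), edge: (χ, D, χ), two edges joined through a corner: (χ, D, D, χ)
D, χ = 16, 100

n_A = D^4                 # elements moved when permuting A
n_edge = χ^2 * D          # elements in a single edge
n_edge_pair = χ^2 * D^2   # elements in the edge-corner-edge intermediate

println("A:         ", n_A, " elements")
println("edge:      ", n_edge, " elements")
println("edge pair: ", n_edge_pair, " elements (", n_edge_pair / n_A, "x A)")

# Rough wall-clock check with plain arrays; the first calls are discarded as warm-up.
A = randn(D, D, D, D)
T = randn(χ, D, D, χ)
permutedims(A, (2, 1, 3, 4)); permutedims(T, (2, 1, 3, 4))
t_A = @elapsed permutedims(A, (2, 1, 3, 4))
t_T = @elapsed permutedims(T, (2, 1, 3, 4))
println("permute A: ", t_A, " s, permute edge pair: ", t_T, " s")
```

The element counts only bound the memory traffic of a permutation. The actual cost also depends on which legs stay contiguous, which is why this would still need to be profiled on the real TensorMaps.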

@ogauthe
Contributor Author

ogauthe commented Sep 15, 2025

These results are a bit confusing. Here is an attempt to make things clearer (smaller is better).

| | Ising | D=11 | D=16 |
|---|---|---|---|
| enlarge NW | explicit < autoopt | = | = |
| enlarge NE | explicit < autoopt < explicitNE | explicitNE = autoopt < explicit | explicitNE = autoopt < explicit |
| enlarge SE | explicit < autoopt | = | = |
| enlarge SW | explicit < autoopt | explicit < autoopt | explicit < autoopt |

Here are the edge renormalizations. Since explicit is always better than @autoopt for the west direction, rotate_explicit is always better than rotate.

| | Ising | D=11 | D=16 |
|---|---|---|---|
| renormalize N | rotate < explicit | explicit < rotate | explicit < rotate |
| renormalize E | rotate < explicit | explicit < rotate | explicit < rotate |
| renormalize S | rotate < explicit | explicit < rotate | explicit < rotate |
| renormalize W | explicit < autoopt | explicit < autoopt | explicit < autoopt |
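
The D=16 column can be cross-checked against the median times reported in the D2=256 benchmarks above. The sketch below is plain Julia with medians copied by hand in milliseconds. The helper `verdict` and the 5% tie tolerance are hypothetical choices for illustration, not something the benchmarks themselves define.

```julia
# Median times in ms, copied from the D2=256 benchmark output above.
medians = Dict(
    "enlarge NW"    => (autoopt = 1017.0,  explicit = 1022.0),
    "enlarge NE"    => (autoopt = 890.642, explicit = 1026.0),
    "enlarge SE"    => (autoopt = 1015.0,  explicit = 1017.0),
    "enlarge SW"    => (autoopt = 1008.0,  explicit = 900.272),
    "renormalize W" => (autoopt = 1039.0,  explicit = 912.063),
)

# Hypothetical helper: relative differences below `tol` count as a tie ("=").
function verdict(t; tol = 0.05)
    r = t.explicit / t.autoopt
    r < 1 - tol && return "explicit < autoopt"
    r > 1 + tol && return "autoopt < explicit"
    return "="
end

for name in sort(collect(keys(medians)))
    t = medians[name]
    ratio = round(t.explicit / t.autoopt; digits = 2)
    println(rpad(name, 14), verdict(t), "  (explicit/autoopt = ", ratio, ")")
end
```

With this tolerance the verdicts agree with the D=16 entries: NW and SE are ties, NE favors @autoopt, and SW and renormalize W favor the explicit scheme. With only 5 or 6 samples per benchmark and GC outliers in several runs, the medians seem a safer basis than the means.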

Hence the questions are

  • what to do about corner north east
  • what to do for renormalize N-E-S

But I think these benchmarks show that this PR improves corners NW-SE-SW and renormalize west.

@ogauthe
Contributor Author

ogauthe commented Sep 16, 2025

More benchmarks: I considered tensors as found in https://arxiv.org/abs/2505.05889 with D=16 and χ=100. This looks more relevant than the square-lattice Ising model with D=2, which was too simple.

Kagome Ising D=16 code
A_kagome = randn(ℂ^16 ⊗ ℂ^16 ← ℂ^16 ⊗ ℂ^16)

Z_kagome = InfinitePartitionFunction(A_kagome)
χ_kagome = ℂ^100
env0 = CTMRGEnv(Z_kagome, χ_kagome)
env_kagome, = leading_boundary(env0, Z_kagome; alg=:simultaneous, maxiter=20, projector_alg=:fullinfinite)
projectors_kagome = get_projectors(env_kagome, Z_kagome)

E_north_kagome, E_east_kagome, E_south_kagome, E_west_kagome = env_kagome.edges[:, 1, 1]
C_northwest_kagome, C_northeast_kagome, C_southeast_kagome, C_southwest_kagome = env_kagome.corners[:, 1, 1]
enlarge corner benchmark
julia> @benchmark enlarge_northwest_corner_autoopt(E_west_kagome, C_northwest_kagome, E_north_kagome, A_kagome)
BenchmarkTools.Trial: 107 samples with 1 evaluation per sample.
 Range (min … max):  42.886 ms … 53.565 ms  ┊ GC (min … max): 0.00% … 15.96%
 Time  (median):     47.162 ms              ┊ GC (median):    9.43%
 Time  (mean ± σ):   46.745 ms ±  2.163 ms  ┊ GC (mean ± σ):  6.87% ±  4.48%

                         █    ▆▂                               
  ▅█▆▆▆▁▅▃▃▃▁▁▁▁▃▃▁▁▁▁▃▇▆█▆▆▇▅██▄▅▆▆▆▃▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃ ▃
  42.9 ms         Histogram: frequency by time        53.3 ms <

 Memory estimate: 79.85 MiB, allocs estimate: 62.

julia> @benchmark enlarge_northwest_corner_explicit(E_west_kagome, C_northwest_kagome, E_north_kagome, A_kagome)
BenchmarkTools.Trial: 134 samples with 1 evaluation per sample.
 Range (min … max):  33.823 ms … 42.002 ms  ┊ GC (min … max):  0.00% … 11.31%
 Time  (median):     38.438 ms              ┊ GC (median):    11.78%
 Time  (mean ± σ):   37.491 ms ±  1.946 ms  ┊ GC (mean ± σ):   8.95% ±  5.19%

  ▁▄▂▂                                       ▆██▄              
  ████▅▁▅▅▅▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▁▅▅▁▁▁▁▁▁▁▁▁▁▅▁▁▁▁▁█████▆▆█▅▅▁▁▁▅▁▅ ▅
  33.8 ms      Histogram: log(frequency) by time      39.9 ms <

 Memory estimate: 79.85 MiB, allocs estimate: 62.

julia> @benchmark enlarge_northeast_corner_autoopt(E_north_kagome, C_northeast_kagome, E_east_kagome, A_kagome)
BenchmarkTools.Trial: 108 samples with 1 evaluation per sample.
 Range (min … max):  42.761 ms … 60.854 ms  ┊ GC (min … max): 0.00% … 25.63%
 Time  (median):     47.327 ms              ┊ GC (median):    9.67%
 Time  (mean ± σ):   46.664 ms ±  2.455 ms  ┊ GC (mean ± σ):  7.18% ±  4.79%

      ▃                           ▃ ▇▇  ▃  █                   
  ▆▃▆▅██▇▅▁▁▁▁▁▅▁▁▁▃▁▁▃▁▁▁▁▁▁▁▁▁▁▁█▆██▇▆█▇██▇▁▆▃▁▁▁▅▁▁▁▁▁▁▁▃▃ ▃
  42.8 ms         Histogram: frequency by time        50.4 ms <

 Memory estimate: 79.35 MiB, allocs estimate: 56.

julia> @benchmark enlarge_northeast_corner_explicit(E_north_kagome, C_northeast_kagome, E_east_kagome, A_kagome)
BenchmarkTools.Trial: 133 samples with 1 evaluation per sample.
 Range (min … max):  34.070 ms … 40.255 ms  ┊ GC (min … max):  0.00% … 11.48%
 Time  (median):     38.683 ms              ┊ GC (median):    11.75%
 Time  (mean ± σ):   37.692 ms ±  1.981 ms  ┊ GC (mean ± σ):   8.93% ±  5.20%

                                              ▃█▅              
  ▅▅█▄▃▃▃▁▃▃▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▃███▆▆▄▃▃▃▁▃▃▃▃▃ ▃
  34.1 ms         Histogram: frequency by time          40 ms <

 Memory estimate: 79.85 MiB, allocs estimate: 62.

julia> @benchmark enlarge_northeast_corner_explicit_NE(E_north_kagome, C_northeast_kagome, E_east_kagome, A_kagome)
BenchmarkTools.Trial: 109 samples with 1 evaluation per sample.
 Range (min … max):  42.475 ms … 48.382 ms  ┊ GC (min … max): 0.00% … 9.79%
 Time  (median):     46.905 ms              ┊ GC (median):    9.64%
 Time  (mean ± σ):   45.980 ms ±  1.802 ms  ┊ GC (mean ± σ):  7.00% ± 4.43%

                                                   █▆▃         
  ▄▄▆▃▄▅▄▄▁▄▃▁▃▁▁▁▁▃▁▁▁▁▃▁▁▃▁▁▁▁▁▁▁▃▁▁▃▁▁▁▁▁▁▁▁▁▃▄▇███▅▇▄▄▄▄▄ ▃
  42.5 ms         Histogram: frequency by time        47.7 ms <

 Memory estimate: 80.57 MiB, allocs estimate: 67.

julia> @benchmark enlarge_southeast_corner_autoopt(E_east_kagome, C_southeast_kagome, E_south_kagome, A_kagome)
BenchmarkTools.Trial: 107 samples with 1 evaluation per sample.
 Range (min … max):  42.998 ms … 54.184 ms  ┊ GC (min … max): 0.00% … 16.20%
 Time  (median):     47.820 ms              ┊ GC (median):    9.46%
 Time  (mean ± σ):   46.842 ms ±  2.101 ms  ┊ GC (mean ± σ):  7.15% ±  4.31%

                                          ▁▂█▁                 
  ▃▅▃▇▅▅▃▃▁▃▁▁▁▁▁▁▁▁▁▃▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▅▅████▆▅▃▃▁▁▁▁▁▁▁▁▁▁▃ ▃
  43 ms           Histogram: frequency by time        49.9 ms <

 Memory estimate: 79.85 MiB, allocs estimate: 62.

julia> @benchmark enlarge_southeast_corner_explicit(E_east_kagome, C_southeast_kagome, E_south_kagome, A_kagome)
BenchmarkTools.Trial: 134 samples with 1 evaluation per sample.
 Range (min … max):  33.557 ms … 41.648 ms  ┊ GC (min … max):  0.00% … 11.41%
 Time  (median):     38.479 ms              ┊ GC (median):    12.37%
 Time  (mean ± σ):   37.537 ms ±  2.099 ms  ┊ GC (mean ± σ):   9.47% ±  5.50%

                                       ▂█▇                     
  ▅▅▅▆▅▁▃▃▁▁▁▃▁▃▁▁▁▁▁▁▁▃▁▁▁▁▁▁▃▁▁▁▁▁▁▃▅███▆▆▃▃▁▃▃▃▁▄▃▃▁▁▁▁▁▁▃ ▃
  33.6 ms         Histogram: frequency by time        41.1 ms <

 Memory estimate: 79.85 MiB, allocs estimate: 62.

julia> @benchmark enlarge_southwest_corner_autoopt(E_south_kagome, C_southwest_kagome, E_west_kagome, A_kagome)
BenchmarkTools.Trial: 108 samples with 1 evaluation per sample.
 Range (min … max):  42.717 ms … 49.468 ms  ┊ GC (min … max): 0.00% … 9.29%
 Time  (median):     47.813 ms              ┊ GC (median):    9.46%
 Time  (mean ± σ):   46.698 ms ±  2.134 ms  ┊ GC (mean ± σ):  6.85% ± 4.34%

                                                █▃             
  ▃▁▄▄▅▄▇▇▄▃▃▃▁▁▁▁▃▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃████▇▆▅▄▃▅▄▁▃▃▃ ▃
  42.7 ms         Histogram: frequency by time        49.2 ms <

 Memory estimate: 79.85 MiB, allocs estimate: 62.

julia> @benchmark enlarge_southwest_corner_explicit(E_south_kagome, C_southwest_kagome, E_west_kagome, A_kagome)
BenchmarkTools.Trial: 135 samples with 1 evaluation per sample.
 Range (min … max):  33.540 ms … 39.497 ms  ┊ GC (min … max):  0.00% … 11.53%
 Time  (median):     38.313 ms              ┊ GC (median):    11.75%
 Time  (mean ± σ):   37.282 ms ±  2.019 ms  ┊ GC (mean ± σ):   8.89% ±  5.22%

                                                 ▁█▃▂          
  ▆▅█▄▃▃▁▁▃▁▁▁▁▃▃▁▃▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▃▁▁▁▁▁▁████▆▇▃▄▄▃▃▃ ▃
  33.5 ms         Histogram: frequency by time        39.3 ms <

 Memory estimate: 79.35 MiB, allocs estimate: 56.
renormalize edge benchmark
julia> @benchmark renormalize_north_edge_autoopt(E_north_kagome, projectors_kagome[1][1, 1, 1], projectors_kagome[2][1, 1, 1], A_kagome)
BenchmarkTools.Trial: 89 samples with 1 evaluation per sample.
 Range (min … max):  52.544 ms … 63.477 ms  ┊ GC (min … max): 0.00% … 13.68%
 Time  (median):     57.143 ms              ┊ GC (median):    7.78%
 Time  (mean ± σ):   56.386 ms ±  2.180 ms  ┊ GC (mean ± σ):  5.69% ±  3.65%

    ▂                                    ▆▂▅    ▃▂█            
  ▅▅█▅█▄▅▅▁▁▅▁▁▄▄▁▁▁▁▁▁▁▁▁▄▁▄▁▁▁▁▁▁▁▁▁▁▁▁████▅▇▇███▅▅▄▄▁▁▁▄▅▄ ▁
  52.5 ms         Histogram: frequency by time        59.1 ms <

 Memory estimate: 79.35 MiB, allocs estimate: 58.

julia> @benchmark renormalize_north_edge_rotate(E_north_kagome, projectors_kagome[1][1, 1, 1], projectors_kagome[2][1, 1, 1], A_kagome)
BenchmarkTools.Trial: 88 samples with 1 evaluation per sample.
 Range (min … max):  52.712 ms … 61.252 ms  ┊ GC (min … max): 0.00% … 7.17%
 Time  (median):     58.150 ms              ┊ GC (median):    7.57%
 Time  (mean ± σ):   56.877 ms ±  2.415 ms  ┊ GC (mean ± σ):  5.28% ± 3.64%

                                            ▆▂█▃               
  ▄▃▄▅▆▆▅▃▄▃▄▃▁▃▁▁▁▁▁▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▄▃▆████▃▄▁▃▄▁▁▃▅▁▁▁▃ ▁
  52.7 ms         Histogram: frequency by time        60.3 ms <

 Memory estimate: 80.35 MiB, allocs estimate: 70.

julia> @benchmark renormalize_north_edge_explicit(E_north_kagome, projectors_kagome[1][1, 1, 1], projectors_kagome[2][1, 1, 1], A_kagome)
BenchmarkTools.Trial: 95 samples with 1 evaluation per sample.
 Range (min … max):  49.137 ms … 55.720 ms  ┊ GC (min … max): 0.00% … 8.08%
 Time  (median):     54.098 ms              ┊ GC (median):    8.23%
 Time  (mean ± σ):   52.891 ms ±  2.164 ms  ┊ GC (mean ± σ):  5.88% ± 3.82%

                                                ▃█▇▃▆          
  ▄▇▇▆█▃▇▃▁▁▄▁▄▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄█████▆▃▄▃▄▃▃▃ ▁
  49.1 ms         Histogram: frequency by time        55.3 ms <

 Memory estimate: 81.79 MiB, allocs estimate: 70.

julia> @benchmark renormalize_north_edge_rotate_explicit(E_north_kagome, projectors_kagome[1][1, 1, 1], projectors_kagome[2][1, 1, 1], A_kagome)
BenchmarkTools.Trial: 112 samples with 1 evaluation per sample.
 Range (min … max):  40.868 ms … 47.124 ms  ┊ GC (min … max): 0.00% … 9.73%
 Time  (median):     45.718 ms              ┊ GC (median):    9.89%
 Time  (mean ± σ):   44.648 ms ±  2.002 ms  ┊ GC (mean ± σ):  7.26% ± 4.49%

                                                ▂█▅▃           
  ▃▆▆▃▄▃▄▁▃▃▁▁▃▃▁▃▁▁▁▃▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▃████▃▁▅▁▁▃▃▃▃ ▃
  40.9 ms         Histogram: frequency by time        46.9 ms <

 Memory estimate: 81.07 MiB, allocs estimate: 70.

julia> 

julia> 

julia> @benchmark renormalize_east_edge_autoopt(E_east_kagome, projectors_kagome[1][2, 1, 1], projectors_kagome[2][2, 1, 1], A_kagome)
BenchmarkTools.Trial: 82 samples with 1 evaluation per sample.
 Range (min … max):  57.533 ms … 66.594 ms  ┊ GC (min … max): 0.00% … 12.92%
 Time  (median):     62.341 ms              ┊ GC (median):    7.14%
 Time  (mean ± σ):   61.258 ms ±  2.121 ms  ┊ GC (mean ± σ):  4.99% ±  3.49%

        ▂                                   ▂▅▆█▂ ▃▃           
  ▇▅▄▇▄▄█▄▄▇▁▄▁▁▄▁▁▄▁▄▁▁▁▄▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇█████▄██▄▁▄▄▁▁▁▁▄ ▁
  57.5 ms         Histogram: frequency by time          64 ms <

 Memory estimate: 79.85 MiB, allocs estimate: 64.

julia> @benchmark renormalize_east_edge_rotate(E_east_kagome, projectors_kagome[1][2, 1, 1], projectors_kagome[2][2, 1, 1], A_kagome)
BenchmarkTools.Trial: 88 samples with 1 evaluation per sample.
 Range (min … max):  52.969 ms … 61.811 ms  ┊ GC (min … max): 0.00% … 12.36%
 Time  (median):     58.318 ms              ┊ GC (median):    7.52%
 Time  (mean ± σ):   57.217 ms ±  2.274 ms  ┊ GC (mean ± σ):  5.37% ±  3.65%

                                       ▃█▂                     
  ▄▃▁▆▄▃▃▄▅▄▄▇▄▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▃▁▆███▆▇▁▅▄▅▁▃▅▁▁▁▁▁▁▁▁▁▃ ▁
  53 ms           Histogram: frequency by time        61.2 ms <

 Memory estimate: 80.35 MiB, allocs estimate: 70.

julia> @benchmark renormalize_east_edge_explicit(E_east_kagome, projectors_kagome[1][2, 1, 1], projectors_kagome[2][2, 1, 1], A_kagome)
BenchmarkTools.Trial: 97 samples with 1 evaluation per sample.
 Range (min … max):  46.477 ms … 55.295 ms  ┊ GC (min … max): 0.00% … 8.10%
 Time  (median):     53.226 ms              ┊ GC (median):    8.40%
 Time  (mean ± σ):   51.701 ms ±  2.875 ms  ┊ GC (mean ± σ):  6.10% ± 3.88%

                                                  ▂▃▂█▃        
  ▄▇▅▅█▄▇█▄▄▄▄▄▁▄▁▁▁▁▇▁▁▁▁▁▁▁▁▁▁▁▁▁▄▅▄▅▄▁▅▅▅▁▁▁▅███████▄▄█▇▄▇ ▁
  46.5 ms         Histogram: frequency by time        54.9 ms <

 Memory estimate: 81.79 MiB, allocs estimate: 75.

julia> @benchmark renormalize_east_edge_rotate_explicit(E_east_kagome, projectors_kagome[1][2, 1, 1], projectors_kagome[2][2, 1, 1], A_kagome)
BenchmarkTools.Trial: 109 samples with 1 evaluation per sample.
 Range (min … max):  41.175 ms … 52.484 ms  ┊ GC (min … max):  0.00% … 18.13%
 Time  (median):     46.925 ms              ┊ GC (median):    10.30%
 Time  (mean ± σ):   45.899 ms ±  2.562 ms  ┊ GC (mean ± σ):   7.56% ±  4.96%

     ▂                            █▄ ▂▂▂▂                      
  ▅▅▃██▃▅▅▇▃▆▁▃▃▁▃▁▃▁▁▁▁▁▁▁▁▃▅▅█▆▅██▆████▇▇▃▅▃▁▁▁▁▁▁▃▁▁▁▁▁▁▁▃ ▃
  41.2 ms         Histogram: frequency by time        51.5 ms <

 Memory estimate: 81.07 MiB, allocs estimate: 70.

julia> 

julia> @benchmark renormalize_south_edge_autoopt(E_south_kagome, projectors_kagome[1][3, 1, 1], projectors_kagome[2][3, 1, 1], A_kagome)
BenchmarkTools.Trial: 82 samples with 1 evaluation per sample.
 Range (min … max):  57.679 ms … 68.490 ms  ┊ GC (min … max): 0.00% … 12.67%
 Time  (median):     62.410 ms              ┊ GC (median):    7.17%
 Time  (mean ± σ):   61.463 ms ±  2.229 ms  ┊ GC (mean ± σ):  5.08% ±  3.46%

    ▂  ▃                             ▆█▃█▆                     
  ▅▄█▄██▇▁▁▁▁▁▁▄▁▄▁▁▁▁▄▁▁▁▁▄▁▁▁▁▁▁▁▄▅█████▇▇▅▁▁▁▁▁▁▁▄▄▁▁▄▁▁▁▄ ▁
  57.7 ms         Histogram: frequency by time        65.3 ms <

 Memory estimate: 79.85 MiB, allocs estimate: 64.

julia> @benchmark renormalize_south_edge_rotate(E_south_kagome, projectors_kagome[1][3, 1, 1], projectors_kagome[2][3, 1, 1], A_kagome)
BenchmarkTools.Trial: 86 samples with 1 evaluation per sample.
 Range (min … max):  51.854 ms … 78.354 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     58.419 ms              ┊ GC (median):    8.23%
 Time  (mean ± σ):   58.266 ms ±  3.277 ms  ┊ GC (mean ± σ):  6.08% ± 4.39%

        ▁              ▇█▃▁  ▁                                 
  ▅▁▅▁▅▇█▇▇▁▁▁▁▅▁▅▁▇▇▇▅████▅▅█▅▇▅▅▁▅▁▁▁▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅ ▁
  51.9 ms      Histogram: log(frequency) by time      69.3 ms <

 Memory estimate: 80.35 MiB, allocs estimate: 70.

julia> @benchmark renormalize_south_edge_explicit(E_south_kagome, projectors_kagome[1][3, 1, 1], projectors_kagome[2][3, 1, 1], A_kagome)
BenchmarkTools.Trial: 107 samples with 1 evaluation per sample.
 Range (min … max):  42.487 ms … 58.384 ms  ┊ GC (min … max): 0.00% … 26.47%
 Time  (median):     47.465 ms              ┊ GC (median):    9.49%
 Time  (mean ± σ):   46.956 ms ±  2.427 ms  ┊ GC (mean ± σ):  7.34% ±  4.82%

  ▁▁▂▁                  ▅█▅▂                                   
  ████▅▁▅▁▁▁▅▅▁▅▁▇▁▅▁▅▅▅████▇▅█▇█▅▇▅▁▁▅▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅ ▅
  42.5 ms      Histogram: log(frequency) by time      55.1 ms <

 Memory estimate: 84.23 MiB, allocs estimate: 87.

julia> @benchmark renormalize_south_edge_rotate_explicit(E_south_kagome, projectors_kagome[1][3, 1, 1], projectors_kagome[2][3, 1, 1], A_kagome)
BenchmarkTools.Trial: 112 samples with 1 evaluation per sample.
 Range (min … max):  40.898 ms … 47.712 ms  ┊ GC (min … max): 0.00% … 9.54%
 Time  (median):     45.744 ms              ┊ GC (median):    9.85%
 Time  (mean ± σ):   44.768 ms ±  2.011 ms  ┊ GC (mean ± σ):  7.22% ± 4.47%

                                           ▁▆▇█                
  ▅▅▅▆▇▅▃▄▁▁▃▁▃▁▁▁▁▁▃▁▁▁▁▁▁▁▁▄▃▃▁▁▁▁▁▁▁▁▁▁▆████▆▆▄▆▃▃▃▃▁▁▄▁▁▃ ▃
  40.9 ms         Histogram: frequency by time        47.6 ms <

 Memory estimate: 81.07 MiB, allocs estimate: 70.

julia> 

julia> @benchmark renormalize_west_edge_autoopt(E_west_kagome, projectors_kagome[1][4, 1, 1], projectors_kagome[2][4, 1, 1], A_kagome)
BenchmarkTools.Trial: 88 samples with 1 evaluation per sample.
 Range (min … max):  52.704 ms … 61.470 ms  ┊ GC (min … max): 0.00% … 7.15%
 Time  (median):     58.146 ms              ┊ GC (median):    7.58%
 Time  (mean ± σ):   56.923 ms ±  2.439 ms  ┊ GC (mean ± σ):  5.32% ± 3.68%

          ▂                                ▅▇█▄                
  ▃▃▆▅▆▆█▅█▃▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▃▁▃▁▆████▆█▅▃▆▃▁▃▁▁▃▃▁▅ ▁
  52.7 ms         Histogram: frequency by time        60.3 ms <

 Memory estimate: 79.85 MiB, allocs estimate: 64.

julia> @benchmark renormalize_west_edge_explicit(E_west_kagome, projectors_kagome[1][4, 1, 1], projectors_kagome[2][4, 1, 1], A_kagome)
BenchmarkTools.Trial: 113 samples with 1 evaluation per sample.
 Range (min … max):  40.600 ms … 47.968 ms  ┊ GC (min … max): 0.00% … 9.56%
 Time  (median):     45.330 ms              ┊ GC (median):    9.97%
 Time  (mean ± σ):   44.291 ms ±  2.096 ms  ┊ GC (mean ± σ):  7.35% ± 4.51%

                                        ▁▆▆█▄                  
  ▅█▇▅▇▆▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▄█████▆▆▃▁▃▃▁▃▁▁▁▁▃▁▁▃ ▃
  40.6 ms         Histogram: frequency by time        47.6 ms <

 Memory estimate: 80.57 MiB, allocs estimate: 64.

Updated summary for enlarge corner (smaller is better).

| | Ising D=2, χ=100 | Kagome D=16, χ=100 | D2=121, χ=121 | D2=256, χ=256 |
|---|---|---|---|---|
| enlarge NW | explicit < autoopt | explicit < autoopt | = | = |
| enlarge NE | explicit < autoopt < explicitNE | explicit < explicitNE ≲ autoopt | explicitNE = autoopt < explicit | explicitNE = autoopt < explicit |
| enlarge SE | explicit < autoopt | explicit < autoopt | explicit < autoopt | = |
| enlarge SW | explicit < autoopt | explicit < autoopt | explicit < autoopt | explicit < autoopt |

Updated summary for renormalize edge (smaller is better). Since explicit is always better than @autoopt for direction west, rotate_explicit is always better than rotate.

| | Ising D=2, χ=100 | Kagome D=16, χ=100 | D2=121 | D2=256 |
|---|---|---|---|---|
| renormalize N | rotate < explicit | rotate < explicit < autoopt | explicit < rotate | explicit < rotate |
| renormalize E | rotate < explicit | rotate < explicit < autoopt | explicit < rotate | explicit < rotate |
| renormalize S | rotate < explicit | rotate < explicit < autoopt | explicit < rotate | explicit < rotate |
| renormalize W | explicit < autoopt | explicit < autoopt | explicit < autoopt | explicit < autoopt |

Going from D=2 to D=16, for the north-east corner autoopt and explicitNE are now similar, with explicitNE already slightly faster. For edge renormalization, the order did not change, but the gap between rotate and explicit is closing; the two are now very close. With this new data, I now favor having this PR for renormalize_edge in all cases. The enlarge corner NE case is more complicated.
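As a generic illustration of why the pairwise contraction order affects timings like the ones above, here is a minimal sketch using plain Julia arrays. It does not use PEPSKit or `@autoopt`; the size `n` and the names `A`, `B`, `v` are hypothetical and unrelated to the benchmarked tensors.

```julia
using LinearAlgebra

# Hypothetical sizes, unrelated to the PEPSKit tensors benchmarked above.
n = 200
A = rand(n, n)
B = rand(n, n)
v = rand(n)

# Order 1: contract the two matrices first, which costs O(n^3) operations.
left_first = (A * B) * v

# Order 2: contract the vector first, which costs only O(n^2) operations.
right_first = A * (B * v)

# Both orders give the same result up to floating-point rounding.
@assert left_first ≈ right_first
```

Both orders compute the same vector, so any timing difference comes only from the chosen order and the intermediate objects it creates. Which order wins for a real network depends on the actual dimensions, which is presumably why the ranking in the tables changes with D and χ.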

@ogauthe ogauthe force-pushed the autopt_PF branch 2 times, most recently from 5192f5a to c3d7e4a Compare September 23, 2025 14:25
@ogauthe
Contributor Author

ogauthe commented Sep 23, 2025

To summarize:

  • edge renormalization: the 2 competing implementations have similar performance for vD = ℂ^16 (worst case: 40 ms vs ~49 ms). For D2=121, the explicit scheme is already much better.
  • corner renormalization: corners NW, SW and SE are improved by this PR.
  • corner NE is a special case with no one-size-fits-all solution. This PR currently uses a custom implementation optimized for the large-D case. It is not always the fastest but is never too bad. It may be improved using AB = C = (C')' = (B'A')'.
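The identity mentioned in the last point can be checked on plain matrices. This is only a sketch with hypothetical dense arrays `A` and `B`, not the tensor map contraction used in the PR; it shows that computing the product through adjoints gives the same `C`.

```julia
using LinearAlgebra

# Hypothetical rectangular matrices standing in for the two factors.
A = rand(ComplexF64, 4, 3)
B = rand(ComplexF64, 3, 5)

# Direct product C = A * B.
C_direct = A * B

# Same product obtained as the adjoint of B' * A'.
C_via_adjoint = (B' * A')'

@assert C_direct ≈ C_via_adjoint
```

Whether the adjoint form is actually faster for the NE corner would need to be benchmarked; the sketch only confirms that the two expressions agree.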

@lkdvos
Member

lkdvos commented Sep 24, 2025

@leburgel it seems like something did actually go wrong with the last changes, as now the example tests seem to time out. Any idea what could be the cause?

@leburgel
Member

> @leburgel it seems like something did actually go wrong with the last changes, as now the example tests seem to time out. Any idea what could be the cause?

I actually don't think things went wrong with #261, but rather that it fixed the oversight from #246 that caused the gradient computation with the fallback linear solver to terminate long before it actually converged. This means it was just moving on with very bad gradients, which I think didn't cause any problems because this usually happens at the start of an optimization.

I had a look, and in the test that timed out it got completely stuck in the first LBFGS iteration for the variational optimization of the Heisenberg model starting from the simple update result. How stuck it gets seems to vary a lot, but in the one that timed out it was particularly bad. I think if we come up with a generic procedure to 'kick' the simple update starting guess a bit before feeding it into the variational optimization this should be solved.

I don't think there's a way out algorithmically: both the eigsolve and linsolve methods for computing the fixed point gradient seem to have a lot of trouble converging.

@lkdvos
Member

lkdvos commented Sep 25, 2025

Ok, in that case I'll merge this since that is unrelated.

@lkdvos lkdvos merged commit 73a5056 into QuantumKitHub:master Sep 25, 2025
49 of 51 checks passed
@ogauthe ogauthe deleted the autopt_PF branch September 25, 2025 23:24

4 participants