@bjarthur bjarthur commented Apr 25, 2023

fixes #1.

not merged yet because benchmarks are slower by ~10%:

[Screenshot: benchmark comparison, 2023-04-25]

the huge regression in batched_dot can be partially fixed by specifying CUDABackend(prefer_blocks=true), but that is not vendor agnostic. see https://discourse.julialang.org/t/kernelabstractions-get-backend-keyword-arguments/97895
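For context, a minimal sketch of the trade-off (assuming KernelAbstractions.jl's `get_backend` and CUDA.jl's `CUDABackend` constructor; the array names are illustrative, not from this PR):

```julia
using KernelAbstractions
using CUDA  # only needed for the CUDA-specific variant

# vendor-agnostic: derive the backend from the array itself
x = CUDA.functional() ? CuArray(rand(Float32, 1024)) : rand(Float32, 1024)
backend = get_backend(x)  # CUDABackend() on NVIDIA hardware, CPU() otherwise

# CUDA-only alternative: prefer_blocks=true recovers much of the
# batched_dot performance, but ties the code to one vendor
# backend = CUDABackend(prefer_blocks=true)
```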

@bjarthur

second pass at KernelAbstractions (KA):

[Screenshot: benchmark comparison, 2024-06-11]

hard-coding the number of threads at 32 in the first (and only) dimension to maximize block utilization mostly alleviates the regression in batched_dot.

see JuliaGPU/KernelAbstractions.jl#479
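A sketch of what fixing the workgroup size looks like with the KernelAbstractions launch API (`bdot_kernel!` is a hypothetical stand-in, not the actual kernel in this PR):

```julia
using KernelAbstractions

# hypothetical elementwise kernel used only to illustrate the launch syntax
@kernel function bdot_kernel!(out, a, b)
    i = @index(Global)
    @inbounds out[i] = a[i] * b[i]
end

x = rand(Float32, 1024)
y = rand(Float32, 1024)
out = similar(x)

backend = get_backend(x)
# instantiate with the workgroup size hard-coded at 32 in the first
# (only) dimension, then launch over the full ndrange
bdot_kernel!(backend, 32)(out, x, y, ndrange=length(x))
KernelAbstractions.synchronize(backend)
```

Fixing the workgroup size at instantiation avoids the heuristic (auto) size selection, which is where the batched_dot regression appeared.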


Development

Successfully merging this pull request may close these issues.

refactor to be vendor agnostic