Fp qmv #2984

awni · 2026-01-11T17:00:40Z

Adds a basic qmv kernel for fp quants on CUDA.
Adds a simple quantize-dequantize kernel for CUDA, Metal, and CPU.
Routes qqmv to quantize-dequantize + qmv on all backends.
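
For readers unfamiliar with the routing described above, here is a rough pure-Python reference sketch of the idea, not the kernels in this PR. It assumes an fp4 (E2M1) value grid, a group size of 16, and plain Python floats for the per-group scales rather than an encoded fp8 scale. The names `quantize`, `dequantize`, `quantize_dequantize`, `qmv`, and `qqmv` are illustrative only and are not the MLX API.

```python
import math

# Magnitudes representable in E2M1 (fp4).
FP4_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
GROUP_SIZE = 16  # assumption for this sketch


def quantize(values, group_size=GROUP_SIZE):
    """Return (codes, scales): signed grid values plus one scale per group."""
    codes, scales = [], []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        amax = max(abs(v) for v in group)
        # Map the largest magnitude in the group onto the largest grid value.
        scale = amax / FP4_GRID[-1] if amax > 0 else 1.0
        scales.append(scale)
        for v in group:
            # Round the scaled magnitude to the nearest grid point.
            mag = min(FP4_GRID, key=lambda q: abs(abs(v) / scale - q))
            codes.append(math.copysign(mag, v))
    return codes, scales


def dequantize(codes, scales, group_size=GROUP_SIZE):
    """Expand codes back to floats using the scale of each group."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


def quantize_dequantize(values, group_size=GROUP_SIZE):
    """Round-trip a dense vector through the quantized representation."""
    codes, scales = quantize(values, group_size)
    return dequantize(codes, scales, group_size)


def qmv(w_codes, w_scales, x, group_size=GROUP_SIZE):
    """Quantized matrix (one codes/scales pair per row) times dense vector."""
    y = []
    for codes, scales in zip(w_codes, w_scales):
        row = dequantize(codes, scales, group_size)
        y.append(sum(w * xi for w, xi in zip(row, x)))
    return y


def qqmv(w_codes, w_scales, x, group_size=GROUP_SIZE):
    """Both operands quantized: fake-quantize x, then reuse qmv."""
    x_qdq = quantize_dequantize(x, group_size)
    return qmv(w_codes, w_scales, x_qdq, group_size)
```

In this sketch, qqmv needs no matmul path of its own: the activation is rounded to the values it would take after quantization, and the existing qmv does the rest.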

awni · 2026-01-22T18:28:02Z

Moving out of draft.

There is a nice speedup over qqmm with cublas for the qmv case. On a Spark:

| quant | GB/s pre | GB/s post |
| --- | --- | --- |
| nvfp4 | 164.163 | 232.178 |
| mxfp8 | 178.034 | 221.105 |
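
As a quick check on the numbers in the table above, the relative speedup is the post figure divided by the pre figure. The snippet below only redoes that arithmetic on the reported values; `results` and `speedup` are names made up for this illustration.

```python
# (GB/s pre, GB/s post), copied from the table above.
results = {
    "nvfp4": (164.163, 232.178),
    "mxfp8": (178.034, 221.105),
}


def speedup(pre, post):
    """Ratio of effective bandwidth after the change to before it."""
    return post / pre


for quant, (pre, post) in results.items():
    print(f"{quant}: {speedup(pre, post):.2f}x")
```

That works out to roughly 1.41x for nvfp4 and 1.24x for mxfp8.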

awni · 2026-01-22T18:51:25Z

I think we can optimize the fp_qmv a bit more, but it's a good start, so it's probably worth landing and hill-climbing from there.

angeloskath

It looks great and the perf already seems great... I guess the hill will be small :-)

awni force-pushed the fp_qmv branch from 7f97c39 to f527572 Compare January 12, 2026 19:58

awni force-pushed the fp_qmv branch 3 times, most recently from 0c5b9de to 2357ccd Compare January 22, 2026 17:47

awni added 6 commits January 22, 2026 09:48

add very basic fp qmv

7ccb885

working for batched

ed9e8df

use uint32

bc90123

route qqmv to qmv with quantize-dequantize kernel

6e97fcf

cleanup

32cad2c

fix older cuda

91c381c

awni force-pushed the fp_qmv branch 3 times, most recently from 586737c to 458262b Compare January 22, 2026 18:16

cpu and metal

19db566

awni force-pushed the fp_qmv branch from 458262b to 19db566 Compare January 22, 2026 18:22

awni marked this pull request as ready for review January 22, 2026 18:28

awni requested review from angeloskath and zcbenz January 22, 2026 18:51

angeloskath approved these changes Jan 22, 2026
