In the CUDA 12.9 cuBLASLt documentation, I noticed support for 1×128 and 128×128 block-wise quantization. However, nvmath-python currently lacks bindings for this quantization scheme. Is there any plan to support it?
https://docs.nvidia.com/cuda/cublas/index.html#cublasltmatmulmatrixscale-t
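For context, here is a NumPy sketch of what I mean by 1×128 and 128×128 block-wise scaling: one FP32 scale factor per block of the input matrix, with values normalized to the FP8 E4M3 range. This is only an illustration of the scheme, not the nvmath-python or cuBLASLt API; the function name and the `fp8_max` constant are my own.

```python
import numpy as np

def blockwise_quantize(a, block_rows, block_cols):
    """Illustrative block-wise quantization: one FP32 scale per block.

    Assumes a.shape is divisible by (block_rows, block_cols).
    """
    m, n = a.shape
    scales = np.empty((m // block_rows, n // block_cols), dtype=np.float32)
    q = np.empty_like(a, dtype=np.float32)
    fp8_max = 448.0  # max finite value representable in float8 E4M3
    for i in range(0, m, block_rows):
        for j in range(0, n, block_cols):
            block = a[i:i + block_rows, j:j + block_cols]
            s = np.abs(block).max() / fp8_max
            s = s if s > 0 else 1.0  # avoid division by zero for all-zero blocks
            scales[i // block_rows, j // block_cols] = s
            # Values would then be cast to an FP8 dtype; kept as FP32 here.
            q[i:i + block_rows, j:j + block_cols] = block / s
    return q, scales

a = np.random.randn(256, 256).astype(np.float32)
q1, s1 = blockwise_quantize(a, 1, 128)     # 1x128: one scale per 128-element row segment
q2, s2 = blockwise_quantize(a, 128, 128)   # 128x128: one scale per tile
```

The point of exposing this in nvmath-python would be letting users pass such per-block scale tensors to the matmul, as the linked `cublasLtMatmulMatrixScale_t` documentation describes.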