
Support 1×128 and 128×128 block-wise quant? #38

@zfan2356

Description

In the CUDA 12.9 cuBLASLt documentation, I noticed support for 1×128 and 128×128 block-wise quantization. However, nvmath-python currently lacks bindings for this quantization approach. Are there any plans to support it?

https://docs.nvidia.com/cuda/cublas/index.html#cublasltmatmulmatrixscale-t
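For readers unfamiliar with the scheme being requested: in 1×128 block-wise quantization, each row of the matrix is split into contiguous 128-element blocks, and each block carries its own FP32 scale factor (128×128 is the analogous two-dimensional tiling, typically used for the weight matrix). The sketch below illustrates the 1×128 scaling scheme in plain NumPy; it is not the cuBLASLt or nvmath-python API, and the integer rounding grid is only a stand-in for the actual FP8 cast.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def quantize_1x128(a, block=128):
    """Per-(1 x block) quantization: one FP32 scale per contiguous
    block of `block` columns within each row.

    The round-to-integer step is an illustrative uniform quantizer,
    standing in for the real FP8 E4M3 cast.
    """
    m, k = a.shape
    assert k % block == 0, "columns must be divisible by the block size"
    blocks = a.reshape(m, k // block, block)
    # One scale per 1x128 block: amax of the block divided by the FP8 max.
    scales = np.abs(blocks).max(axis=-1, keepdims=True) / FP8_E4M3_MAX
    scales = np.where(scales == 0, 1.0, scales)  # avoid division by zero
    q = np.clip(np.round(blocks / scales), -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q.reshape(m, k), scales.squeeze(-1)


def dequantize_1x128(q, scales, block=128):
    """Inverse of quantize_1x128: multiply each block by its scale."""
    m, k = q.shape
    blocks = q.reshape(m, k // block, block)
    return (blocks * scales[..., None]).reshape(m, k)
```

For an M×K activation matrix this produces an M×(K/128) tensor of scales, which is exactly the scale layout a matmul kernel supporting this mode would consume alongside the quantized data.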

Metadata

Labels

enhancement: New feature or request
