RedHatAI/quantization

533 MB

2 contributors

History: 7 commits

Latest commit: danieldk (HF Staff), "Add GPTQ-Marlin", c31b5ce, about 1 year ago

Name                Size       Last commit                                         Updated
build               -          Build                                               about 1 year ago
compressed_tensors  -          Add `scaled_(int|fp8)_quant` and `fp8_marlin_gemm`  about 1 year ago
core                -          Add `scaled_(int|fp8)_quant` and `fp8_marlin_gemm`  about 1 year ago
cutlass_extensions  -          Add cutlass_w8a8                                    about 1 year ago
cutlass_w8a8        -          Add cutlass_w8a8                                    about 1 year ago
ext-torch           -          Add GPTQ-Marlin                                     about 1 year ago
fp8                 -          Add `scaled_(int|fp8)_quant` and `fp8_marlin_gemm`  about 1 year ago
gptq_marlin         -          Add GPTQ-Marlin                                     about 1 year ago
.gitattributes      1.56 kB    Build                                               about 1 year ago
LICENSE             11.4 kB    Add cutlass_w8a8                                    about 1 year ago
README.md           181 Bytes  Fixup metadata                                      about 1 year ago
build.toml          2.13 kB    Add GPTQ-Marlin                                     about 1 year ago
dispatch_utils.h    1.49 kB    Add `scaled_(int|fp8)_quant` and `fp8_marlin_gemm`  about 1 year ago