FP8-block, FP8-dynamic, NVFP4, w4a16, and w8a8 quantized versions of the ibm-granite/granite-4.0-h-small and ibm-granite/granite-4.0-h-tiny models
- inference-optimization/granite-4.0-h-tiny-FP8-block (Text Generation • 7B)
- inference-optimization/granite-4.0-h-tiny-FP8-dynamic (Text Generation • 7B)
- inference-optimization/granite-4.0-h-tiny-quantized.w4a16
- inference-optimization/granite-4.0-h-tiny-NVFP4