# FP8 Quantized model of ANIMA
!! I have changed the models recently. Please redownload if the hash is different. !!
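If you want to check which version you have locally, comparing a checksum against the hash shown on the file page is enough. A minimal sketch, assuming a placeholder filename (not the actual repository filename):

```python
# Minimal sketch: compute the SHA-256 of a downloaded checkpoint and compare it
# with the hash shown on the Hugging Face file page. The filename is a placeholder.
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256_of("anima_fp8.safetensors"))  # placeholder filename
```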
There are two models: FP8 and NVFP4Mixed.
- FP8 (2.4 GB): (recommended) maximizes generation speed while preserving quality as much as possible.
- NVFP4Mixed (2.0 GB): (marginal quality loss) a mixture of FP8 and NVFP4 layers.
To use torch.compile, add the `TorchCompileModelAdvanced` node from KJNodes, set the mode to `max-autotune-no-cudagraphs`, and make sure `dynamic` is set to `false`.
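Outside of ComfyUI, these settings map directly onto `torch.compile`. A minimal sketch, assuming `model` is the loaded diffusion model as a `torch.nn.Module`:

```python
# Minimal sketch: compiling a model with the same settings the KJNodes node exposes.
# `model` is assumed to be the loaded diffusion model (a torch.nn.Module).
import torch

compiled_model = torch.compile(
    model,
    mode="max-autotune-no-cudagraphs",  # same mode as in TorchCompileModelAdvanced
    dynamic=False,                      # fixed shapes; recompiles if the resolution changes
)
```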
## Generation speed
Tested on:
- RTX 5090 (400 W), ComfyUI with the `--fast` option, torch 2.10.0+cu130
- Generating 832x1216, 30 steps, CFG 4.0, er_sde sampler, simple scheduler
| quant | none | sage attention + torch.compile |
|---|---|---|
| bf16 | 7.13 s / 4.21 it/s | 5.16 s / 5.81 it/s (+38%) |
| fp8 | 6.66 s / 4.50 it/s (+11%) | 4.52 s / 6.64 it/s (+58%) |
| nvfp4mix | 6.37 s / 4.71 it/s (+12%) | 4.99 s / 6.01 it/s (+43%) |

Each cell shows the total sampling time for 30 steps and the iteration speed.
## Sample
## Quantized layers
### fp8

```json
{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0", "blocks.1."] },
    { "policy": "float8_e4m3fn", "match": ["q_proj", "k_proj", "v_proj", "o_proj", "output_proj", ".mlp"] },
    { "policy": "nvfp4", "match": [] }
  ]
}
```
### nvfp4mixed

```json
{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0."] },
    { "policy": "float8_e4m3fn", "match": [
      "blocks.1.k_proj", "blocks.1.q_proj", "blocks.1.output_proj",
      "blocks.27.k_proj", "blocks.27.q_proj", "blocks.27.output_proj",
      "v_proj", "adaln_modulation", ".mlp"
    ] },
    { "policy": "nvfp4", "match": ["k_proj", "q_proj", "output_proj"] }
  ]
}
```
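The exact semantics of the `comfy_quant` rule format are not documented here; the sketch below only illustrates one plausible reading of the configs above (substring matching against weight names, first matching rule wins), using an abridged version of the nvfp4mixed rules:

```python
# Illustrative sketch only: assumes substring matching and first-match-wins rule
# ordering. This is NOT the actual comfy_quant implementation, just a reading aid.
from typing import Optional

RULES = [  # abridged from the nvfp4mixed config above
    {"policy": "keep", "match": ["blocks.0."]},
    {"policy": "float8_e4m3fn", "match": ["v_proj", "adaln_modulation", ".mlp"]},
    {"policy": "nvfp4", "match": ["k_proj", "q_proj", "output_proj"]},
]

def policy_for(weight_name: str) -> Optional[str]:
    """Return the policy assigned by the first rule whose pattern appears in the name."""
    for rule in RULES:
        if any(pattern in weight_name for pattern in rule["match"]):
            return rule["policy"]
    return None  # unmatched layers presumably stay at their original precision

print(policy_for("net.blocks.0.k_proj.weight"))   # 'keep' -> left unquantized
print(policy_for("net.blocks.14.v_proj.weight"))  # 'float8_e4m3fn'
print(policy_for("net.blocks.14.k_proj.weight"))  # 'nvfp4'
```

Under this reading, the nvfp4mixed config keeps the first block unquantized, holds the attention projections of the outermost blocks (1 and 27) plus all `v_proj`, `adaln_modulation`, and MLP layers at FP8, and pushes the remaining attention projections down to NVFP4.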