GGUF quants of stepfun-ai/Step-3.5-Flash

Quantization was performed without an imatrix, for the purposes of comparison and experimentation. Perplexity may be worse than expected due to this naive approach.
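For readers comparing these quants against imatrix-based ones, here is a minimal sketch of how perplexity is usually defined: the exponential of the mean negative log-likelihood per token. The function name `perplexity` and the sample log-probabilities are hypothetical and purely illustrative; they are not measurements from this model.

```python
import math


def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical per-token log-probabilities, for illustration only.
reference_logprobs = [math.log(0.5)] * 4   # model assigns p=0.5 to each true token
quantized_logprobs = [math.log(0.4)] * 4   # slightly less confident after quantization

print(perplexity(reference_logprobs))
print(perplexity(quantized_logprobs))
```

Lower is better: a quant that assigns lower probability to the true tokens ends up with a higher perplexity on the same text.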

Sample outputs and comparative evaluation coming eventually.

| Name | Version |
|---|---|
| stepfun-ai/Step-3.5-Flash | a9197e1b758e |
| `convert_hf_to_gguf.py`, `llama-quantize` and `llama-gguf-split` | b7964 |
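The three tools listed above are typically chained as convert, quantize, then split. The sketch below only builds the command lines and does not run anything. The paths, the intermediate `bf16` output type, the `Q4_K_M` quant type, and the `50G` shard size are all assumptions chosen for illustration; the exact invocations used for these files are not documented here. No imatrix option is passed, in keeping with the note above.

```python
from pathlib import Path


def build_pipeline(hf_dir, out_dir, quant_type="Q4_K_M", max_shard="50G"):
    """Build (but do not run) convert -> quantize -> split commands.

    All arguments are hypothetical defaults, not the settings used for this repo.
    """
    out = Path(out_dir)
    base = out / "model-BF16.gguf"               # unquantized intermediate file
    quant = out / f"model-{quant_type}.gguf"     # single-file quantized output
    shard_prefix = out / f"model-{quant_type}"   # prefix for the split shards
    return [
        ["python", "convert_hf_to_gguf.py", str(hf_dir),
         "--outfile", str(base), "--outtype", "bf16"],
        ["llama-quantize", str(base), str(quant), quant_type],
        ["llama-gguf-split", "--split", "--split-max-size", max_shard,
         str(quant), str(shard_prefix)],
    ]


for cmd in build_pipeline("Step-3.5-Flash", "out"):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from a llama.cpp checkout at the build listed in the table, once the paths are adjusted to your setup.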

See the original model card here.

Format: GGUF

Model size: 197B params

Architecture: step35
