Dynamic 3-bit DeepSeek V3.1 GGUF gets 75.6% on Aider Polyglot

by shimmyshimmer - opened Sep 10

Unsloth AI org Sep 10

edited Sep 10

Hey everyone, ever since we released Dynamic GGUFs, we've received so much love thanks to you all, but we know better benchmarking was a top request!
We previously benchmarked Gemma 3 and Llama 4 on 5-shot MMLU and KL Divergence, and now we're happy to showcase Aider Polyglot benchmarks for our DeepSeek-V3.1 GGUFs. We were quite surprised by the results! Blogpost + details: https://docs.unsloth.ai/basics/unsloth-dynamic-ggufs-on-aider-polyglot

Our 1-bit Unsloth Dynamic GGUF shrinks DeepSeek-V3.1 from 671GB to 192GB (about -71% size), and in no-thinking mode it outperforms GPT-4.1 (Apr 2025), GPT-4.5, and DeepSeek-V3-0324.
3-bit Unsloth DeepSeek-V3.1 (thinking) GGUF: Outperforms Claude-4-Opus (thinking).
5-bit Unsloth DeepSeek-V3.1 (non-thinking) GGUF: Matches Claude-4-Opus (non-thinking) performance.
Our Dynamic GGUFs consistently perform better than non-Unsloth dynamic imatrix GGUFs.
Other non-Unsloth 1-bit and 2-bit DeepSeek-V3.1 quantizations, as well as standard 1-bit quantization without selective layer quantization, either failed to load or produced gibberish and looping outputs.
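
For anyone who wants to check size figures like the ones above for other quants, here is a minimal sketch of the arithmetic. The function name `size_reduction_percent` is just an illustrative helper, not part of any Unsloth or llama.cpp tooling; the two sizes are the ones quoted in this post.

```python
def size_reduction_percent(original_gb: float, quantized_gb: float) -> float:
    """Return how much smaller the quantized file is, as a percentage of the original."""
    if original_gb <= 0:
        raise ValueError("original_gb must be positive")
    return (1 - quantized_gb / original_gb) * 100


# Sizes quoted in the post: full model vs. 1-bit Dynamic GGUF.
reduction = size_reduction_percent(671, 192)
print(f"{reduction:.1f}% smaller")
```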

For our DeepSeek-V3.1 experiments, we compared Unsloth Dynamic GGUFs at different bit-widths against:

Full-precision, unquantized LLMs, including GPT-4.5, GPT-4.1, Claude-4-Opus, DeepSeek-V3-0324, etc.
Other dynamic imatrix V3.1 GGUFs
Semi-dynamic (some selective layer quantization) imatrix V3.1 GGUFs for ablation purposes.

Benchmark experiments were mainly conducted by David (neolithic5452 on Aider Discord), a trusted community contributor to Aider Polyglot evaluations. Each test was run ~3 times and the median score is reported. As is the convention for Aider Polyglot, the reported number is the Pass-2 accuracy.
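
To make the aggregation step concrete, here is a rough sketch of how repeated runs could be summarized. This is not the actual evaluation script; `summarize_runs` is a hypothetical helper, and the scores in the usage example are made-up placeholders, not real benchmark results.

```python
from statistics import mean, median


def summarize_runs(pass2_scores: list[float]) -> dict[str, float]:
    """Summarize Pass-2 accuracy (in %) across repeated benchmark runs."""
    if not pass2_scores:
        raise ValueError("need at least one run")
    return {
        "median": median(pass2_scores),  # the headline number
        "mean": mean(pass2_scores),      # shown for comparison only
        "spread": max(pass2_scores) - min(pass2_scores),
    }


# Placeholder scores for three hypothetical runs (NOT real results).
summary = summarize_runs([70.0, 71.0, 75.0])
print(summary)
```

With only ~3 runs, the median is less sensitive to a single unusually good or bad run than the mean, and the spread gives a feel for run-to-run noise.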

Thanks guys!

Michael
