Models: 40,198

Active filters: 4-bit

Disty0/Z-Image-Turbo-SDNQ-uint4-svd-r32

Updated 3 days ago • 15.6k • 41

huihui-ai/Huihui-GLM-4.6-abliterated-mlx-4bit

Text Generation • 353B • Updated 3 days ago • 326 • 12

QuantTrio/DeepSeek-V3.2-Speciale-AWQ

Text Generation • 685B • Updated 2 days ago • 41 • 4

mlx-community/Llama-3.2-3B-Instruct-4bit

Text Generation • 0.5B • Updated Mar 5 • 12.4k • 37

nightmedia/gpt-oss-120b-heretic-v2-mxfp4-q8-hi-mlx

Text Generation • 117B • Updated 16 days ago • 606 • 5

mlx-community/Orchestrator-8B-4bit

Text Generation • 1B • Updated 7 days ago • 175 • 3

MaziyarPanahi/Ministral-3-3B-Reasoning-2512-GGUF

Text Generation • 3B • Updated 4 days ago • 11.9k • 3

hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4

Text Generation • 2B • Updated Aug 7, 2024 • 224k • 80

Qwen/Qwen3-32B-AWQ

Text Generation • 6B • Updated May 21 • 126k • 116

Intel/DeepSeek-R1-0528-Qwen3-8B-int4-AutoRound

2B • Updated 4 days ago • 525 • 6

Intel/DeepSeek-V3.1-Terminus-int4-mixed-AutoRound

Text Generation • Updated Sep 23 • 78 • 4

QuantTrio/Qwen3-VL-32B-Instruct-AWQ

Image-Text-to-Text • 33B • Updated Oct 22 • 11.2k • 8

nhe-ai/maya1-mlx-4Bit

Text-to-Speech • 0.5B • Updated 24 days ago • 173 • 3

nightmedia/Qwen3-4B-Architect-mxfp4-mlx

Text Generation • 0.8B • Updated 7 days ago • 165 • 2

MaziyarPanahi/NVIDIA-Nemotron-Nano-9B-v2-GGUF

Text Generation • 9B • Updated 7 days ago • 568 • 2

0xSero/GLM-4.6-REAP-218B-A32B-W4A16-AutoRound

Text Generation • 2B • Updated 5 days ago • 411 • 2

VibeStudio/MiniMax-M2-THRIFT-55-MLX-4bit

106B • Updated 4 days ago • 92 • 2

unsloth/Ministral-3-14B-Base-2512-bnb-4bit

14B • Updated 4 days ago • 134 • 2

QuantTrio/DeepSeek-V3.2-AWQ

Text Generation • 685B • Updated 3 days ago • 395 • 2

TheBloke/Wizard-Vicuna-30B-Uncensored-GPTQ

Text Generation • 4B • Updated Sep 27, 2023 • 36.7k • 590

TheBloke/vicuna-7B-v1.5-GPTQ

Text Generation • 1B • Updated Sep 27, 2023 • 76 • 17

TheBloke/dolphin-2.2.1-mistral-7B-AWQ

Text Generation • 1B • Updated Nov 9, 2023 • 88 • 16

TheBloke/deepseek-coder-1.3b-instruct-AWQ

Text Generation • 0.3B • Updated Nov 9, 2023 • 97 • 4

unsloth/llama-3-70b-bnb-4bit

Text Generation • 37B • Updated Nov 22, 2024 • 1.03k • 47

lllyasviel/omost-llama-3-8b-4bits

Text Generation • 5B • Updated May 29, 2024 • 269 • 24

unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit

Text Generation • 5B • Updated Feb 15 • 271k • 89

unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit

Text Generation • 37B • Updated Nov 22, 2024 • 6.32k • 32

kaitchup/Mistral-Nemo-Base-2407-AutoRound-GPTQ-sym-4bit

Text Generation • 3B • Updated Aug 26, 2024 • 9 • 1

Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4

Image-Text-to-Text • 13B • Updated Sep 24, 2024 • 300 • 29

Qwen/Qwen2.5-32B-Instruct-AWQ

Text Generation • 6B • Updated Oct 9, 2024 • 1.25M • 89