Qwen3-VL-2B-Instruct-GGUF

Qwen3-VL-2B-Instruct is a 2-billion-parameter vision-language model from the Qwen3 series that combines strong visual perception with text understanding and generation. It supports context lengths of up to 256K tokens (expandable to 1 million), allowing it to work through large documents, books, or long videos with detailed recall and temporal event tracking. The model offers spatial reasoning with 2D and 3D object grounding, visual coding that turns images or videos into code (e.g., Draw.io diagrams, HTML, CSS, JavaScript), and agent-like abilities to interpret and operate GUI elements on PC and mobile interfaces. It is instruction-tuned for interactive use and optimized for efficient deployment, with multimodal reasoning that is particularly strong on STEM and math problem-solving, causal analysis, and evidence-based answers.

Model Files

| File Name | Quant Type | File Size |
|-----------|------------|-----------|
| Qwen3-VL-2B-Instruct-BF16.gguf | BF16 | 3.45 GB |
| Qwen3-VL-2B-Instruct-F16.gguf | F16 | 3.45 GB |
| Qwen3-VL-2B-Instruct-F32.gguf | F32 | 6.89 GB |
| Qwen3-VL-2B-Instruct-Q8_0.gguf | Q8_0 | 1.83 GB |
| Qwen3-VL-2B-Instruct-mmproj-bf16.gguf | mmproj-bf16 | 823 MB |
| Qwen3-VL-2B-Instruct-mmproj-f16.gguf | mmproj-f16 | 819 MB |
| Qwen3-VL-2B-Instruct-mmproj-q8_0.gguf | mmproj-q8_0 | 445 MB |
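
To run one of these quants locally, pair a model GGUF with a matching mmproj projector file (the projector handles image input). Below is a minimal sketch that fetches the Q8_0 quant and the q8_0 projector with `huggingface_hub`; the repo id and filenames come from the table above, and the snippet assumes `huggingface_hub` is installed (`pip install huggingface_hub`).

```python
# Minimal sketch: download a quant plus its matching mmproj projector.
# Repo id and filenames are taken from the table above.
from huggingface_hub import hf_hub_download

repo_id = "prithivMLmods/Qwen3-VL-2B-Instruct-GGUF"

model_path = hf_hub_download(
    repo_id=repo_id,
    filename="Qwen3-VL-2B-Instruct-Q8_0.gguf",
)
mmproj_path = hf_hub_download(
    repo_id=repo_id,
    filename="Qwen3-VL-2B-Instruct-mmproj-q8_0.gguf",
)

print("model:", model_path)
print("mmproj:", mmproj_path)
```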

Quants Usage

(Sorted by size, not necessarily by quality. IQ quants are often preferable to similarly sized non-IQ quants.)

ikawrakow has published a handy graph comparing some of the lower-quality quant types (lower is better).
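
Once a quant is chosen, image+text inference can be run through llama.cpp's multimodal tooling. The sketch below is a hedged example only: it assumes a llama.cpp build recent enough to support the `qwen3vl` GGUF architecture and to ship the multimodal `llama-mtmd-cli` tool, and the binary name, flags, and file paths may differ between versions.

```python
# Sketch only: invoke llama.cpp's multimodal CLI on the downloaded files.
# Assumes a recent llama.cpp build with mtmd support; the binary name and
# flags below may differ between versions -- check `llama-mtmd-cli --help`.
import subprocess

model_path = "Qwen3-VL-2B-Instruct-Q8_0.gguf"          # quant from the table above
mmproj_path = "Qwen3-VL-2B-Instruct-mmproj-q8_0.gguf"  # matching projector
image_path = "example.jpg"                             # any local test image

subprocess.run(
    [
        "llama-mtmd-cli",
        "-m", model_path,
        "--mmproj", mmproj_path,
        "--image", image_path,
        "-p", "Describe this image in detail.",
    ],
    check=True,
)
```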
