granite-docling-258M-f32-GGUF
Granite-Docling-258M is an ultra-compact, open-source vision-language model developed by IBM Research for high-fidelity, end-to-end document conversion. With 258 million parameters, it builds on the Idefics3 architecture but replaces the vision encoder with SigLIP2 and upgrades the language model to Granite 165M, delivering significant improvements in accuracy and stability over its predecessor, SmolDocling-256M. The model preserves complex document structure, including tables, code blocks, mathematical equations, and reading order, by emitting a structured markup format called DocTags, which can be converted to Markdown, HTML, or JSON for downstream AI workflows. It supports flexible inference modes, including full-page and bounding-box-guided region inference, and offers experimental multilingual support for languages such as Japanese, Arabic, and Chinese. Granite-Docling-258M is designed for cost-effective, reliable document understanding and integrates smoothly with the Docling library ecosystem and enterprise AI pipelines.
Related papers: Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion; SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion.
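Since the model emits DocTags rather than a final format, a downstream conversion step is needed. As a minimal sketch, assuming the docling-core package and a DocTags string already produced by the model for a given page image, the export to Markdown could look like this (the file name and the placeholder DocTags string are illustrative):

```python
# Minimal sketch: turn a DocTags string produced by granite-docling into Markdown.
# Assumes `docling-core` and `Pillow` are installed; `doctags` and `page.png` are placeholders.
from PIL import Image
from docling_core.types.doc import DoclingDocument
from docling_core.types.doc.document import DocTagsDocument

doctags = "<doctag>...</doctag>"     # raw DocTags output from the model (placeholder)
page_image = Image.open("page.png")  # the page image the model was prompted with

# Pair the DocTags with their source image and build a DoclingDocument.
doctags_doc = DocTagsDocument.from_doctags_and_image_pairs([doctags], [page_image])
doc = DoclingDocument.load_from_doctags(doctags_doc, document_name="sample")

# Export to Markdown; HTML and JSON exports are exposed on the same document object.
print(doc.export_to_markdown())
```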
Model Files
| File Name | Quant Type | File Size |
|---|---|---|
| granite-docling-258M-BF16.gguf | BF16 | 332 MB |
| granite-docling-258M-F16.gguf | F16 | 332 MB |
| granite-docling-258M-F32.gguf | F32 | 660 MB |
| granite-docling-258M-Q8_0.gguf | Q8_0 | 178 MB |
| granite-docling-258M-mmproj-bf16.gguf | mmproj-bf16 | 190 MB |
| granite-docling-258M-mmproj-f16.gguf | mmproj-f16 | 190 MB |
| granite-docling-258M-mmproj-f32.gguf | mmproj-f32 | 374 MB |
| granite-docling-258M-mmproj-q8_0.gguf | mmproj-q8_0 | 104 MB |
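Note that the language-model GGUF and the mmproj (vision projector) GGUF are separate files, so a multimodal runtime needs one of each. Below is a minimal sketch of fetching a matching pair with huggingface_hub, using the Q8_0 file names from the table and assuming the repo id prithivMLmods/granite-docling-258M-f32-GGUF:

```python
# Minimal sketch: download a matching model/mmproj pair from this repository.
# Assumes `huggingface_hub` is installed; file names are taken from the table above.
from huggingface_hub import hf_hub_download

repo_id = "prithivMLmods/granite-docling-258M-f32-GGUF"

model_path = hf_hub_download(repo_id, "granite-docling-258M-Q8_0.gguf")
mmproj_path = hf_hub_download(repo_id, "granite-docling-258M-mmproj-q8_0.gguf")

# A llama.cpp-based multimodal frontend then takes the model file and the
# projector file together (commonly via -m and --mmproj; exact flags depend
# on the tool and version).
print(model_path)
print(mmproj_path)
```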
Quants Usage
(Sorted by size, not necessarily by quality; IQ-quants are often preferable to similarly sized non-IQ quants.)

[Graph by ikawrakow comparing some lower-quality quant types; lower is better.]
Base model: ibm-granite/granite-docling-258M