Qwen3-VisionCaption-2B is an abliterated v1.0 variant built on Qwen3-VL-2B-Instruct-abliterated-v1 and optimized for high-precision image captioning and uncensored visual analysis. It is designed for robust caption generation, detailed reasoning, and unrestricted descriptive understanding across diverse visual and multimodal contexts.
This model was fine-tuned using the following datasets:
The training objective focused on enhancing performance in unconstrained descriptive image captioning, particularly for edge cases and visual categories that are typically filtered out in standard captioning benchmarks.
from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch

# Load the model and let transformers choose the dtype and device placement.
model = Qwen3VLForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen3-VisionCaption-2B", torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained("prithivMLmods/Qwen3-VisionCaption-2B")

# A single user turn containing one image and one text instruction.
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {"type": "text", "text": "Provide a detailed caption and reasoning for this image."},
        ],
    }
]

# Render the chat template to a prompt string and collect the vision inputs.
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
# Move inputs to wherever device_map="auto" placed the model, rather than assuming CUDA.
inputs = inputs.to(model.device)

# Generate, then strip the prompt tokens so only the new tokens are decoded.
generated_ids = model.generate(**inputs, max_new_tokens=128)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
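The trimming step in the snippet above can be easy to misread, so here is a minimal sketch of it using plain Python lists in place of tensors. It relies on the fact that `generate` returns each prompt followed by the newly generated tokens; the helper name `trim_prompt_tokens` and the token IDs are made up for illustration and are not part of the model or of transformers.

```python
def trim_prompt_tokens(input_ids, generated_ids):
    """Drop the echoed prompt from the front of each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_ids, generated_ids)]


# Toy token IDs (hypothetical values): two prompts of three tokens each.
prompt_ids = [[101, 7, 8], [101, 9, 10]]
# Each output starts with its prompt, followed by two new tokens.
full_outputs = [[101, 7, 8, 42, 43], [101, 9, 10, 44, 45]]

new_tokens = trim_prompt_tokens(prompt_ids, full_outputs)
print(new_tokens)  # [[42, 43], [44, 45]]
```

Without this slice, `batch_decode` would return the rendered prompt text along with the caption.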
Quantized GGUF builds are available here: https://huggingface.co/prithivMLmods/Qwen3-VisionCaption-2B-GGUF
Base model: Qwen/Qwen3-VL-2B-Instruct