Getting an error when using AutoImageProcessor and AutoModel
#2
by
spedrox-sac
- opened
When I use this model with the code provided in the DINOv3 GitHub repo:
import torch
from transformers import AutoImageProcessor, AutoModel
from transformers.image_utils import load_image
# Load image and model
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = load_image(url)
pretrained_model_name = "facebook/dinov3-vitb16-pretrain-lvd1689m"
processor = AutoImageProcessor.from_pretrained(pretrained_model_name)
model = AutoModel.from_pretrained(pretrained_model_name, device_map="auto")
# Process image and extract features
inputs = processor(images=image, return_tensors="pt").to(model.device)
with torch.inference_mode():
    outputs = model(**inputs)
pooled_output = outputs.pooler_output
print("Pooled output shape:", pooled_output.shape)
I get this ValueError:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipython-input-1234491351.py in <cell line: 0>()
8
9 pretrained_model_name = "facebook/dinov3-vitb16-pretrain-lvd1689m"
---> 10 processor = AutoImageProcessor.from_pretrained(pretrained_model_name)
11 model = AutoModel.from_pretrained(pretrained_model_name, device_map="auto")
12
/usr/local/lib/python3.11/dist-packages/transformers/models/auto/image_processing_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
613 "This image processor cannot be instantiated. Please make sure you have `Pillow` installed."
614 )
--> 615 raise ValueError(
616 f"Unrecognized image processor in {pretrained_model_name_or_path}. Should have a "
617 f"`image_processor_type` key in its {IMAGE_PROCESSOR_NAME} of {CONFIG_NAME}, or one of the following "
ValueError: Unrecognized image processor in facebook/dinov3-vitb16-pretrain-lvd1689m. Should have a `image_processor_type` key in its preprocessor_config.json of config.json, or one of the following `model_type` keys in its config.json: aimv2, aimv2_vision_model, align, aria, beit, bit, blip, blip-2, bridgetower, chameleon, chinese_clip, clip, clipseg, cohere2_vision, conditional_detr, convnext, convnextv2, cvt, data2vec-vision, deepseek_vl, deepseek_vl_hybrid, deformable_detr, deit, depth_anything, depth_pro, deta, detr, dinat, dinov2, donut-swin, dpt, efficientformer, efficientloftr, efficientnet, eomt, flava, focalnet, fuyu, gemma3, gemma3n, git, glm4v, glpn, got_ocr2, grounding-dino, groupvit, hiera, idefics, idefics2, idefics3, ijepa, imagegpt, instructblip, instructblipvideo, janus, kosmos-2, layoutlmv2, layoutlmv3, levit, lightglue, llama4, llava, llava_next, llava_next_video, llava_onevision, mask2former, maskformer, mgp-str, mistral3, mlcd, mllama, mm-grounding-dino, mobilenet_v1, mobilenet_v2, mobilevit, mobilevitv2, nat, nougat, oneformer, owlv2, owlvit, paligemma, perceiver, perception_lm, phi4_multimodal, pix2struct, pixtral, poolformer, prompt_depth_anything, pvt, pvt_v2, qwen2_5_vl, qwen2_vl, regnet, resnet, rt_detr, sam, sam_hq, segformer, seggpt, shieldgemma2, siglip, siglip2, smolvlm, superglue, superpoint, swiftformer, swin, swin2sr, swinv2, table-transformer, timesformer, timm_wrapper, tvlt, tvp, udop, upernet, van, videomae, vilt, vipllava, ...
patricklabatut
changed discussion status to
closed
I have updated past 4.56.0 and I still get this error.
Could you please double-check your installation (e.g. re-create a fresh environment with HF Transformers 4.56.0 just to test this issue)?
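One quick way to double-check is to ask the running interpreter which Transformers version it actually sees. This is only a minimal sketch: it assumes 4.56.0 is the minimum version, as the reply above suggests, and the helper names `parse_release` and `is_recent_enough` are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed minimum, taken from the reply in this thread.
MIN_VERSION = (4, 56, 0)


def parse_release(version_string):
    """Return the leading numeric components of a version string as a tuple.

    Pre-release or dev suffixes (e.g. ".dev0", "rc1") are ignored.
    """
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_recent_enough(version_string, minimum=MIN_VERSION):
    """True if the numeric release is at least `minimum`."""
    release = parse_release(version_string)
    # Pad short versions such as "5.0" so the tuples compare cleanly.
    release = release + (0,) * (len(minimum) - len(release))
    return release >= minimum


try:
    installed = version("transformers")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("transformers is not installed in this environment")
else:
    status = "OK" if is_recent_enough(installed) else "too old"
    print(f"transformers {installed}: {status}")
```

In a notebook, run this in the same kernel that raises the error, and restart the kernel after upgrading, since an already-imported older copy of the library stays loaded until then.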
I was able to get around this by writing a custom preprocessing transform that matches the source code for DINOv3.