Model Details
Model type: Spatial Foundation Model for 3D Geometry Reconstruction
Model date: November 2025
Paper: OmniVGGT: Omni-Modality Driven Visual Geometry Grounded Transformer (arXiv:2511.10560)
Code: https://github.com/Livioni/OmniVGGT-official
Authors:
- Haosong Peng*, Hao Li*, Yalun Dai, Yushi Lan, Yihang Luo, Tianyu Qi, Zhengshen Zhang, Yufeng Zhan†, Junfei Zhang†, Wenchao Xu†, Ziwei Liu
* Equal Contribution, † Corresponding Author
Model Description
OmniVGGT is a spatial foundation model that can ingest any number of auxiliary geometric modalities (depth maps, camera intrinsics, and camera poses) alongside the input images and exploit them to produce higher-quality 3D geometry. It achieves state-of-the-art performance across a range of downstream 3D reconstruction tasks and further improves performance on robot manipulation tasks.
Direct Use
Quick Start
import torch
from omnivggt.models.omnivggt import OmniVGGT

# Load model
model = OmniVGGT()
model.load_state_dict(torch.load('path/to/model.pth'))
model.eval()

# Prepare inputs (the auxiliary modalities are optional and can be omitted)
inputs = {
    'images': images,          # torch.Tensor [B, N, 3, H, W]
    'extrinsics': extrinsics,  # optional
    'intrinsics': intrinsics,  # optional
    'depth': depth,            # optional
    'mask': mask,              # optional
}

# Run inference
with torch.no_grad():
    predictions = model(**inputs)
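The snippet above assumes the input tensors have already been prepared. Below is a minimal sketch of building the `images` tensor in the expected [B, N, 3, H, W] layout; the helper name, the 518-pixel resize, the [0, 1] value range, and the .jpg extension are illustrative assumptions, not values confirmed by the repository.

# Sketch of building the `images` tensor in the [B, N, 3, H, W] layout.
# The resize resolution, value range, and file extension are assumptions.
from pathlib import Path

import torch
from PIL import Image
from torchvision import transforms

def load_image_folder(folder: str, size: int = 518) -> torch.Tensor:
    to_tensor = transforms.Compose([
        transforms.Resize((size, size)),  # assumed square input resolution
        transforms.ToTensor(),            # float32 [3, H, W], values in [0, 1]
    ])
    frames = [to_tensor(Image.open(p).convert('RGB'))
              for p in sorted(Path(folder).glob('*.jpg'))]
    return torch.stack(frames).unsqueeze(0)  # [1, N, 3, H, W]

images = load_image_folder('path/to/images/')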
Command Line Usage
# Basic usage - only images are required
python inference.py --image_folder path/to/images/

# With auxiliary camera and depth information
python inference.py \
    --image_folder path/to/images/ \
    --camera_folder path/to/cameras/ \
    --depth_folder path/to/depths/
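Any subset of the auxiliary inputs can be supplied independently. For example, a depth-only run (a sketch that reuses only the flags shown above) would look like this:

# Depth maps only, no camera information
python inference.py \
    --image_folder path/to/images/ \
    --depth_folder path/to/depths/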
Technical Specifications
Requirements
- Python 3.10+
- PyTorch 2.7.0+
- CUDA-compatible GPU (recommended)
- 8GB+ RAM
- 4GB+ GPU memory
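A quick environment check (a convenience snippet, not part of the repository) can confirm the PyTorch build and the available GPU memory:

# Convenience check: prints the Python/PyTorch versions and, if a CUDA GPU
# is visible, its name and total memory.
import sys

import torch

print(f"Python : {sys.version.split()[0]}")
print(f"PyTorch: {torch.__version__}")
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU    : {props.name}, {props.total_memory / 1024**3:.1f} GiB")
else:
    print("No CUDA GPU detected; inference will run on CPU.")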
Installation
conda create -n omnivggt python=3.10
conda activate omnivggt
pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu128
pip install -r requirements.txt
Citation
@article{peng2025omnivggt,
title={OmniVGGT: Omni-Modality Driven Visual Geometry Grounded Transformer},
author={Peng, Haosong and Li, Hao and Dai, Yalun and Lan, Yushi and Luo, Yihang and Qi, Tianyu and Zhang, Zhengshen and Zhan, Yufeng and Zhang, Junfei and Xu, Wenchao and Liu, Ziwei},
journal={arXiv preprint arXiv:2511.10560},
year={2025}
}
Base Model
facebook/VGGT-1B