Update README.md
README.md
CHANGED
|
@@ -1,3 +1,71 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: cc-by-nc-4.0
|
| 3 |
-
---
|
| 1 |
+
---
|
| 2 |
+
license: cc-by-nc-4.0
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
# Auto-Regressively Generating Multi-View Consistent Images
|
| 6 |
+
|
| 7 |
+
[JiaKui Hu](https://jkhu29.github.io/)\*, [Yuxiao Yang](https://yuxiaoyang23.github.io/)\*, [Jialun Liu](https://scholar.google.com/citations?user=OkMMP2AAAAAJ), [Jinbo Wu](https://scholar.google.com/citations?user=9OecN2sAAAAJ), Chen Zhao, [Yanye Lu](https://scholar.google.com/citations?user=WSFToOMAAAAJ)
|
| 8 |
+
<br>PKU, BaiduVis, THU<br>
|
| 9 |
+
|
| 10 |
+
## Introduction
|
| 11 |
+
|
| 12 |
+

|
| 13 |
+
|
| 14 |
+
Diffusion-based multi-view image generation methods condition every predicted view on one specific reference view. This becomes problematic when the overlap between the reference view and the predicted view is minimal, which degrades both image quality and multi-view consistency. MV-AR addresses this by conditioning on the preceding view, which overlaps significantly with the view being predicted.
|
| 15 |
+
|
| 16 |
+
## Results
|
| 17 |
+
|
| 18 |
+
### Text to Multiview images
|
| 19 |
+
|
| 20 |
+

|
| 21 |
+
|
| 22 |
+
### Image to Multiview images
|
| 23 |
+
|
| 24 |
+

|
| 25 |
+
|
| 26 |
+
### Text + Geometry to Multiview images
|
| 27 |
+
|
| 28 |
+

|
| 29 |
+
|
| 30 |
+
## Quick Start
|
| 31 |
+
|
| 32 |
+
### Requirements
|
| 33 |
+
|
| 34 |
+
> Please follow the setup instructions in the [code repository](https://github.com/MILab-PKU/MVAR).
|
| 35 |
+
|
| 36 |
+
### Reproduce
|
| 37 |
+
|
| 38 |
+
1. Please download [flan-t5-xl](https://huggingface.co/google/flan-t5-xl) into `./pretrained_models`;
|
| 39 |
+
2. Please download [Cap3D_automated_Objaverse_full.csv](https://huggingface.co/datasets/tiange/Cap3D/blob/main/Cap3D_automated_Objaverse_full.csv) into `dataset/captions`;
|
| 40 |
+
3. Please download the models here and put them in `./pretrained_models`;
|
| 41 |
+
4. Run:
|
| 42 |
+
|
| 43 |
+
```shell
# For t2mv on objaverse
sh sample_tcam2i.sh
# For t2mv on GSO
sh sample_icam2i_gso.sh
# For i2mv on GSO
sh sample_icam2i_gso.sh
```
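Before running the scripts, steps 1 and 2 above could be scripted with the Hugging Face CLI. The following is only a sketch under assumptions that this README does not confirm: that `huggingface-cli` is installed (newer releases of `huggingface_hub` may expose the same command as `hf download`), and that the sampling scripts look for the text encoder in a `flan-t5-xl` subfolder of `./pretrained_models`. The `RUN` variable and the `fetch_assets` function are illustrative names, not part of the repository.

```shell
#!/bin/sh
# Sketch: fetch the assets from steps 1 and 2 with the Hugging Face CLI.
# ASSUMPTION: the target subfolder ./pretrained_models/flan-t5-xl is a guess;
# check the sampling scripts for the path they actually read.
set -eu

# Command prefix. Defaults to "echo", so the script only prints what it
# would run. Invoke with RUN="" to perform the real downloads.
RUN="${RUN-echo}"

fetch_assets() {
    mkdir -p ./pretrained_models dataset/captions

    # Step 1: the flan-t5-xl text encoder.
    $RUN huggingface-cli download google/flan-t5-xl \
        --local-dir ./pretrained_models/flan-t5-xl

    # Step 2: the Cap3D caption file (hosted in a dataset repository).
    $RUN huggingface-cli download tiange/Cap3D \
        Cap3D_automated_Objaverse_full.csv \
        --repo-type dataset --local-dir dataset/captions
}

fetch_assets
```

Run as written, the script creates the two directories and prints the two download commands. Setting `RUN=""` in the environment executes them instead.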
|
| 51 |
+
|
| 52 |
+
The generated images will be saved to `samples_objaverse_nv_ray/`.
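A quick way to confirm that a run produced output is to count the files in that directory. This is a minimal sketch: the directory name comes from the sentence above, but its internal layout and file types are not documented here, so the hypothetical `count_outputs` helper simply counts every regular file it finds.

```shell
#!/bin/sh
# Sketch: report how many files a sampling run wrote.
# ASSUMPTION: nothing is known about subfolders or file extensions, so all
# regular files under the directory are counted.
count_outputs() {
    out_dir="${1:-samples_objaverse_nv_ray}"
    if [ ! -d "$out_dir" ]; then
        echo "missing directory: $out_dir" >&2
        return 1
    fi
    # wc -l pads its output with spaces on some systems; tr removes them.
    find "$out_dir" -type f | wc -l | tr -d ' '
}

count_outputs || echo "no output directory yet; run one of the sampling scripts first"
```

Passing a path, for example `count_outputs some/other/dir`, checks a different directory.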
|
| 53 |
+
|
| 54 |
+
## Acknowledgement
|
| 55 |
+
|
| 56 |
+
This repository is heavily based on [LlamaGen](https://github.com/FoundationVision/LlamaGen). We would like to thank the authors of this work for publicly releasing their code.
|
| 57 |
+
|
| 58 |
+
For help or issues using this repository, please feel free to submit a GitHub issue.
|
| 59 |
+
|
| 60 |
+
For other communications related to this repository, please contact `jkhu29@stu.pku.edu.cn`.
|
| 61 |
+
|
| 62 |
+
## Citation
|
| 63 |
+
|
| 64 |
+
```bibtex
@article{hu2025mvar,
  title={Auto-Regressively Generating Multi-View Consistent Images},
  author={Hu, JiaKui and Yang, Yuxiao and Liu, Jialun and Wu, Jinbo and Zhao, Chen and Lu, Yanye},
  journal={arXiv preprint arXiv:2506.18527},
  year={2025}
}
```
|