Jiakui committed on
Commit 1e9a2ee · verified · 1 Parent(s): 46f096f

Update README.md

Files changed (1): README.md (+71 -3)

README.md CHANGED
---
license: cc-by-nc-4.0
---

# Auto-Regressively Generating Multi-View Consistent Images

[JiaKui Hu](https://jkhu29.github.io/)\*, [Yuxiao Yang](https://yuxiaoyang23.github.io/)\*, [Jialun Liu](https://scholar.google.com/citations?user=OkMMP2AAAAAJ), [Jinbo Wu](https://scholar.google.com/citations?user=9OecN2sAAAAJ), [Chen Zhao](), [Yanye Lu](https://scholar.google.com/citations?user=WSFToOMAAAAJ)
<br>PKU, BaiduVis, THU<br>

## Introduction

![overview](https://raw.githubusercontent.com/MILab-PKU/MVAR/refs/heads/main/assets/MVAR_overview.png)

Diffusion-based multi-view image generation methods condition every predicted view on a fixed reference view, which becomes problematic when the overlap between the reference view and the predicted view is small, hurting both image quality and multi-view consistency. Our MV-AR addresses this by conditioning each view on its preceding view, which shares substantial overlap with it.

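Conceptually, generation proceeds as a simple autoregressive loop in which the most recently generated view replaces a fixed reference as the image condition for the next view. The sketch below is purely illustrative and is not the released MVAR code; `generate_next_view` is a hypothetical placeholder for one decoding step.

```python
# Illustrative sketch of preceding-view conditioning (not the released MVAR code).
# `generate_next_view` is a hypothetical stand-in for one autoregressive decoding step.
from typing import List, Optional


def generate_next_view(prompt: str, prev_view: Optional[str], camera_pose: float) -> str:
    """Placeholder: decode one view conditioned on the text prompt,
    the previous view (if any), and the target camera pose."""
    return f"view(pose={camera_pose}, conditioned_on={prev_view})"


def generate_multiview(prompt: str, camera_poses: List[float],
                       ref_view: Optional[str] = None) -> List[str]:
    views = []
    prev = ref_view  # optional input image for i2mv; None for pure t2mv
    for pose in camera_poses:
        view = generate_next_view(prompt, prev, pose)
        views.append(view)
        prev = view  # the freshly generated view conditions the next one
    return views


print(generate_multiview("a wooden chair", [0.0, 45.0, 90.0, 135.0]))
```
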
## Results

### Text to Multi-view Images

![t2mv](https://raw.githubusercontent.com/MILab-PKU/MVAR/refs/heads/main/assets/t2mv_compare.png)

### Image to Multi-view Images

![i2mv](https://raw.githubusercontent.com/MILab-PKU/MVAR/refs/heads/main/assets/i2mv_compare.png)

### Text + Geometry to Multi-view Images

![ts2mv](https://raw.githubusercontent.com/MILab-PKU/MVAR/refs/heads/main/assets/ts2mv_cases.png)

## Quick Start

### Requirements

> Please follow the instructions in the [code repository](https://github.com/MILab-PKU/MVAR).

### Reproduce

1. Download [flan-t5-xl](https://huggingface.co/google/flan-t5-xl) into `./pretrained_models`;
2. Download [Cap3D_automated_Objaverse_full.csv](https://huggingface.co/datasets/tiange/Cap3D/blob/main/Cap3D_automated_Objaverse_full.csv) into `dataset/captions`;
3. Download the models here and put them in `./pretrained_models`;
4. Run:

```shell
# For t2mv on Objaverse
sh sample_tcam2i.sh
# For t2mv on GSO
sh sample_icam2i_gso.sh
# For i2mv on GSO
sh sample_icam2i_gso.sh
```

The generated images will be saved to `samples_objaverse_nv_ray/`.

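If you want to sanity-check step 1 before running the sampling scripts, a snippet along the following lines should load the text encoder; the `./pretrained_models/flan-t5-xl` path and the use of `transformers` here are assumptions for illustration, not part of the official scripts.

```python
# Optional sanity check (illustrative, not part of the official MVAR scripts):
# verify that the flan-t5-xl text encoder from step 1 loads from the local folder.
from transformers import AutoTokenizer, T5EncoderModel

model_dir = "./pretrained_models/flan-t5-xl"  # assumed local layout; adjust if yours differs
tokenizer = AutoTokenizer.from_pretrained(model_dir)
encoder = T5EncoderModel.from_pretrained(model_dir)

tokens = tokenizer("a wooden chair", return_tensors="pt")
text_features = encoder(**tokens).last_hidden_state  # flan-t5-xl hidden size is 2048
print(text_features.shape)
```
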
## Acknowledgement

This repository is heavily based on [LlamaGen](https://github.com/FoundationVision/LlamaGen). We would like to thank the authors of that work for publicly releasing their code.

For help or issues using this repository, please feel free to submit a GitHub issue.

For other communications related to this repository, please contact `jkhu29@stu.pku.edu.cn`.

### Citation

```bibtex
@article{hu2025mvar,
  title={Auto-Regressively Generating Multi-View Consistent Images},
  author={Hu, JiaKui and Yang, Yuxiao and Liu, Jialun and Wu, Jinbo and Zhao, Chen and Lu, Yanye},
  journal={arXiv preprint arXiv:2506.18527},
  year={2025}
}
```