kelseye committed on
Commit
443818f
·
verified ·
1 Parent(s): 094025a

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,32 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ samples/e1.png filter=lfs diff=lfs merge=lfs -text
37
+ samples/e2.png filter=lfs diff=lfs merge=lfs -text
38
+ samples/e3.png filter=lfs diff=lfs merge=lfs -text
39
+ samples/e4.png filter=lfs diff=lfs merge=lfs -text
40
+ samples/e5.png filter=lfs diff=lfs merge=lfs -text
41
+ samples/e6.png filter=lfs diff=lfs merge=lfs -text
42
+ samples/e7_1.png filter=lfs diff=lfs merge=lfs -text
43
+ samples/e7_2.png filter=lfs diff=lfs merge=lfs -text
44
+ samples/e7_3.png filter=lfs diff=lfs merge=lfs -text
45
+ samples/e7_4.png filter=lfs diff=lfs merge=lfs -text
46
+ samples/ic_target.png filter=lfs diff=lfs merge=lfs -text
47
+ samples/inpaint_i1.jpg filter=lfs diff=lfs merge=lfs -text
48
+ samples/inpaint_i2.png filter=lfs diff=lfs merge=lfs -text
49
+ samples/inpaint_o1.png filter=lfs diff=lfs merge=lfs -text
50
+ samples/inpaint_o2.png filter=lfs diff=lfs merge=lfs -text
51
+ samples/ip_1.png filter=lfs diff=lfs merge=lfs -text
52
+ samples/ip_2.png filter=lfs diff=lfs merge=lfs -text
53
+ samples/ip_3.png filter=lfs diff=lfs merge=lfs -text
54
+ samples/ip_ref.png filter=lfs diff=lfs merge=lfs -text
55
+ samples/regional_attention.jpg filter=lfs diff=lfs merge=lfs -text
56
+ samples/styled_entity_control_example_1_mask_0.png filter=lfs diff=lfs merge=lfs -text
57
+ samples/styled_entity_control_example_2_mask_0.png filter=lfs diff=lfs merge=lfs -text
58
+ samples/styled_entity_control_example_3_mask_27.png filter=lfs diff=lfs merge=lfs -text
59
+ samples/styled_entity_control_example_4_mask_21.png filter=lfs diff=lfs merge=lfs -text
60
+ samples/styled_entity_control_example_5_mask_0.png filter=lfs diff=lfs merge=lfs -text
61
+ samples/styled_entity_control_example_6_mask_8.png filter=lfs diff=lfs merge=lfs -text
62
+ samples/styled_entity_control_example_7_mask_5.png filter=lfs diff=lfs merge=lfs -text
63
+ samples/styled_entity_control_example_7_mask_6.png filter=lfs diff=lfs merge=lfs -text
64
+ samples/video.mp4 filter=lfs diff=lfs merge=lfs -text
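The rules added above route specific sample files through Git LFS. As a rough illustration of how such rules select files, the following sketch checks paths against a few of the patterns. It is a simplification under stated assumptions: `is_lfs_tracked` and `LFS_PATTERNS` are hypothetical names, only a subset of the rules is listed, and real gitattributes matching has more rules (for example `**` handling) than the `fnmatch`-based check used here.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# A subset of the patterns from the .gitattributes hunk above.
LFS_PATTERNS = [
    "*.zip",
    "*.zst",
    "*tfevents*",
    "samples/e1.png",
    "samples/video.mp4",
]


def is_lfs_tracked(path, patterns=LFS_PATTERNS):
    """Return True when `path` matches one of the patterns (simplified rules)."""
    for pattern in patterns:
        # Assumption: patterns containing a slash are compared with the whole
        # path; all other patterns are compared with the file name only.
        target = path if "/" in pattern else basename(path)
        if fnmatchcase(target, pattern):
            return True
    return False


for candidate in ["samples/e1.png", "samples/e1_m.png", "archive/data.zip"]:
    print(candidate, is_lfs_tracked(candidate))
```

Under these rules the mask image `samples/e1_m.png` is not matched, which is consistent with the file list in this commit, where the `*_m.png` files appear without Git LFS details.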
README.md ADDED
@@ -0,0 +1,113 @@
1
+ ---
2
+ license: apache-2.0
3
+ ---
4
+ # EliGen: Entity-Level Controlled Image Generation
5
+
6
+ ## Introduction
7
+
8
+ We propose EliGen, a novel approach that leverages fine-grained entity-level information to enable precise and controllable text-to-image generation. EliGen excels in tasks such as entity-level controlled image generation and image inpainting, while its applicability is not limited to these areas. Additionally, it can be seamlessly integrated with existing community models, such as the IP-Adapter and In-Context LoRA.
9
+
10
+ * Paper: [EliGen: Entity-Level Controlled Image Generation with Regional Attention](https://arxiv.org/abs/2501.01097)
11
+ * GitHub: [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio)
12
+ * Model: [ModelScope](https://www.modelscope.cn/models/DiffSynth-Studio/Eligen)
13
+ * Online Demo: [ModelScope EliGen Studio](https://www.modelscope.cn/studios/DiffSynth-Studio/EliGen)
14
+ * Training dataset: [ModelScope Dataset](https://www.modelscope.cn/datasets/DiffSynth-Studio/EliGenTrainSet)
15
+
16
+ ## Methodology
17
+
18
+ ![regional-attention](./samples/regional_attention.jpg)
19
+
20
+ We introduce a regional attention mechanism within the DiT framework to process the conditions of each entity. Through regional attention, the local prompt associated with each entity semantically influences its designated region. To further strengthen EliGen's layout control, we construct an entity-annotated dataset and fine-tune the model with LoRA.
21
+
22
+ 1. **Regional Attention**: The regional attention mechanism, shown in the figure above, can be easily applied to other text-to-image models. Its core principle is to transform the positional information of each entity into an attention mask, so that each entity's condition affects only its designated region.
23
+
24
+ 2. **Dataset with Entity Annotation**: To construct a dedicated entity control dataset, we start by randomly selecting captions from DiffusionDB and generating the corresponding source image using Flux. Next, we employ Qwen2-VL 72B, recognized for its advanced grounding capabilities among MLLMs, to randomly identify entities within the image. These entities are annotated with local prompts and bounding boxes for precise localization, forming the foundation of our dataset for further training.
25
+
26
+ 3. **Training**: We utilize LoRA (Low-Rank Adaptation) and DeepSpeed to fine-tune regional attention mechanisms using a curated dataset, enabling our EliGen model to achieve effective entity-level control.
27
+
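To make the first two items concrete, here is a minimal sketch of how an entity annotation (local prompt plus bounding box) could be turned into a per-entity spatial mask over image patches. The `Entity` class, the `box_to_mask` function, and the patch size of 16 are illustrative assumptions, not the data format or code used by EliGen or DiffSynth-Studio.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """One annotated entity: a local prompt plus a pixel-space bounding box."""
    prompt: str
    box: tuple  # (x0, y0, x1, y1) in pixels; x1 and y1 are exclusive


def box_to_mask(box, image_size, patch=16):
    """Rasterize a bounding box onto the grid of image patches.

    Returns a list of rows; a cell is True when that patch overlaps the box.
    The patch size is an assumption made for this sketch.
    """
    width, height = image_size
    cols, rows = width // patch, height // patch
    x0, y0, x1, y1 = box
    mask = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Patch (r, c) covers x in [c*patch, (c+1)*patch) and likewise for y.
            overlaps_x = c * patch < x1 and (c + 1) * patch > x0
            overlaps_y = r * patch < y1 and (r + 1) * patch > y0
            row.append(overlaps_x and overlaps_y)
        mask.append(row)
    return mask


entities = [
    Entity("a red apple", (0, 0, 32, 32)),
    Entity("a wooden table", (0, 32, 64, 64)),
]
masks = [box_to_mask(e.box, image_size=(64, 64)) for e in entities]
for entity, mask in zip(entities, masks):
    covered = sum(cell for row in mask for cell in row)
    print(entity.prompt, covered, "patches")
```

Each mask marks the image patches that an entity's local prompt is allowed to influence; turning these spatial masks into an attention mask over text and image tokens is a separate step.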
28
+ ## Usage
29
+ This model was trained using [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio). We recommend using DiffSynth-Studio for generation.
30
+ ```shell
31
+ git clone https://github.com/modelscope/DiffSynth-Studio.git
32
+ cd DiffSynth-Studio
33
+ pip install -e .
34
+ ```
35
+ 1. **Entity-Level Controlled Image Generation**
36
+ EliGen achieves effective entity-level control results. See [entity_control.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_control.py) for usage.
37
+ 2. **Image Inpainting**
38
+ To apply EliGen to image inpainting tasks, we propose an inpainting fusion pipeline that preserves non-inpainted areas while enabling precise, entity-level modifications within inpainted regions.
39
+ See [entity_inpaint.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_inpaint.py) for usage.
40
+ 3. **Styled Entity Control**
41
+ EliGen can be seamlessly integrated with existing community models. We provide an example of integrating it with IP-Adapter. See [entity_control_ipadapter.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_control_ipadapter.py) for usage.
42
+ 4. **Entity Transfer**
43
+ We provide an example of integrating EliGen with In-Context LoRA, achieving interesting entity transfer results. See [entity_transfer.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_transfer.py) for usage.
44
+ 5. **Play with EliGen using UI**
45
+ Download the EliGen checkpoint from [ModelScope](https://www.modelscope.cn/models/DiffSynth-Studio/Eligen) to `models/lora/entity_control` and run the following command to launch the interactive UI:
46
+ ```bash
47
+ python apps/gradio/entity_level_control.py
48
+ ```
49
+
50
+ ## Examples
51
+ ### Entity-Level Controlled Image Generation
52
+
53
+ 1. Generating images with continuously changing entity positions.
54
+
55
+ <center>
56
+ <video muted="true" autoplay="true" loop="true" height="512" width="512" src="./samples/video.mp4"></video>
57
+ </center>
58
+
59
+ 2. Image generation results for complex combinations of entities, demonstrating the strong generalization capability of EliGen. See [entity_control.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_control.py) `example_1-6` for the generation prompts.
60
+
61
+ |Entity Conditions|Generated Image|
62
+ |-|-|
63
+ |![eligen_example_1_mask_0](./samples/e1_m.png)|![eligen_example_1_0](./samples/e1.png)|
64
+ |![eligen_example_2_mask_0](./samples/e2_m.png)|![eligen_example_2_0](./samples/e2.png)|
65
+ |![eligen_example_3_mask_27](./samples/e3_m.png)|![eligen_example_3_27](./samples/e3.png)|
66
+ |![eligen_example_4_mask_21](./samples/e4_m.png)|![eligen_example_4_21](./samples/e4.png)|
67
+ |![eligen_example_5_mask_0](./samples/e5_m.png)|![eligen_example_5_0](./samples/e5.png)|
68
+ |![eligen_example_6_mask_8](./samples/e6_m.png)|![eligen_example_6_8](./samples/e6.png)|
69
+
70
+ 3. Demonstration of the robustness of EliGen. The following examples are generated using the same prompt but different random seeds. Refer to [entity_control.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_control.py) `example_7` for the generation prompt.
71
+
72
+ |Entity Conditions|Generated Image|
73
+ |-|-|
74
+ |![eligen_example_7_mask_5](./samples/e7_m.png)|![eligen_example_7_5](./samples/e7_1.png)|
75
+ |![eligen_example_7_mask_5](./samples/e7_m.png)|![eligen_example_7_6](./samples/e7_2.png)|
76
+ |![eligen_example_7_mask_5](./samples/e7_m.png)|![eligen_example_7_7](./samples/e7_3.png)|
77
+ |![eligen_example_7_mask_5](./samples/e7_m.png)|![eligen_example_7_8](./samples/e7_4.png)|
78
+
79
+ ### Image Inpainting
80
+ Demonstration of the inpainting mode in EliGen. See [entity_inpaint.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_inpaint.py) for generation prompts.
81
+ |Inpainting Input|Inpainting Output|
82
+ |-|-|
83
+ |![inpaint_i1](./samples/inpaint_i1.jpg)|![inpaint_o1](./samples/inpaint_o1.png)|
84
+ |![inpaint_i2](./samples/inpaint_i2.png)|![inpaint_o2](./samples/inpaint_o2.png)|
85
+
86
+ ### Styled Entity Control
87
+ Demonstration of styled entity control results using EliGen and IP-Adapter. See [entity_control_ipadapter.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_control_ipadapter.py) for generation prompts.
88
+ |Style Reference|Entity Control Variation 1|Entity Control Variation 2|Entity Control Variation 3|
89
+ |-|-|-|-|
90
+ |![ip_ref](./samples/ip_ref.png)|![ip_1](./samples/ip_1.png)|![ip_2](./samples/ip_2.png)|![ip_3](./samples/ip_3.png)|
91
+
92
+ We also provide a demo of styled entity control results using EliGen with a specific style LoRA. See [styled_entity_control.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/styled_entity_control.py) for details. Below is the visualization of EliGen combined with the [Lego DreamBooth LoRA](https://huggingface.co/merve/flux-lego-lora-dreambooth).
93
+ |![image_1_base](./samples/styled_entity_control_example_1_mask_0.png)|![result1](./samples/styled_entity_control_example_2_mask_0.png)|![result2](./samples/styled_entity_control_example_3_mask_27.png)|![result3](./samples/styled_entity_control_example_4_mask_21.png)|
94
+ |-|-|-|-|
95
+ |![image_1_base](./samples/styled_entity_control_example_5_mask_0.png)|![result1](./samples/styled_entity_control_example_6_mask_8.png)|![result2](./samples/styled_entity_control_example_7_mask_5.png)|![result3](./samples/styled_entity_control_example_7_mask_6.png)|
96
+
97
+ ### Entity Transfer
98
+ Demonstration of the entity transfer results using EliGen and In-Context LoRA. See [entity_transfer.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_transfer.py) for generation prompts.
99
+
100
+ |Entity to Transfer|Transfer Target Image|Transfer Example 1|Transfer Example 2|
101
+ |-|-|-|-|
102
+ |![ic_logo](./samples/ic_logo.jpg)|![ic_target](./samples/ic_target.png)|![ic_1](./samples/ic_1.jpg)|![ic_2](./samples/ic_2.jpg)|
103
+
104
+ ## Citation
105
+ If you find our work helpful, please consider citing us:
106
+ ```
107
+ @article{zhang2025eligen,
108
+ title={Eligen: Entity-level controlled image generation with regional attention},
109
+ author={Zhang, Hong and Duan, Zhongjie and Wang, Xingjun and Chen, Yingda and Zhang, Yu},
110
+ journal={arXiv preprint arXiv:2501.01097},
111
+ year={2025}
112
+ }
113
+ ```
README_from_modelscope.md ADDED
@@ -0,0 +1,143 @@
1
+ ---
2
+ frameworks:
3
+ - Pytorch
4
+ license: Apache License 2.0
5
+ tasks:
6
+ - text-to-image-synthesis
7
+
8
+ #model-type:
9
+ ## e.g. gpt, phi, llama, chatglm, baichuan, etc.
10
+ #- gpt
11
+
12
+ #domain:
13
+ ## e.g. nlp, cv, audio, multi-modal
14
+ #- nlp
15
+
16
+ #language:
17
+ ## list of language codes: https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
18
+ #- cn
19
+
20
+ #metrics:
21
+ ## e.g. CIDEr, BLEU, ROUGE, etc.
22
+ #- CIDEr
23
+
24
+ #tags:
25
+ ## custom tags, including training methods such as pretrained, fine-tuned, instruction-tuned, RL-tuned, and others
26
+ #- pretrained
27
+
28
+ #tools:
29
+ ## e.g. vllm, fastchat, llamacpp, AdaSeq, etc.
30
+ #- vllm
31
+ base_model:
32
+ - black-forest-labs/FLUX.1-dev
33
+ tags:
34
+ - lora
35
+ ---
36
+ # EliGen: Entity-Level Controlled Image Generation
37
+
38
+ ## Introduction
39
+
40
+ We propose EliGen, a novel approach that leverages fine-grained entity-level information to enable precise and controllable text-to-image generation. EliGen excels in tasks such as entity-level controlled image generation and image inpainting, while its applicability is not limited to these areas. Additionally, it can be seamlessly integrated with existing community models, such as the IP-Adapter and In-Context LoRA.
41
+
42
+ * Paper: [EliGen: Entity-Level Controlled Image Generation with Regional Attention](https://arxiv.org/abs/2501.01097)
43
+ * GitHub: [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio)
44
+ * Model: [ModelScope](https://www.modelscope.cn/models/DiffSynth-Studio/Eligen)
45
+ * Online Demo: [ModelScope EliGen Studio](https://www.modelscope.cn/studios/DiffSynth-Studio/EliGen)
46
+ * Training dataset: [ModelScope Dataset](https://www.modelscope.cn/datasets/DiffSynth-Studio/EliGenTrainSet)
47
+
48
+ ## Methodology
49
+
50
+ ![regional-attention](./samples/regional_attention.jpg)
51
+
52
+ We introduce a regional attention mechanism within the DiT framework to process the conditions of each entity. Through regional attention, the local prompt associated with each entity semantically influences its designated region. To further strengthen EliGen's layout control, we construct an entity-annotated dataset and fine-tune the model with LoRA.
53
+
54
+ 1. **Regional Attention**: The regional attention mechanism, shown in the figure above, can be easily applied to other text-to-image models. Its core principle is to transform the positional information of each entity into an attention mask, so that each entity's condition affects only its designated region.
55
+
56
+ 2. **Dataset with Entity Annotation**: To construct a dedicated entity control dataset, we start by randomly selecting captions from DiffusionDB and generating the corresponding source image using Flux. Next, we employ Qwen2-VL 72B, recognized for its advanced grounding capabilities among MLLMs, to randomly identify entities within the image. These entities are annotated with local prompts and bounding boxes for precise localization, forming the foundation of our dataset for further training.
57
+
58
+ 3. **Training**: We utilize LoRA (Low-Rank Adaptation) and DeepSpeed to fine-tune regional attention mechanisms using a curated dataset, enabling our EliGen model to achieve effective entity-level control.
59
+
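The attention mask mentioned in the first item can be pictured with a small sketch. The layout below (entity prompt tokens first, then image tokens; prompt tokens interact only with image tokens inside their own region) is one plausible construction chosen for illustration. It is an assumption, not the exact mask defined in the EliGen paper or implemented in DiffSynth-Studio, and `build_regional_attention_mask` is a hypothetical name.

```python
def build_regional_attention_mask(region_masks, prompt_lengths):
    """Build a boolean mask over the sequence [entity prompt tokens..., image tokens].

    region_masks: one flat list of booleans per entity, one entry per image token.
    prompt_lengths: number of prompt tokens for each entity.
    mask[q][k] is True when query token q may attend to key token k.
    """
    num_image = len(region_masks[0])
    num_text = sum(prompt_lengths)
    total = num_text + num_image
    mask = [[False] * total for _ in range(total)]

    # Assumption: image tokens attend to each other without restriction.
    for q in range(num_text, total):
        for k in range(num_text, total):
            mask[q][k] = True

    start = 0
    for region, length in zip(region_masks, prompt_lengths):
        text_ids = range(start, start + length)
        # Tokens of one local prompt attend within that prompt only.
        for q in text_ids:
            for k in text_ids:
                mask[q][k] = True
        # Prompt tokens and image tokens interact only inside the entity's region.
        for offset, inside in enumerate(region):
            if inside:
                image_id = num_text + offset
                for t in text_ids:
                    mask[t][image_id] = True
                    mask[image_id][t] = True
        start += length
    return mask


# Two entities on a 2x2 grid of image tokens, flattened row by row.
regions = [
    [True, False, False, False],  # entity 0 occupies the top-left token
    [False, False, True, True],   # entity 1 occupies the bottom row
]
mask = build_regional_attention_mask(regions, prompt_lengths=[2, 3])
print(len(mask), "tokens in the joint sequence")
```

In a real model such a boolean mask would typically be converted to additive biases (zero where allowed, a large negative value where blocked) before the softmax; that step is omitted here.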
60
+ ## Usage
61
+ This model was trained using [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio). We recommend using DiffSynth-Studio for generation.
62
+ ```shell
63
+ git clone https://github.com/modelscope/DiffSynth-Studio.git
64
+ cd DiffSynth-Studio
65
+ pip install -e .
66
+ ```
67
+ 1. **Entity-Level Controlled Image Generation**
68
+ EliGen achieves effective entity-level control results. See [entity_control.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_control.py) for usage.
69
+ 2. **Image Inpainting**
70
+ To apply EliGen to image inpainting tasks, we propose an inpainting fusion pipeline that preserves non-inpainted areas while enabling precise, entity-level modifications within inpainted regions.
71
+ See [entity_inpaint.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_inpaint.py) for usage.
72
+ 3. **Styled Entity Control**
73
+ EliGen can be seamlessly integrated with existing community models. We provide an example of integrating it with IP-Adapter. See [entity_control_ipadapter.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_control_ipadapter.py) for usage.
74
+ 4. **Entity Transfer**
75
+ We provide an example of integrating EliGen with In-Context LoRA, achieving interesting entity transfer results. See [entity_transfer.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_transfer.py) for usage.
76
+ 5. **Play with EliGen using UI**
77
+ Download the EliGen checkpoint from [ModelScope](https://www.modelscope.cn/models/DiffSynth-Studio/Eligen) to `models/lora/entity_control` and run the following command to launch the interactive UI:
78
+ ```bash
79
+ python apps/gradio/entity_level_control.py
80
+ ```
81
+ ## Examples
82
+ ### Entity-Level Controlled Image Generation
83
+
84
+ 1. Generating images with continuously changing entity positions.
85
+
86
+ <center>
87
+ <video muted="true" autoplay="true" loop="true" height="512" width="512" src="./samples/video.mp4"></video>
88
+ </center>
89
+
90
+ 2. Image generation results for complex combinations of entities, demonstrating the strong generalization capability of EliGen. See [entity_control.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_control.py) `example_1-6` for the generation prompts.
91
+
92
+ |Entity Conditions|Generated Image|
93
+ |-|-|
94
+ |![eligen_example_1_mask_0](./samples/e1_m.png)|![eligen_example_1_0](./samples/e1.png)|
95
+ |![eligen_example_2_mask_0](./samples/e2_m.png)|![eligen_example_2_0](./samples/e2.png)|
96
+ |![eligen_example_3_mask_27](./samples/e3_m.png)|![eligen_example_3_27](./samples/e3.png)|
97
+ |![eligen_example_4_mask_21](./samples/e4_m.png)|![eligen_example_4_21](./samples/e4.png)|
98
+ |![eligen_example_5_mask_0](./samples/e5_m.png)|![eligen_example_5_0](./samples/e5.png)|
99
+ |![eligen_example_6_mask_8](./samples/e6_m.png)|![eligen_example_6_8](./samples/e6.png)|
100
+
101
+ 3. Demonstration of the robustness of EliGen. The following examples are generated using the same prompt but different random seeds. Refer to [entity_control.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_control.py) `example_7` for the generation prompt.
102
+
103
+ |Entity Conditions|Generated Image|
104
+ |-|-|
105
+ |![eligen_example_7_mask_5](./samples/e7_m.png)|![eligen_example_7_5](./samples/e7_1.png)|
106
+ |![eligen_example_7_mask_5](./samples/e7_m.png)|![eligen_example_7_6](./samples/e7_2.png)|
107
+ |![eligen_example_7_mask_5](./samples/e7_m.png)|![eligen_example_7_7](./samples/e7_3.png)|
108
+ |![eligen_example_7_mask_5](./samples/e7_m.png)|![eligen_example_7_8](./samples/e7_4.png)|
109
+
110
+ ### Image Inpainting
111
+ Demonstration of the inpainting mode in EliGen. See [entity_inpaint.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_inpaint.py) for generation prompts.
112
+ |Inpainting Input|Inpainting Output|
113
+ |-|-|
114
+ |![inpaint_i1](./samples/inpaint_i1.jpg)|![inpaint_o1](./samples/inpaint_o1.png)|
115
+ |![inpaint_i2](./samples/inpaint_i2.png)|![inpaint_o2](./samples/inpaint_o2.png)|
116
+ ### Styled Entity Control
117
+ Demonstration of styled entity control results using EliGen and IP-Adapter. See [entity_control_ipadapter.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_control_ipadapter.py) for generation prompts.
118
+ |Style Reference|Entity Control Variation 1|Entity Control Variation 2|Entity Control Variation 3|
119
+ |-|-|-|-|
120
+ |![ip_ref](./samples/ip_ref.png)|![ip_1](./samples/ip_1.png)|![ip_2](./samples/ip_2.png)|![ip_3](./samples/ip_3.png)|
121
+
122
+ We also provide a demo of styled entity control results using EliGen with a specific style LoRA. See [styled_entity_control.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/styled_entity_control.py) for details. Below is the visualization of EliGen combined with the [Lego DreamBooth LoRA](https://huggingface.co/merve/flux-lego-lora-dreambooth).
123
+ |![image_1_base](./samples/styled_entity_control_example_1_mask_0.png)|![result1](./samples/styled_entity_control_example_2_mask_0.png)|![result2](./samples/styled_entity_control_example_3_mask_27.png)|![result3](./samples/styled_entity_control_example_4_mask_21.png)|
124
+ |-|-|-|-|
125
+ |![image_1_base](./samples/styled_entity_control_example_5_mask_0.png)|![result1](./samples/styled_entity_control_example_6_mask_8.png)|![result2](./samples/styled_entity_control_example_7_mask_5.png)|![result3](./samples/styled_entity_control_example_7_mask_6.png)|
126
+
127
+ ### Entity Transfer
128
+ Demonstration of the entity transfer results using EliGen and In-Context LoRA. See [entity_transfer.py](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/EntityControl/entity_transfer.py) for generation prompts.
129
+
130
+ |Entity to Transfer|Transfer Target Image|Transfer Example 1|Transfer Example 2|
131
+ |-|-|-|-|
132
+ |![ic_logo](./samples/ic_logo.jpg)|![ic_target](./samples/ic_target.png)|![ic_1](./samples/ic_1.jpg)|![ic_2](./samples/ic_2.jpg)|
133
+
134
+ ## Citation
135
+ If you find our work helpful, please consider citing us:
136
+ ```
137
+ @article{zhang2025eligen,
138
+ title={Eligen: Entity-level controlled image generation with regional attention},
139
+ author={Zhang, Hong and Duan, Zhongjie and Wang, Xingjun and Chen, Yingda and Zhang, Yu},
140
+ journal={arXiv preprint arXiv:2501.01097},
141
+ year={2025}
142
+ }
143
+ ```
configuration.json ADDED
@@ -0,0 +1 @@
1
+ {"framework":"Pytorch","task":"text-to-image-synthesis"}
model_bf16.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b2a253f6d7b40754140cf0ed48c99477057b15f44f4f3b40b4debd0e721363bf
3
+ size 612742344
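The three lines above are a Git LFS pointer: the repository stores this small text file, while the actual weights live in LFS storage. As a minimal sketch of how such a pointer can be read, the following parses the fields shown in this diff. The function name `parse_lfs_pointer` is hypothetical, and the parser only handles the three keys that appear here.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hash algorithm, oid, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:b2a253f6d7b40754140cf0ed48c99477057b15f44f4f3b40b4debd0e721363bf
size 612742344
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The `size` field is useful as a quick sanity check after download: a local `model_bf16.safetensors` whose byte length differs from it is likely still a pointer file or a truncated download.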
samples/e1.png ADDED

Git LFS Details

  • SHA256: 2620fd9fa64bc15b436f79ed0c2729c866936a026f1a85f2bb349be17eae8d30
  • Pointer size: 132 Bytes
  • Size of remote file: 1.3 MB
samples/e1_m.png ADDED
samples/e2.png ADDED

Git LFS Details

  • SHA256: 29421c0be0cbc9a63e23982a4ce01ba22a85144e58f6b87b2e8d27ca89d70439
  • Pointer size: 132 Bytes
  • Size of remote file: 1.08 MB
samples/e2_m.png ADDED
samples/e3.png ADDED

Git LFS Details

  • SHA256: ed84a59dfbfe67160bdd892bb5017f6e76c403e3fe052a9eee07b7e6d28b4d4c
  • Pointer size: 132 Bytes
  • Size of remote file: 1.12 MB
samples/e3_m.png ADDED
samples/e4.png ADDED

Git LFS Details

  • SHA256: 240b12e3c8df976b096320d9cf33b515f797a6043c439e84a25ce39b5d5b1847
  • Pointer size: 131 Bytes
  • Size of remote file: 948 kB
samples/e4_m.png ADDED
samples/e5.png ADDED

Git LFS Details

  • SHA256: e29fba5a983b266559612004e22b4d4932bab2cd8388d86cc7c7f93a8f8c717c
  • Pointer size: 132 Bytes
  • Size of remote file: 1.52 MB
samples/e5_m.png ADDED
samples/e6.png ADDED

Git LFS Details

  • SHA256: c5f5b7a0b9536e686aadca067ad248e56b044ec24f8f73ea2a717579cd795a2e
  • Pointer size: 132 Bytes
  • Size of remote file: 1.14 MB
samples/e6_m.png ADDED
samples/e7_1.png ADDED

Git LFS Details

  • SHA256: 0b9b8fbf5c27196f5a796ef811187bdf09c54ceb9f7c488cb530f4df07f63a4c
  • Pointer size: 132 Bytes
  • Size of remote file: 1 MB
samples/e7_2.png ADDED

Git LFS Details

  • SHA256: 846ccca32b106a88346f4ce512175afac001c96f1320e7270e0102c5011e25a6
  • Pointer size: 132 Bytes
  • Size of remote file: 1.05 MB
samples/e7_3.png ADDED

Git LFS Details

  • SHA256: a6d2d30e39dc9ff12c004c28244cf4812fae811f3a2c2812d0f22c8e754ecf58
  • Pointer size: 132 Bytes
  • Size of remote file: 1.06 MB
samples/e7_4.png ADDED

Git LFS Details

  • SHA256: 022f823c282be4ec08f4754d4dc7c3ec699ac9a955e66f6ba58505e74981e417
  • Pointer size: 132 Bytes
  • Size of remote file: 1.01 MB
samples/e7_m.png ADDED
samples/ic_1.jpg ADDED
samples/ic_2.jpg ADDED
samples/ic_logo.jpg ADDED
samples/ic_target.png ADDED

Git LFS Details

  • SHA256: e2f4d5f9172794ca9e8221206d329c677d9c7ae0ef4d9489d1cdfdcb4021370f
  • Pointer size: 132 Bytes
  • Size of remote file: 1.03 MB
samples/inpaint_i1.jpg ADDED

Git LFS Details

  • SHA256: e00e863292f8a5a0ac91d0e5e9f12cbebbe3d2204634be0174b66dba6d71ef9e
  • Pointer size: 131 Bytes
  • Size of remote file: 253 kB
samples/inpaint_i2.png ADDED

Git LFS Details

  • SHA256: 16e3e85556fe42fa401faf860fc189e197b5ef095bb20ca0f1c55f594126da98
  • Pointer size: 132 Bytes
  • Size of remote file: 1.3 MB
samples/inpaint_o1.png ADDED

Git LFS Details

  • SHA256: c35efbe60dc4b2fa4eaf91c1ebf0d41dc3dea3bcc300bb791bec5efc2bd26efb
  • Pointer size: 132 Bytes
  • Size of remote file: 2.36 MB
samples/inpaint_o2.png ADDED

Git LFS Details

  • SHA256: d07ed608b2f6d81fac137056c1776e0cae9f911b1df0864199f84b0ab6137f26
  • Pointer size: 132 Bytes
  • Size of remote file: 1.13 MB
samples/ip_1.png ADDED

Git LFS Details

  • SHA256: 8e853f720d9119d2bf48d4984af494831d23cb623f6bd0b4c8b218e6553d9b20
  • Pointer size: 132 Bytes
  • Size of remote file: 1.01 MB
samples/ip_2.png ADDED

Git LFS Details

  • SHA256: a3428c3af146bd6cabbec875db9828eb68978da3fb262ea6d623d6d92171bb6e
  • Pointer size: 131 Bytes
  • Size of remote file: 983 kB
samples/ip_3.png ADDED

Git LFS Details

  • SHA256: 17365acca35f75384b6a8e3c76e9eef7795ec21d5733da36d7939840fe112dfb
  • Pointer size: 131 Bytes
  • Size of remote file: 961 kB
samples/ip_ref.png ADDED

Git LFS Details

  • SHA256: 5454d28fa063dbbab821d7604b4043a7302977d2cb8bd99938c7c727ec310ba6
  • Pointer size: 132 Bytes
  • Size of remote file: 1.19 MB
samples/regional_attention.jpg ADDED

Git LFS Details

  • SHA256: b598e87fb5f8455d66557bf0bf8822b1f74cb806179bcbaf08a7fe9dd56474b8
  • Pointer size: 131 Bytes
  • Size of remote file: 772 kB
samples/styled_entity_control_example_1_mask_0.png ADDED

Git LFS Details

  • SHA256: 41814d18fc343ed629442d4bde100cca4227654a46c6f8826ac7985e48d09b8d
  • Pointer size: 132 Bytes
  • Size of remote file: 1.02 MB
samples/styled_entity_control_example_2_mask_0.png ADDED

Git LFS Details

  • SHA256: dc5ad2314677acbc23bbba5b7b5821eec5c6ea4bc0f67fe1ef09fa3b37f02688
  • Pointer size: 132 Bytes
  • Size of remote file: 1.29 MB
samples/styled_entity_control_example_3_mask_27.png ADDED

Git LFS Details

  • SHA256: f21e95084830f72aa76b74999ced71fdfafa779ed64fb0e0804ea7505fb56895
  • Pointer size: 132 Bytes
  • Size of remote file: 1.06 MB
samples/styled_entity_control_example_4_mask_21.png ADDED

Git LFS Details

  • SHA256: 2840aa3f416790dc6ea43945ac6aaddccbee4221389a70d6a28fb7f5ffca5774
  • Pointer size: 131 Bytes
  • Size of remote file: 974 kB
samples/styled_entity_control_example_5_mask_0.png ADDED

Git LFS Details

  • SHA256: 6eafd5f362a105fd19a9d0cd7eabd8dab12a83b688d980b13beecfbd4989b8e5
  • Pointer size: 132 Bytes
  • Size of remote file: 1.55 MB
samples/styled_entity_control_example_6_mask_8.png ADDED

Git LFS Details

  • SHA256: 0c8e1914d0ea1141d9a562e9b067a67fdbfd4243ae5aed7442b370b07b034ddc
  • Pointer size: 132 Bytes
  • Size of remote file: 1.08 MB
samples/styled_entity_control_example_7_mask_5.png ADDED

Git LFS Details

  • SHA256: 84faaccdb009be53eb102b0f6c4086f5a9f201831cd41c8e571957c7b09f6443
  • Pointer size: 132 Bytes
  • Size of remote file: 1.01 MB
samples/styled_entity_control_example_7_mask_6.png ADDED

Git LFS Details

  • SHA256: 6767533b466ebbfd00e033446e2ea42d20b5564d82febc3489334bccd5ac8aef
  • Pointer size: 132 Bytes
  • Size of remote file: 1.07 MB
samples/video.mp4 ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7906b41bcb415c11e6a65774e37332125f0b31fd8a0cb94ff0371e57f3323d64
3
+ size 2952060