metadata
datasets:
  - ILSVRC/imagenet-1k
  - visual-layer/imagenet-1k-vl-enriched
license: mit
pipeline_tag: text-to-image

The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation

This model is trained on ImageNet-32x32 text-to-image with conditional minibatch optimal transport data coupling.

Paper: The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation

Project Page: https://hkchengrex.github.io/C2OT

Code: https://github.com/hkchengrex/C2OT

[Figure: 8GtoMoons]

High-Level Summary

C2OT is an algorithm for computing prior-to-data couplings during the training of flow-matching-based generative models. Our goal is to achieve straighter flows, enabled by optimal transport (OT) couplings, while mitigating the test-time degradation that OT encounters in the conditional setting. The key idea is that standard OT disregards the condition when pairing samples, so the model trains on a condition-skewed prior distribution but must sample from the full, unskewed prior at test time. C2OT unskews the prior by incorporating a condition-dependent term into the OT cost.
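To make the idea of a condition-dependent cost term concrete, here is a minimal sketch of a minibatch coupling for discrete conditions. It is an illustration under stated assumptions, not the implementation from the repository. The function name `conditional_ot_pairing` and the finite `penalty` value are hypothetical. The sketch also assumes that each prior sample starts out paired with the condition of the data sample at the same batch index, and it uses SciPy's `linear_sum_assignment` as the exact assignment solver.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def conditional_ot_pairing(x0, x1, cond, penalty=1e6):
    """Pair prior samples with data samples inside one minibatch.

    x0:   (n, ...) prior samples (e.g. Gaussian noise)
    x1:   (n, ...) data samples
    cond: (n,) integer condition labels; cond[i] belongs to x1[i] and,
          by assumption in this sketch, also to x0[i] before re-pairing
    Returns perm such that x0[perm[j]] is coupled with x1[j].
    """
    n = x0.shape[0]
    # Squared Euclidean cost between every prior sample i and data sample j.
    diff = x0.reshape(n, 1, -1) - x1.reshape(1, n, -1)
    cost = (diff ** 2).sum(axis=-1)
    # Condition-dependent term (assumed form): a large penalty whenever
    # the two samples carry different condition labels.
    mismatch = cond[:, None] != cond[None, :]
    cost = cost + penalty * mismatch
    # Exact minimum-cost one-to-one assignment over the minibatch.
    row, col = linear_sum_assignment(cost)
    perm = np.empty(n, dtype=int)
    perm[col] = row  # data sample col[k] receives prior sample row[k]
    return perm


rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 2))           # prior samples
x1 = rng.standard_normal((8, 2)) + 3.0     # toy "data" samples
cond = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # toy class labels
perm = conditional_ot_pairing(x0, x1, cond)
x0_coupled = x0[perm]  # x0_coupled[j] is the prior sample paired with x1[j]
```

With `penalty=0` this reduces to plain minibatch OT, which is free to pair prior samples across conditions. With a large penalty, the re-pairing stays within each condition, so the set of prior samples seen under a given condition is not skewed by the transport step. See the GitHub repo for the actual training code.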

Usage

See the GitHub repo: https://github.com/hkchengrex/C2OT

Citation

If you use C$^2$OT in your research, please cite the original paper:

@inproceedings{cheng2025curse,
  title={The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation},
  author={Cheng, Ho Kei and Schwing, Alexander},
  booktitle={ICCV},
  year={2025}
}