🔥 Why it’s cool:
- Achieves high-quality, multi-task image editing
- Uses only 1% of the training parameters and 0.1% of the training data compared to existing methods, making it extremely efficient
- Beats several commercial models on background preservation, ID control, and consistency
- Open-source, low-cost, faster, and stronger: think of it as the “DeepSeek of image editing” 👀
We also built a Gradio demo app, available directly in our GitHub repo, and made a flashy demo video that we're happy to send your way!
MiniCPM-V 4.5 🚀
New MLLM for image, multi-image & video understanding that runs even on your phone, released by OpenBMB: openbmb/MiniCPM-V-4_5
✨ SOTA vision-language capability
✨ 96× video token compression, enabling high-FPS & long video reasoning
✨ Switchable fast vs. deep thinking modes
✨ Strong OCR and document parsing, with support for 30+ languages