Sorry, the link is fixed now! They support text, image and audio, while most embedding models only support one or two modalities.
E.g. if you make music, you can embed your files and search for samples by description, or by making the sound with your mouth. Or take a drawing of a monster and search for the sound that monster would make when you're creating a game. Larger models generally provide better embeddings, but embeddings taken from generative (GPT-style) models like Qwen3.5 are generally poor. Qwen's latest dedicated embedding models are the Qwen3-VL-Embedding models, but they don't cover all three modalities.
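To make the search idea concrete, here's a rough sketch of what happens once every file has an embedding: you embed the query (a text description, a drawing, a mouth-made sound) and rank files by cosine similarity. The vectors and file names below are made-up toy values, not real model output, and `search` is just a hypothetical helper; in practice the vectors would come from whichever embedding model you use.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_k]]

# Toy 3-dimensional "embeddings" (assumed values for illustration only).
# Real embeddings have hundreds or thousands of dimensions, but because
# text, image and audio share one vector space, the ranking step is the same.
index = {
    "kick_drum.wav": [0.9, 0.1, 0.0],
    "monster_growl.wav": [0.1, 0.9, 0.2],
    "monster_sketch.png": [0.2, 0.8, 0.3],
}

# Pretend this is the embedding of the text query "scary monster sound".
query = [0.0, 1.0, 0.2]

results = search(query, index, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

For a big library you'd swap the linear scan for a vector index, but for a few thousand files this is usually fast enough.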
Markus PRO
replied to their post about 11 hours ago
The hidden gem of open-source embedding models: LCO-Embedding
for text, image AND audio!
I found this model after reading the recent Massive Audio Embedding Benchmark (MAEB) paper, as it blew the other models out of the water on day zero. I've been using it personally for about a week, and searching my files by describing music, sound effects or images is both practical and entertaining. Really underrated model, would highly recommend checking it out: https://huggingface.co/LCO-Embedding/LCO-Embedding-Omni-7B
PS: If you're looking to run this model on llama.cpp, I've gone ahead and quantized the models for you here: https://huggingface.co/collections/marksverdhei/lco-embedding-omni-gguf
upvoted a paper 3 days ago
Scaling Language-Centric Omnimodal Representation Learning
reacted to their post with 🤯 3 days ago
