Improve model card: Add pipeline tag, links, abstract, features, and usage instructions

#2
by nielsr HF Staff - opened

This PR significantly enhances the model card for Ovi, an 8-bit quantized audio-video generation model.

Key improvements include:

  • Adding the pipeline_tag: any-to-any metadata entry to reflect the model's multimodal generative capabilities and make it easier to find on the Hugging Face Hub.
  • Expanding the model description by including the full paper abstract and "Key Features" from the GitHub README.
  • Providing direct links to the paper (Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation), the project page (https://aaxwaz.github.io/Ovi), the GitHub repository (https://github.com/character-ai/Ovi), and a Hugging Face Space demo.
  • Including a detailed "Quick Start" section with installation steps, weight download instructions, configuration options, prompt formatting, and command-line usage examples for single GPU, multi-GPU inference, and Gradio, all directly sourced from the official GitHub repository.
  • Embedding the video demo from the GitHub README.
  • Adding Acknowledgements and Citation sections for proper academic practice.

These updates ensure the model card is informative, user-friendly, and compliant with best practices for documenting AI artifacts.
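For readers unfamiliar with the pipeline_tag change listed above: the Hub reads model card metadata from a YAML block at the top of README.md. The sketch below shows one way such an entry could be added to a card's text using only the Python standard library. The function name add_pipeline_tag and the sample card contents are hypothetical and are not taken from this PR, and the line-based handling is a simplification rather than a full YAML parser.

```python
def add_pipeline_tag(card_text: str, tag: str = "any-to-any") -> str:
    """Return card_text with `pipeline_tag` set in its YAML front matter.

    Hypothetical helper for illustration; it handles only simple,
    line-oriented front matter delimited by `---` lines.
    """
    entry = f"pipeline_tag: {tag}"
    lines = card_text.splitlines()

    if lines and lines[0] == "---":
        try:
            # Locate the closing delimiter of the front matter block.
            end = lines.index("---", 1)
        except ValueError:
            end = None
        if end is not None:
            # Drop any existing pipeline_tag entry, then append the new one.
            header = [ln for ln in lines[1:end] if not ln.startswith("pipeline_tag:")]
            header.append(entry)
            return "\n".join(["---", *header, "---", *lines[end + 1:]]) + "\n"

    # No usable front matter found: create a new block above the card body.
    return "\n".join(["---", entry, "---", "", *lines]) + "\n"


if __name__ == "__main__":
    # Placeholder card text, not the actual Ovi model card.
    sample_card = "---\nlanguage: en\n---\n# Ovi\n"
    print(add_pipeline_tag(sample_card))
```

Applying the function twice leaves the card unchanged after the first call, because any existing pipeline_tag line is replaced rather than duplicated.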

rkfg changed pull request status to merged
Owner

Thank you! I didn't get an e-mail notification for some reason and only noticed this PR now.
