Update model card for CodeGoat24/UnifiedReward-Think-qwen-7b (Pref-GRPO reward model)
#1, opened by nielsr (HF Staff)
This PR updates the model card for CodeGoat24/UnifiedReward-Think-qwen-7b to reflect its role as the pairwise preference reward model in the paper "Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning".
It includes the following improvements:
- Updates the model summary to clarify the model's association with the Pref-GRPO paper.
- Links directly to the Pref-GRPO paper on Hugging Face.
- Updates the project page link to point to the Pref-GRPO-specific page.
- Adds a link to the Pref-GRPO GitHub repository.
- Adds the `pipeline_tag: image-text-to-text` to improve discoverability on the Hub for multimodal evaluation tasks.
- Adds the `library_name: transformers` tag to enable the automated "How to use" code snippet.
- Corrects the "Quick Start" code snippet by adding the missing `import requests`.
- Updates the citation block to reference the Pref-GRPO paper.
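
For reference, the two metadata additions listed above would sit in the model card's YAML front matter roughly as follows. This is a sketch showing only those two keys; any other fields already present in the card (license, base model, datasets, and so on) are omitted here and are not changed by this description.

```yaml
---
# Metadata block at the top of README.md (only the keys added by this PR are shown)
pipeline_tag: image-text-to-text   # task category used for Hub filtering and discovery
library_name: transformers         # lets the Hub generate the "How to use" snippet
---
```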
Please review and merge this PR if everything looks good.