Improve model card: Add pipeline tag, library name, and official links

This PR enhances the model card by:

- Adding the `pipeline_tag: image-text-to-text` to better categorize the model's functionality (processing images and text to generate text) and improve discoverability on the Hugging Face Hub.
- Specifying `library_name: transformers` to enable direct usage via the Hugging Face `transformers` library and showcase an automated "how to use" widget on the model page.
- Updating the paper link to point to the official Hugging Face Papers page: [Mobile-Agent-v3: Fundamental Agents for GUI Automation](https://huggingface.co/papers/2508.15144).
- Adding a link to the project page: `https://osatlas.github.io/` for more context.

These changes will improve the model's discoverability and usability on the Hugging Face Hub.
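
As a quick sanity check on the metadata described above, the following is a minimal sketch (not part of the PR) that reads a model card's YAML front matter and confirms the new keys are present. The `parse_front_matter` helper and the inline `README` string are hypothetical and written only for illustration; the parser uses the standard library and handles just the flat `key: value` and `key:` plus `- item` shapes that appear in this README, so a real YAML library would be the safer choice for anything more complex.

```python
# Hypothetical helper, for illustration only: parse the flat YAML front
# matter of a model card without third-party dependencies.
README = """---
base_model:
- Qwen/Qwen2.5-VL-32B-Instruct
language:
- en
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
---
# GUI-Owl
"""


def parse_front_matter(text: str) -> dict:
    """Return the front matter as a dict of strings and lists of strings."""
    lines = text.splitlines()
    # Front matter must start on the very first line with '---'.
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    key = None
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter
            break
        if line.startswith("- ") and isinstance(meta.get(key), list):
            # List item belonging to the most recent 'key:' line.
            meta[key].append(line[2:].strip())
        elif ":" in line:
            name, _, value = line.partition(":")
            key = name.strip()
            value = value.strip()
            # An empty value means a list follows on the next lines.
            meta[key] = value if value else []
    return meta


meta = parse_front_matter(README)
for required in ("pipeline_tag", "library_name"):
    print(required, "=", meta.get(required, "<missing>"))
```

The same check could be pointed at the repository's actual `README.md` by reading the file instead of using the inline string.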

Files changed (1)

README.md +14 -12

@@ -1,9 +1,11 @@
 ---
-license: mit
-language:
-- en
 base_model:
 - Qwen/Qwen2.5-VL-32B-Instruct
 ---
 # GUI-Owl
@@ -14,9 +16,10 @@ base_model:
 GUI-Owl is a model series developed as part of the Mobile-Agent-V3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-V2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld. Furthermore, it can be instantiated as various specialized agents within the Mobile-Agent-V3 multi-agent framework to accomplish more complex tasks.
-* **Paper**: [Paper Link](https://github.com/X-PLUG/MobileAgent/blob/main/Mobile-Agent-v3/assets/MobileAgentV3_Tech.pdf)
-* **GitHub Repository**: https://github.com/X-PLUG/MobileAgent
-* **Online Demo**: Comming soon
 ## Performance
@@ -70,10 +73,10 @@ MM_KWARGS=(
     --limit-mm-per-prompt $IMAGE_LIMIT_ARGS
 )
-vllm serve $CKPT \
-    --max-model-len 32768 ${MM_KWARGS[@]} \
-    --tensor-parallel-size $MP_SIZE \
-    --allowed-local-media-path '/' \
     --port 4243
 ```
@@ -89,5 +92,4 @@ If you find our paper and model useful in your research, feel free to give us a
       primaryClass={cs.AI},
       url={https://arxiv.org/abs/2508.15144},
 }
-```

 ---
 base_model:
 - Qwen/Qwen2.5-VL-32B-Instruct
+language:
+- en
+license: mit
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
 # GUI-Owl
 GUI-Owl is a model series developed as part of the Mobile-Agent-V3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-V2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld. Furthermore, it can be instantiated as various specialized agents within the Mobile-Agent-V3 multi-agent framework to accomplish more complex tasks.
+*   **Paper**: [Mobile-Agent-v3: Fundamental Agents for GUI Automation](https://huggingface.co/papers/2508.15144)
+*   **Project Page**: https://osatlas.github.io/
+*   **GitHub Repository**: https://github.com/X-PLUG/MobileAgent
+*   **Online Demo**: Coming soon
 ## Performance
     --limit-mm-per-prompt $IMAGE_LIMIT_ARGS
 )
+vllm serve $CKPT \
+    --max-model-len 32768 ${MM_KWARGS[@]} \
+    --tensor-parallel-size $MP_SIZE \
+    --allowed-local-media-path '/' \
     --port 4243
 ```
       primaryClass={cs.AI},
       url={https://arxiv.org/abs/2508.15144},
 }
+```