microsoft/GUI-Actor-2B-Qwen2-VL

@@ -1,15 +1,17 @@
 ---
-license: mit
 base_model:
 - Qwen/Qwen2-VL-2B-Instruct
+license: mit
+library_name: transformers
+pipeline_tag: image-text-to-text
 ---
 # GUI-Actor-2B with Qwen2-VL-2B as backbone VLM
-This model was introduced in the paper [**GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents**](https://aka.ms/GUI-Actor).
+This model was introduced in the paper [GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents](https://www.arxiv.org/pdf/2506.03143).
 It is developed based on [Qwen2-VL-2B-Instruct ](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct), augmented by an attention-based action head and finetuned to perform GUI grounding using the dataset [here (coming soon)]().
-For more details on model design and evaluation, please check: [🏠 Project Page](https://aka.ms/GUI-Actor) | [💻 Github Repo](https://github.com/microsoft/GUI-Actor) | [📑 Paper](https://www.arxiv.org/pdf/2506.03143).
+For more details on model design and evaluation, please check: [🏠 Project Page](https://microsoft.github.io/GUI-Actor/) | [💻 Github Repo](https://github.com/microsoft/GUI-Actor) | [📑 Paper](https://www.arxiv.org/pdf/2506.03143).
 | Model Name                                  | Hugging Face Link                         |
 |--------------------------------------------|--------------------------------------------|