Update README.md
#1
by
						
sjtuzc
	
							
						- opened
							
					
    	
        README.md
    CHANGED
    
    | @@ -1,3 +1,54 @@ | |
| 1 | 
            -
            ---
         | 
| 2 | 
            -
             | 
| 3 | 
            -
             | 
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | |
|  | 
|  | |
| 1 | 
            +
            ---
         | 
| 2 | 
            +
            library_name: transformers
         | 
| 3 | 
            +
            license: apache-2.0
         | 
| 4 | 
            +
            pipeline_tag: image-text-to-text
         | 
| 5 | 
            +
            tags:
         | 
| 6 | 
            +
            - multimodal
         | 
| 7 | 
            +
            - gui
         | 
| 8 | 
            +
            ---
         | 
| 9 | 
            +
             | 
| 10 | 
            +
            # MobiMind-Mixed-7B Model
         | 
| 11 | 
            +
             | 
| 12 | 
            +
            This is the Mixed Model of [MobiAgent](https://github.com/IPADS-SAI/MobiAgent) with 7B parameters, having the abilities of both the [MobiMind-Decider](https://huggingface.co/IPADS-SAI/MobiMind-Decider-7B/) and the [MobiMind-Grounder](https://huggingface.co/IPADS-SAI/MobiMind-Grounder-3B/) presented in the paper [MobiAgent: A Systematic Framework for Customizable Mobile Agents](https://huggingface.co/papers/2509.00531).
         | 
| 13 | 
            +
             | 
| 14 | 
            +
            ## Paper Abstract
         | 
| 15 | 
            +
             | 
| 16 | 
            +
            With the rapid advancement of Vision-Language Models (VLMs), GUI-based mobile agents have emerged as a key development direction for intelligent mobile systems. However, existing agent models continue to face significant challenges in real-world task execution, particularly in terms of accuracy and efficiency. To address these limitations, we propose MobiAgent, a comprehensive mobile agent system comprising three core components: the MobiMind-series agent models, the AgentRR acceleration framework, and the MobiFlow benchmarking suite. Furthermore, recognizing that the capabilities of current mobile agents are still limited by the availability of high-quality data, we have developed an AI-assisted agile data collection pipeline that significantly reduces the cost of manual annotation. Compared to both general-purpose LLMs and specialized GUI agent models, MobiAgent achieves state-of-the-art performance in real-world mobile scenarios.
         | 
| 17 | 
            +
             | 
| 18 | 
            +
            ## About MobiAgent
         | 
| 19 | 
            +
             | 
| 20 | 
            +
            **MobiAgent** is a powerful mobile agent system including:
         | 
| 21 | 
            +
             | 
| 22 | 
            +
            *   **An agent model family**: MobiMind
         | 
| 23 | 
            +
            *   **An agent acceleration framework**: AgentRR
         | 
| 24 | 
            +
            *   **An agent benchmark**: MobiFlow
         | 
| 25 | 
            +
             | 
| 26 | 
            +
            **System Architecture:**
         | 
| 27 | 
            +
             | 
| 28 | 
            +
            <div align="center">
         | 
| 29 | 
            +
            <p align="center">
         | 
| 30 | 
            +
              <img src="https://raw.githubusercontent.com/IPADS-SAI/MobiAgent/main/assets/arch.png" width="100%"/>
         | 
| 31 | 
            +
            </p>
         | 
| 32 | 
            +
            </div>
         | 
| 33 | 
            +
             | 
| 34 | 
            +
            ## Evaluation Results
         | 
| 35 | 
            +
             | 
| 36 | 
            +
            <table>
         | 
| 37 | 
            +
              <tr>
         | 
| 38 | 
            +
                <td><img src="https://raw.githubusercontent.com/IPADS-SAI/MobiAgent/main/assets/result1.png" width="100%"/></td>
         | 
| 39 | 
            +
                <td><img src="https://raw.githubusercontent.com/IPADS-SAI/MobiAgent/main/assets/result2.png" width="100%"/></td>
         | 
| 40 | 
            +
                <td><img src="https://raw.githubusercontent.com/IPADS-SAI/MobiAgent/main/assets/result3.png" width="100%"/></td>
         | 
| 41 | 
            +
              </tr>
         | 
| 42 | 
            +
            </table>
         | 
| 43 | 
            +
             | 
| 44 | 
            +
            ## Usage
         | 
| 45 | 
            +
             | 
| 46 | 
            +
            Deploy model inference service with vLLM:
         | 
| 47 | 
            +
             | 
| 48 | 
            +
            ```bash
         | 
| 49 | 
            +
            vllm serve IPADS-SAI/MobiMind-Mixed-7B
         | 
| 50 | 
            +
            ```
         | 
| 51 | 
            +
             | 
| 52 | 
            +
            It simultaneously serves as the Decider and the Grounder, i.e., the requests for **both tasks** can be routed to this model.
         | 
| 53 | 
            +
             | 
| 54 | 
            +
            For more usage details, e.g., execute GUI tasks with ADB or our Android App, please refer to our [repo](https://github.com/IPADS-SAI/MobiAgent)!
         | 
