Update README.md
README.md
CHANGED
@@ -17,15 +17,15 @@ tags:
 
 ## Model Description
 
-
+Holo1 is an Action Vision-Language Model (VLM) developed by [HCompany](https://www.hcompany.ai/) for use in the Surfer-H web agent system. It is designed to interact with web interfaces like a human user.
 
-As part of a broader agentic architecture,
+As part of a broader agentic architecture, Holo1 acts as a policy, localizer, or validator, helping the agent understand and act in digital environments.
 
-Trained on a mix of open-access, synthetic, and self-generated data,
+Trained on a mix of open-access, synthetic, and self-generated data, Holo1 enables state-of-the-art (SOTA) performance on the [WebVoyager](https://arxiv.org/pdf/2401.13919) benchmark, offering the best accuracy/cost tradeoff among current models.
 It also excels in UI localization tasks such as [Screenspot](https://huggingface.co/datasets/rootsautomation/ScreenSpot), [Screenspot-V2](https://huggingface.co/datasets/HongxinLi/ScreenSpot_v2), [Screenspot-Pro](https://huggingface.co/datasets/likaixin/ScreenSpot-Pro), [GroundUI-Web](https://huggingface.co/datasets/agent-studio/GroundUI-1K), and our own newly introduced
 benchmark [WebClick](https://huggingface.co/datasets/Hcompany/WebClick).
 
-
+Holo1 is optimized for both accuracy and cost-efficiency, making it a strong open-source alternative to existing VLMs.
 
 For more details, check our paper and our blog post.
 

@@ -86,7 +86,7 @@ We also provide code to reproduce screenspot evaluations: screenspot_eval.py
 
 ### Prepare model, processor
 
-
+Holo1 models are based on Qwen2.5-VL architecture, which comes with transformers support. Here we provide a simple usage example.
 You can load the model and the processor as follows:
 
 ```python
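The diff is cut off right after the opening ```python fence, so the README's actual loading snippet is not visible here. Below is a minimal sketch of what loading a Qwen2.5-VL-based checkpoint with transformers typically looks like; the `Hcompany/Holo1-7B` repository id and the `load_holo1` helper are assumptions for illustration, not taken from this diff.

```python
def load_holo1(model_id: str = "Hcompany/Holo1-7B"):
    """Load a Holo1 checkpoint and its processor with transformers.

    NOTE: "Hcompany/Holo1-7B" is an assumed repository id; substitute the
    actual checkpoint name from the model card. Holo1 is built on
    Qwen2.5-VL, so the generic image-text-to-text auto classes resolve
    to the right architecture.
    """
    # Import lazily so merely defining this helper does not require
    # transformers (or a model download) to be available.
    from transformers import AutoModelForImageTextToText, AutoProcessor

    model = AutoModelForImageTextToText.from_pretrained(
        model_id,
        torch_dtype="auto",  # use the dtype recorded in the checkpoint config
        device_map="auto",   # place weights on available GPU(s), else CPU
    )
    processor = AutoProcessor.from_pretrained(model_id)
    return model, processor
```

Calling `load_holo1()` downloads the weights on first use; the returned processor handles both the image preprocessing and the chat-template tokenization expected by the model.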