plcedoz38 committed
Commit 01d8e16 · verified · 1 Parent(s): 3ba87c8

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -17,15 +17,15 @@ tags:
 
 ## Model Description
 
-Holo-1 is an Action Vision-Language Model (VLM) developed by [HCompany](https://www.hcompany.ai/) for use in the Surfer-H web agent system. It is designed to interact with web interfaces like a human user.
+Holo1 is an Action Vision-Language Model (VLM) developed by [HCompany](https://www.hcompany.ai/) for use in the Surfer-H web agent system. It is designed to interact with web interfaces like a human user.
 
-As part of a broader agentic architecture, Holo-1 acts as a policy, localizer, or validator, helping the agent understand and act in digital environments.
+As part of a broader agentic architecture, Holo1 acts as a policy, localizer, or validator, helping the agent understand and act in digital environments.
 
-Trained on a mix of open-access, synthetic, and self-generated data, Holo-1 enables state-of-the-art (SOTA) performance on the [WebVoyager](https://arxiv.org/pdf/2401.13919) benchmark, offering the best accuracy/cost tradeoff among current models.
+Trained on a mix of open-access, synthetic, and self-generated data, Holo1 enables state-of-the-art (SOTA) performance on the [WebVoyager](https://arxiv.org/pdf/2401.13919) benchmark, offering the best accuracy/cost tradeoff among current models.
 It also excels in UI localization tasks such as [Screenspot](https://huggingface.co/datasets/rootsautomation/ScreenSpot), [Screenspot-V2](https://huggingface.co/datasets/HongxinLi/ScreenSpot_v2), [Screenspot-Pro](https://huggingface.co/datasets/likaixin/ScreenSpot-Pro), [GroundUI-Web](https://huggingface.co/datasets/agent-studio/GroundUI-1K), and our own newly introduced
 benchmark [WebClick](https://huggingface.co/datasets/Hcompany/WebClick).
 
-Holo-1 is optimized for both accuracy and cost-efficiency, making it a strong open-source alternative to existing VLMs.
+Holo1 is optimized for both accuracy and cost-efficiency, making it a strong open-source alternative to existing VLMs.
 
 For more details, check our paper and our blog post.
 
@@ -86,7 +86,7 @@ We also provide code to reproduce screenspot evaluations: screenspot_eval.py
 
 ### Prepare model, processor
 
-Holo-1 models are based on Qwen2.5-VL architecture, which comes with transformers support. Here we provide a simple usage example.
+Holo1 models are based on Qwen2.5-VL architecture, which comes with transformers support. Here we provide a simple usage example.
 You can load the model and the processor as follows:
 
 ```python
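
The second hunk ends at the opening fence of the README's Python example, so the loading code itself is not part of this diff. For reference, a minimal sketch of what loading a Qwen2.5-VL-based checkpoint with transformers typically looks like is below; the model ID `Hcompany/Holo1-7B` and the `torch_dtype`/`device_map` settings are assumptions for illustration, not taken from this commit.

```python
# Minimal sketch: load a Qwen2.5-VL-based Holo1 checkpoint with transformers.
# The model ID below is an assumption; substitute the actual Holo1 repository
# name from the Hugging Face Hub.
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "Hcompany/Holo1-7B"  # assumed model ID

# Load the weights (auto-selecting dtype and device placement) and the
# matching processor, which handles both image and text preprocessing.
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)
```

From there, inputs for a Qwen2.5-VL-style model are usually built with the processor's chat template and passed to `model.generate`, as the full README example presumably goes on to show.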