Alibaba-NLP/WebSailor

Improve model card for WebSailor

by nielsr HF Staff - opened Jul 6

base: refs/heads/main ← from: refs/pr/1

README.md +45 -2

@@ -1,6 +1,49 @@
 ---
 license: apache-2.0
 ---
-🚀🚀🚀coming soon
-More details are presented in https://github.com/Alibaba-NLP/WebAgent

 ---
 license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
 ---
+# WebSailor: Navigating Super-human Reasoning for Web Agent
+This repository contains **WebSailor**, a model introduced in the paper [WebSailor: Navigating Super-human Reasoning for Web Agent](https://huggingface.co/papers/2507.02592).
+**WebSailor** is a complete post-training methodology designed to instill superhuman reasoning capabilities in LLMs, specifically for navigating vast information landscapes and solving extremely complex information-seeking tasks. Its success stems from a reasoning pattern that systematically reduces extreme uncertainty. The approach has three parts: generating novel, high-uncertainty tasks through structured sampling and information obfuscation; a rejection-sampling fine-tuning (RFT) cold start; and an efficient agentic RL training algorithm, Duplicating Sampling Policy Optimization (DUPO). This integrated pipeline allows WebSailor to significantly outperform all open-source agents on complex information-seeking tasks and to match the performance of proprietary agents, closing the capability gap.
+For more details, code, and demos, please refer to the [official GitHub repository](https://github.com/Alibaba-NLP/WebAgent).
+## Features for WebSailor
+WebSailor is designed with the following key features:
+*   A complete post-training methodology that enables models to engage in extended thinking and information seeking, allowing them to complete extremely complex tasks previously considered unsolvable.
+*   **SailorFog-QA**, a scalable QA benchmark with high uncertainty and difficulty, curated with a novel data synthesis method based on graph sampling and information obfuscation.
+*   An effective post-training pipeline consisting of (1) high-quality reconstruction of concise reasoning from expert trajectories for clean supervision, and (2) a two-stage training process: an RFT cold start stage followed by **Duplicating Sampling Policy Optimization (DUPO)**, an efficient agentic RL algorithm.
+## Performance
+WebSailor-72B significantly outperforms all open-source agents and frameworks while closing the performance gap with leading proprietary systems, achieving a score of **12.0%** on BrowseComp-en, **30.1%** on BrowseComp-zh, and **55.4%** on GAIA.
+## Usage
+The checkpoint for WebSailor is currently listed as "coming soon" in the main repository. Once released, detailed usage examples and instructions will be provided here and on the [GitHub repository](https://github.com/Alibaba-NLP/WebAgent).
+## Demos
+Video demonstrations for BrowseComp-en, BrowseComp-zh, and Daily Use scenarios show WebSailor completing difficult, highly uncertain tasks that require extensive information acquisition and complex reasoning. The demos are available on the [project's GitHub repository](https://github.com/Alibaba-NLP/WebAgent#%EF%B8%8F-websailor-demos).
+## Citation
+If this work is helpful, please kindly cite as:
+```bibtex
+@misc{li2025websailor,
+      title={WebSailor: Navigating Super-human Reasoning for Web Agent},
+      author={Kuan Li and Zhongwang Zhang and Huifeng Yin and Liwen Zhang and Litu Ou and Jialong Wu and Wenbiao Yin and Baixuan Li and Zhengwei Tao and Xinyu Wang and Weizhou Shen and Junkai Zhang and Dingchu Zhang and Xixi Wu and Yong Jiang and Ming Yan and Pengjun Xie and Fei Huang and Jingren Zhou},
+      year={2025},
+      eprint={2507.02592},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2507.02592},
+}
+```