Bofei Zhang PRO

Bofeee5675

https://bofei5675.github.io/

AI & ML interests

Vision-language models, agentic tasks, and computer use

Recent Activity

liked a dataset 10 days ago

xlangai/Jedi

Preview • Updated Jun 27 • 420 • 13

liked a dataset 19 days ago

HuggingFaceM4/the_cauldron

Viewer • Updated May 6, 2024 • 1.88M • 27.4k • 484

updated a dataset 24 days ago

Bofeee5675/GUI-Net-Benchmark

Preview • Updated 24 days ago • 36 • 2

updated a dataset 25 days ago

Bofeee5675/GUI-Net-Crawler

Updated 25 days ago • 162 • 2

upvoted a paper 26 days ago

TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models

Paper • 2506.03099 • Published Jun 3 • 14

updated a model about 1 month ago

Bofeee5675/GUI-Net-Crawler

Updated Jul 1

published a model about 1 month ago

Bofeee5675/GUI-Net-Crawler

Updated Jul 1

liked a dataset about 1 month ago

Bofeee5675/GUI-Net-Crawler

Updated 25 days ago • 162 • 2

updated a collection about 1 month ago

TongUI

Collection

Open-source release of our work, TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials; https://github.com/TongUI-agent/TongUI-agent • 10 items • Updated Jul 1 • 3

published a dataset about 1 month ago

Bofeee5675/GUI-Net-Crawler

Updated 25 days ago • 162 • 2

updated 2 models about 1 month ago

Bofeee5675/TongUI-7B

8B • Updated Jun 30 • 238 • 3

Bofeee5675/TongUI-3B

4B • Updated Jun 26 • 766 • 2

liked a model about 2 months ago

PengxiangLi/MAT-Qwen2VL-7B-Lora

Updated Mar 20 • 4 • 1

authored 4 papers about 2 months ago

FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models

Paper • 2407.11522 • Published Jul 16, 2024 • 9

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

Paper • 2412.15606 • Published Dec 20, 2024 • 2

TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials

Paper • 2504.12679 • Published Apr 17 • 1

Chain-of-Focus: Adaptive Visual Search and Zooming for Multimodal Reasoning via RL

Paper • 2505.15436 • Published May 21 • 1

updated a collection about 2 months ago

TongUI

Collection

Open-source release of our work, TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials; https://github.com/TongUI-agent/TongUI-agent • 10 items • Updated Jul 1 • 3

upvoted a paper about 2 months ago

TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials

Paper • 2504.12679 • Published Apr 17 • 1

liked a Space about 2 months ago

AI Deadlines

⚡ Manage project deadlines efficiently • 451
