Dhruv PRO
dhruv3006
Recent Activity
published an article
6 days ago
Moondream3 and Salesforce GTA-1 for UI grounding in computer-use agents
reacted to their post
6 days ago
Computer Use with Sonnet 4.5
We ran one of our hardest computer-use benchmarks on Anthropic Sonnet 4.5, side-by-side with Sonnet 4.
Ask: "Install LibreOffice and make a sales table".
Sonnet 4.5: 214 turns, clean trajectory
Sonnet 4: 316 turns, major detours
The difference shows up in multi-step sequences where errors compound.
32% fewer turns (316 down to 214) in just 2 months. From struggling with file extraction to executing complex workflows end-to-end. Computer-use agents are improving faster than most people realize.
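For anyone checking the 32% figure: it appears to match the relative drop in turn count between the two runs. A minimal sketch, assuming that is how the number was derived (the post does not state the formula, and the function name here is just illustrative):

```python
def turn_reduction(baseline_turns: int, new_turns: int) -> float:
    """Return the fractional reduction in turns relative to the baseline run."""
    if baseline_turns <= 0:
        raise ValueError("baseline_turns must be positive")
    return (baseline_turns - new_turns) / baseline_turns


# Turn counts reported in the post for the same task.
sonnet_4_turns = 316
sonnet_4_5_turns = 214

# Assumption: "efficiency gain" means fewer turns relative to Sonnet 4.
reduction = turn_reduction(sonnet_4_turns, sonnet_4_5_turns)
print(f"{reduction:.1%} fewer turns")  # prints "32.3% fewer turns"
```

Turn count is only one proxy for efficiency; it says nothing about wall-clock time or token cost per turn.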
Anthropic Sonnet 4.5 and the most comprehensive catalog of VLMs for computer-use are available in our open-source framework.
Start building: https://github.com/trycua/cua