133 320

忍者

byteprobe

AI & ML interests

RL | NLP | LLM | LMM | agent

byteprobe's activity

upvoted a paper about 22 hours ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 84

upvoted 4 papers 1 day ago

Reasoning Language Models: A Blueprint

Paper • 2501.11223 • Published 11 days ago • 30

UI-TARS: Pioneering Automated GUI Interaction with Native Agents

Paper • 2501.12326 • Published 9 days ago • 47

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published 10 days ago • 84

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 9 days ago • 76

upvoted a collection 1 day ago

Llama 3.3

This collection hosts the transformers and original repos of the Llama 3.3 • 1 item • Updated Dec 6, 2024 • 124

upvoted 3 papers 1 day ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 8 days ago • 270

Humanity's Last Exam

Paper • 2501.14249 • Published 7 days ago • 45

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published 5 days ago • 39

liked a Space 1 day ago

Qwen2.5 Max Demo

liked 2 models 1 day ago

Qwen/Qwen2.5-14B-Instruct-1M

Text Generation • Updated 1 day ago • 3.85k • 185

deepseek-ai/Janus-Pro-7B

Any-to-Any • Updated 3 days ago • 79.3k • 2.07k

upvoted 2 collections 1 day ago

DeepSeek-V3

3 items • Updated 25 days ago • 158

DeepSeek-R1

8 items • Updated 10 days ago • 301

liked a model 1 day ago

Qwen/Qwen2.5-7B-Instruct-1M

Text Generation • Updated 1 day ago • 15.2k • 147

upvoted 2 collections 1 day ago

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 4 days ago • 87

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 3 items • Updated 4 days ago • 286

liked a model 5 days ago

deepseek-ai/DeepSeek-R1-Distill-Llama-70B

Text Generation • Updated 4 days ago • 92.7k • 382

liked 2 datasets 6 days ago

TIGER-Lab/OmniEdit-Filtered-1.2M

Viewer • Updated Dec 6, 2024 • 1.2M • 9.68k • 66

nvidia/AceMath-Instruct-Training-Data

Viewer • Updated 13 days ago • 5.56M • 2.74k • 38