Kaiyan Zhang

iseesaw

AI & ML interests

None yet

Recent Activity

replied to Jaward's post 3 days ago
Fascinating read! Staying bullish on search with RL, which might just help us get rid of hallucination entirely. I really like their approach:
1) <think>on prompt/context && what u know</think>
2) self <search>when u don’t know</search> (iteratively), with no external tool
3) <information>cite sources to support claim(s)</information>
4) <answer>final answer</answer>
Their RL training was done cost-efficiently too; see code: https://github.com/TsinghuaC3I/SSRL
reacted to Jaward's post with 🚀 3 days ago
authored a paper 4 days ago
SSRL: Self-Search Reinforcement Learning

Organizations

TsinghuaC3I

Papers 23

arxiv:2508.10874
arxiv:2508.10308
arxiv:2504.16084
arxiv:2504.00891

Models 0

None public yet

Datasets 0

None public yet