Kaiyan Zhang
iseesaw
5 followers · 3 following
AI & ML interests
None yet
Recent Activity
replied to Jaward's post 3 days ago
fascinating read! staying bullish on search with rl might just help us get rid of hallucination entirely. I really like their approach:
1) <think>on prompt/context && what u know </think>
2) self <search>when u don’t know</search> (iteratively) with no external tool
3) <information>cite sources to support claim(s)</information>
4) <answer>final answer</answer>
their rl training was done cost efficiently too, see code: https://github.com/TsinghuaC3I/SSRL
reacted to Jaward's post with 🚀 3 days ago
authored a paper 4 days ago
SSRL: Self-Search Reinforcement Learning
Organizations
Papers (23)
arxiv:2508.10874
arxiv:2508.10308
arxiv:2504.16084
arxiv:2504.00891
Models (0)
None public yet
Datasets (0)
None public yet