Shreyas S K
skshreyas714
AI & ML interests
NLP, NLU, NLI
Read-up research papers
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 122
- S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
  Paper • 2502.12853 • Published • 29
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
  Paper • 2503.05592 • Published • 27
- Self-Taught Self-Correction for Small Language Models
  Paper • 2503.08681 • Published • 15
Models (5)

skshreyas714/AAPL_Team_ACB
Text Generation • 4B • Updated • 11

skshreyas714/qwen2.5-3B-8bit
Updated

skshreyas714/prompt-guard-finetuned
Text Classification • 0.3B • Updated • 3 • 1

skshreyas714/bge-m3-onnx
Feature Extraction • Updated

skshreyas714/lora-trained-xl-colab
Text-to-Image • Updated • 5 • 1