Sugato Ray PRO
sugatoray
AI & ML interests
None yet
Recent Activity
updated a collection about 3 hours ago: LLMs-EmbeddingModels
liked a model about 3 hours ago: ibm-granite/granite-embedding-english-r2
upvoted an article about 16 hours ago: From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels
Organizations
RLMs (Reasoning Language Models)
- LADDER: Self-Improving LLMs Through Recursive Problem Decomposition
  Paper • 2503.00735 • Published • 23
- START: Self-taught Reasoner with Tools
  Paper • 2503.04625 • Published • 114
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
  Paper • 2503.05592 • Published • 27
- R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning
  Paper • 2503.05379 • Published • 39
Reasoning Datasets
Document AI
LLMs
Collection of LLMs
AV LLMs
A collection of Audio, Video and Visual LLMs.
Papers
Large Language Model (LLM) and NLP related papers.
- LoRA+: Efficient Low Rank Adaptation of Large Models
  Paper • 2402.12354 • Published • 6
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 22
- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 13
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 70
Papers-MoE
Papers on Mixture of Experts (MoE)
- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
  Paper • 2403.07816 • Published • 44
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 29
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
  Paper • 2401.15947 • Published • 54
- Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models
  Paper • 2403.03432 • Published • 1
LLM LLAMA3
- meta-llama/Meta-Llama-3-8B
  Text Generation • 8B • Updated • 1.02M • 6.28k
- meta-llama/Meta-Llama-3-8B-Instruct
  Text Generation • 8B • Updated • 1.04M • 4.14k
- mlx-community/Meta-Llama-3-8B-Instruct-4bit
  Text Generation • 2B • Updated • 3.98k • 78
- mlabonne/Meta-Llama-3-120B-Instruct
  Text Generation • 122B • Updated • 14 • 201
TFM: TimeSeries Foundation Models
LLMs-EmbeddingModels
Select Embedding Models Collection.
LLM + Datasets : Finance
Marimo
- marimo app template
  Running • 12
  🍃 Template for deploying a marimo application to HF
- Bulk
  Sleeping • 2
  🍃 A bulk labelling interface for binary text classification
- marimo server template
  Running • 5
  📝 A marimo Space to edit marimo notebooks
- Fast-Bulk
  Running • 8
  🍃 A bulk labelling interface for binary text classification
Books And Notes
SmolAgents Tools (Spaces)
Bookmark::Models
- madhurjindal/autonlp-Gibberish-Detector-492513457
  Text Classification • 0.1B • Updated • 223k • 62
- answerdotai/ModernBERT-base
  Fill-Mask • 0.1B • Updated • 938k • 920
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  Paper • 2412.13663 • Published • 154
- answerdotai/ModernBERT-large
  Fill-Mask • 0.4B • Updated • 111k • 419
LLM Tools
A collection of tools as various HF Spaces on LLMs.
LLM Training Datasets
A collection of datasets for training LLMs.
Leaderboards 🔥
A collection of Leaderboards for LLMs ⚡️⚖️ 🤗
- LMArena Leaderboard
  Running • 4.59k
  🏆 Display LMArena Leaderboard
- Open LLM Leaderboard
  Running on CPU Upgrade • 13.4k
  🏆 Track, rank and evaluate open LLMs and chatbots
- Yet Another LLM Leaderboard
  Running • 190
  🌖 Run a Streamlit web app
- Hallucinations Leaderboard
  Runtime error • 144
  🔥 View and submit LLM evaluations
Papers-LLMEval
- Latxa: An Open Language Model and Evaluation Suite for Basque
  Paper • 2403.20266 • Published • 3
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 70
- Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
  Paper • 2405.01535 • Published • 124
- Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
  Paper • 2405.08707 • Published • 33
Papers-Fundamentals
- RoFormer: Enhanced Transformer with Rotary Position Embedding
  Paper • 2104.09864 • Published • 14
- Attention Is All You Need
  Paper • 1706.03762 • Published • 79
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 62
- Zero-Shot Tokenizer Transfer
  Paper • 2405.07883 • Published • 5
Papers-Benchmarks
- CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
  Paper • 2406.08587 • Published • 16
- Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning
  Paper • 2406.09170 • Published • 28
- AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
  Paper • 2407.18901 • Published • 35
- Benchmarking Agentic Workflow Generation
  Paper • 2410.07869 • Published • 28
LLMs + Mamba
Papers + RL/Reasoning
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  Paper • 2503.14476 • Published • 137
- VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks
  Paper • 2504.05118 • Published • 25
- SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
  Paper • 2504.08600 • Published • 30
- A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce
  Paper • 2504.11343 • Published • 19