- AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection (arXiv:2505.07293, May 2025)
- Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset (arXiv:2412.02595, Dec 3, 2024)
- NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning (arXiv:2504.13941, Apr 2025)
- A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis (arXiv:2504.12322, Apr 11, 2025)
- Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems? (arXiv:2504.00509, Apr 1, 2025)
- How to Get Your LLM to Generate Challenging Problems for Evaluation (arXiv:2502.14678, Feb 20, 2025)
- Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective (arXiv:2502.17262, Feb 24, 2025)
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (arXiv:2502.02737, Feb 4, 2025)
- MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion (arXiv:2502.04235, Feb 6, 2025)