Shizhe Diao

shizhediao2

https://shizhediao.github.io/

AI & ML interests

LLM pre-training and reasoning

Recent Activity

upvoted a paper 2 days ago

Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

liked a dataset 4 days ago

shizhediao/SCP-116K-cleaned

new activity 14 days ago

nvidia/Nemotron-Research-Reasoning-Qwen-1.5B: Add library name and pipeline tag

shizhediao2's activity

upvoted a paper 2 days ago

Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Paper • 2506.09991 • Published 8 days ago • 55

upvoted a paper 17 days ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published 20 days ago • 125

upvoted a paper 20 days ago

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published 22 days ago • 42

upvoted a paper 23 days ago

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published 27 days ago • 78

upvoted a paper about 1 month ago

MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly

Paper • 2505.10610 • Published May 15 • 53

upvoted 2 papers 2 months ago

Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models

Paper • 2503.08275 • Published Mar 11 • 3

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 92

upvoted a paper 3 months ago

Scaling Vision Pre-Training to 4K Resolution

Paper • 2503.19903 • Published Mar 25 • 42

upvoted a paper 4 months ago

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 57

upvoted a paper 6 months ago

NVILA: Efficient Frontier Visual Language Models

Paper • 2412.04468 • Published Dec 5, 2024 • 60

upvoted a paper 7 months ago

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published Nov 20, 2024 • 46

upvoted 3 papers 8 months ago

MM-Ego: Towards Building Egocentric Multimodal LLMs

Paper • 2410.07177 • Published Oct 9, 2024 • 22

Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9, 2024 • 71

3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection

Paper • 2410.01647 • Published Oct 2, 2024 • 31

upvoted 4 papers 9 months ago

Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers

Paper • 2409.04109 • Published Sep 6, 2024 • 49