Sayambhu Sen
Testerpce
World model
Compression
Test time scaling
SAE
Theory and Representation learning
Graph
Search
Diversity
Self correction
Speech
- Slamming: Training a Speech Language Model on One GPU in a Day
  Paper • 2502.15814 • Published • 70
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM
  Paper • 2503.04724 • Published • 69
- Audio-Aware Large Language Models as Judges for Speaking Styles
  Paper • 2506.05984 • Published • 14
- Optimizing Multilingual Text-To-Speech with Accents & Emotions
  Paper • 2506.16310 • Published • 22
Synthetic data
MoE
- Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
  Paper • 2410.10814 • Published • 52
- Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
  Paper • 2502.16894 • Published • 31
- Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs
  Paper • 2506.14731 • Published • 9
- SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation
  Paper • 2506.18349 • Published • 10
Markov chain
Planning
- Compositional Foundation Models for Hierarchical Planning
  Paper • 2309.08587 • Published • 11
- ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models
  Paper • 2405.09220 • Published • 29
- WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
  Paper • 2504.15785 • Published • 19
Multilingual
- CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
  Paper • 2309.09400 • Published • 85
- Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages
  Paper • 2401.05811 • Published • 8
- Is Preference Alignment Always the Best Option to Enhance LLM-Based Translation? An Empirical Analysis
  Paper • 2409.20059 • Published • 17
- Are Character-level Translations Worth the Wait? Comparing Character- and Subword-level Models for Machine Translation
  Paper • 2302.14220 • Published
Partial layer training LLMs
- Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
  Paper • 2309.08968 • Published • 23
- GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
  Paper • 2505.20355 • Published • 36
- Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
  Paper • 2505.22618 • Published • 42
Evaluation
Math
- Transformers Can Do Arithmetic with the Right Embeddings
  Paper • 2405.17399 • Published • 55
- Solving Inequality Proofs with Large Language Models
  Paper • 2506.07927 • Published • 20
- Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
  Paper • 2507.00432 • Published • 46
Style transfer
Reinforcement learning
- Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
  Paper • 2407.20798 • Published • 25
- Offline Reinforcement Learning for LLM Multi-Step Reasoning
  Paper • 2412.16145 • Published • 39
- REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
  Paper • 2501.03262 • Published • 100
- SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
  Paper • 2502.18449 • Published • 74
Knowledge
- Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models
  Paper • 2408.15915 • Published • 20
- ReLearn: Unlearning via Learning for Large Language Models
  Paper • 2502.11190 • Published • 29
- ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation
  Paper • 2503.21729 • Published • 28
- Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?
  Paper • 2504.00509 • Published • 22
Vision
Code
Data
Memory
Applications and Uses
Adversarial
Multimodal
- Qwen2.5-Omni Technical Report
  Paper • 2503.20215 • Published • 161
- Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO
  Paper • 2505.22453 • Published • 45
- UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning
  Paper • 2505.23380 • Published • 23
- More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models
  Paper • 2505.21523 • Published • 14
Interpretable
- I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
  Paper • 2503.18878 • Published • 119
- Large Language Models are Locally Linear Mappings
  Paper • 2505.24293 • Published • 15
- Thought Anchors: Which LLM Reasoning Steps Matter?
  Paper • 2506.19143 • Published • 9
Diffusion
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
  Paper • 2503.09573 • Published • 72
- Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
  Paper • 2505.15045 • Published • 54
- Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
  Paper • 2505.16990 • Published • 21
- D-AR: Diffusion via Autoregressive Models
  Paper • 2505.23660 • Published • 34
Information_retrieval
- Rank1: Test-Time Compute for Reranking in Information Retrieval
  Paper • 2502.18418 • Published • 27
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 32
- Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval
  Paper • 2505.16967 • Published • 23
Attention
Agent
- Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
  Paper • 2411.03562 • Published • 69
- Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning
  Paper • 2502.06060 • Published • 38
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents
  Paper • 2502.14499 • Published • 192
- SurveyX: Academic Survey Automation via Large Language Models
  Paper • 2502.14776 • Published • 100
RAG
- StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
  Paper • 2410.08815 • Published • 49
- SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
  Paper • 2412.15443 • Published • 10
- RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement
  Paper • 2412.12881 • Published • 2
- AR-RAG: Autoregressive Retrieval Augmentation for Image Generation
  Paper • 2506.06962 • Published • 28
Prompt papers
Sparsity
State space LLM
Reasoning
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 39
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 55
- Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
  Paper • 2411.04282 • Published • 36
- Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
  Paper • 2411.14432 • Published • 26
Fine tuning
- When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
  Paper • 2402.17193 • Published • 26
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
  Paper • 2410.23743 • Published • 64
- Direct Preference Optimization Using Sparse Feature-Level Constraints
  Paper • 2411.07618 • Published • 16
- Transformer^2: Self-adaptive LLMs
  Paper • 2501.06252 • Published • 55
Dataset and Data processing
- Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
  Paper • 2405.20541 • Published • 24
- RedPajama: an Open Dataset for Training Large Language Models
  Paper • 2411.12372 • Published • 57
- Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
  Paper • 2503.22230 • Published • 44
Video understanding
- Wolf: Captioning Everything with a World Summarization Framework
  Paper • 2407.18908 • Published • 33
- Mixture of Nested Experts: Adaptive Processing of Visual Tokens
  Paper • 2407.19985 • Published • 37
- TPDiff: Temporal Pyramid Video Diffusion Model
  Paper • 2503.09566 • Published • 46
- DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO
  Paper • 2506.07464 • Published • 10
Long context
- Writing in the Margins: Better Inference Pattern for Long Context Retrieval
  Paper • 2408.14906 • Published • 143
- DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
  Paper • 2410.10819 • Published • 8
- LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models
  Paper • 2410.09342 • Published • 40
- PDFTriage: Question Answering over Long, Structured Documents
  Paper • 2309.08872 • Published • 54
Vision Language Action models