White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is? Paper • 2311.13110 • Published Nov 22, 2023
Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models Paper • 2306.05272 • Published Jun 8, 2023
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs Paper • 2504.15280 • Published Apr 21, 2025
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025