Flex-Judge Collection: models and papers for the work "Flex-Judge: Think Once, Judge Anywhere" • 3 items • Updated Jun 4 • 2
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation Paper • 2507.10524 • Published 20 days ago • 64
Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness Paper • 2505.22960 • Published May 29 • 16
DistiLLM: Towards Streamlined Distillation for Large Language Models Paper • 2402.03898 • Published Feb 6, 2024 • 3
Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning Paper • 2303.11101 • Published Mar 20, 2023 • 1
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation Paper • 2305.11685 • Published May 19, 2023 • 2
DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models Paper • 2305.15194 • Published May 24, 2023
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs Paper • 2503.07067 • Published Mar 10 • 32