Post 209 QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition Paper • 2512.15603 • Published 7 days ago • 54 • 9
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 22 days ago • 228 • 6
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights Paper • 2512.01816 • Published 23 days ago • 88 • 5
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27 • 96
SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation Paper • 2510.06303 • Published Oct 7 • 15
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention Paper • 2510.04212 • Published Oct 5 • 23
AInstein: Assessing the Feasibility of AI-Generated Approaches to Research Problems Paper • 2510.05432 • Published Oct 6 • 6 • 4
REPAIR: Robust Editing via Progressive Adaptive Intervention and Reintegration Paper • 2510.01879 • Published Oct 2 • 8