Submitted by HowieHwong 38 On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective · 66 authors 2
Submitted by myownskyW7 33 SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation · 9 authors 2
Submitted by Hao605 33 RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning · 14 authors 2
Submitted by akhaliq 25 Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering · 3 authors 4
Submitted by Guanzheng 22 LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization · 4 authors 2
Submitted by michaelzhiluo 15 Autellix: An Efficient Serving Engine for LLM Agents as General Programs · 11 authors 2
Submitted by YuchengShi 10 SearchRAG: Can Search Engines Be Helpful for LLM-based Medical Question Answering? · 7 authors 2
Submitted by cooperleong00 9 Why Safeguarded Ships Run Aground? Aligned Large Language Models' Safety Mechanisms Tend to Be Anchored in The Template Region · 4 authors 2
Submitted by DrishtiSharma 6 InfiR: Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning · 20 authors 2
Submitted by acharkq 6 NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation · 10 authors 2
Submitted by yuliang03181 6 AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence · 13 authors 2
Submitted by junzhang98 5 Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models · 9 authors 2
Submitted by mmhamdy 4 From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions · 8 authors 3
Submitted by danny911kr 3 REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation · 5 authors 2
Submitted by fdschmidt93 3 MVL-SIB: A Massively Multilingual Vision-Language Benchmark for Cross-Modal Topical Matching · 4 authors 2
Submitted by oneonlee 3 REFIND: Retrieval-Augmented Factuality Hallucination Detection in Large Language Models · 2 authors 2
Submitted by floschne 3 GIMMICK -- Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking · 4 authors 2
Submitted by hyp1231 3 ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation · 8 authors 3
Submitted by nbalepur 2 Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above · 3 authors 2
Submitted by rahmanidashti 2 Judging the Judges: A Collection of LLM-Generated Relevance Judgements · 9 authors 2
Submitted by ludolara 2 Reducing Hallucinations in Language Model-based SPARQL Query Generation Using Post-Generation Memory Retrieval · 4 authors 2
Submitted by XiangZ 2 High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion · 5 authors 2
Submitted by yyyaoyuan - Noise May Contain Transferable Knowledge: Understanding Semi-supervised Heterogeneous Domain Adaptation from an Empirical Perspective · 5 authors 2