A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 259
GUI-G^2: Gaussian Reward Modeling for GUI Grounding
Paper • 2507.15846 • Published • 133
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents
Paper • 2507.22827 • Published • 99
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Paper • 2508.18265 • Published • 211
Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 316
Why Language Models Hallucinate
Paper • 2509.04664 • Published • 195
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search
Paper • 2509.07969 • Published • 58
Visual Representation Alignment for Multimodal Large Language Models
Paper • 2509.07979 • Published • 83
Detect Anything via Next Point Prediction
Paper • 2510.12798 • Published • 46
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 501
Diffusion Language Models are Super Data Learners
Paper • 2511.03276 • Published • 128