Idea - a yamayou Collection

yamayou 's Collections

Idea

LLM

Idea

updated 3 days ago

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

Paper • 2402.14083 • Published Feb 21, 2024 • 48
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 612
Genie: Generative Interactive Environments

Paper • 2402.15391 • Published Feb 23, 2024 • 71
Humanoid Locomotion as Next Token Prediction

Paper • 2402.19469 • Published Feb 29, 2024 • 28
ViTAR: Vision Transformer with Any Resolution

Paper • 2403.18361 • Published Mar 27, 2024 • 55
Simulating Classroom Education with LLM-Empowered Agents

Paper • 2406.19226 • Published Jun 27, 2024 • 31
MIRAI: Evaluating LLM Agents for Event Forecasting

Paper • 2407.01231 • Published Jul 1, 2024 • 18
Prithvi WxC: Foundation Model for Weather and Climate

Paper • 2409.13598 • Published Sep 20, 2024 • 42
Selective Attention Improves Transformer

Paper • 2410.02703 • Published Oct 3, 2024 • 24
ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 86
Chimera: Improving Generalist Model with Domain-Specific Experts

Paper • 2412.05983 • Published Dec 8, 2024 • 9
Multimodal Latent Language Modeling with Next-Token Diffusion

Paper • 2412.08635 • Published Dec 11, 2024 • 45
Large Action Models: From Inception to Implementation

Paper • 2412.10047 • Published Dec 13, 2024 • 35
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 96
AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities

Paper • 2412.14123 • Published Dec 18, 2024 • 11
Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published Jan 7 • 78
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though

Paper • 2501.04682 • Published Jan 8 • 96
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

Paper • 2411.04983 • Published Nov 7, 2024 • 12
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 135
VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Paper • 2502.05173 • Published Feb 7 • 65
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published Feb 16 • 151
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers

Paper • 2502.15007 • Published Feb 20 • 171
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts

Paper • 2502.20395 • Published Feb 27 • 46
RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published 21 days ago • 136
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Paper • 2503.15558 • Published 20 days ago • 45
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published 8 days ago • 205
Multi-Token Attention

Paper • 2504.00927 • Published 7 days ago • 38