yamayou's Collections
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
Paper
•
2402.14083
•
Published
•
48
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
•
2402.17764
•
Published
•
612
Genie: Generative Interactive Environments
Paper
•
2402.15391
•
Published
•
71
Humanoid Locomotion as Next Token Prediction
Paper
•
2402.19469
•
Published
•
28
ViTAR: Vision Transformer with Any Resolution
Paper
•
2403.18361
•
Published
•
55
Simulating Classroom Education with LLM-Empowered Agents
Paper
•
2406.19226
•
Published
•
31
MIRAI: Evaluating LLM Agents for Event Forecasting
Paper
•
2407.01231
•
Published
•
18
Prithvi WxC: Foundation Model for Weather and Climate
Paper
•
2409.13598
•
Published
•
42
Selective Attention Improves Transformer
Paper
•
2410.02703
•
Published
•
24
ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Paper
•
2411.17465
•
Published
•
86
Chimera: Improving Generalist Model with Domain-Specific Experts
Paper
•
2412.05983
•
Published
•
9
Multimodal Latent Language Modeling with Next-Token Diffusion
Paper
•
2412.08635
•
Published
•
45
Large Action Models: From Inception to Implementation
Paper
•
2412.10047
•
Published
•
35
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper
•
2412.09871
•
Published
•
96
AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities
Paper
•
2412.14123
•
Published
•
11
Cosmos World Foundation Model Platform for Physical AI
Paper
•
2501.03575
•
Published
•
78
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
Paper
•
2501.04682
•
Published
•
96
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning
Paper
•
2411.04983
•
Published
•
12
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper
•
2502.05171
•
Published
•
135
VideoRoPE: What Makes for Good Video Rotary Position Embedding?
Paper
•
2502.05173
•
Published
•
65
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Paper
•
2502.11089
•
Published
•
151
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
Paper
•
2502.15007
•
Published
•
171
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts
Paper
•
2502.20395
•
Published
•
46
RWKV-7 "Goose" with Expressive Dynamic State Evolution
Paper
•
2503.14456
•
Published
•
136
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning
Paper
•
2503.15558
•
Published
•
45
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
Paper
•
2504.01990
•
Published
•
205
Paper
•
2504.00927
•
Published
•
38