- No More Adam: Learning Rate Scaling at Initialization is All You Need
  Paper • 2412.11768 • Published • 44
- SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
  Paper • 2501.06842 • Published • 16
- The GAN is dead; long live the GAN! A Modern GAN Baseline
  Paper • 2501.05441 • Published • 96
nDimensional
AI & ML interests
Computer Vision, Diffusers, Transformers, ML, NLP, Diffusion Models, Unsupervised Learning, JAX, Neural Networks
Recent Activity
- liked a Space 6 days ago: bytedance-research/USO
- liked a Space 6 days ago: Wan-AI/Wan2.2-S2V
- liked a model 6 days ago: google/embeddinggemma-300m
Organizations
None yet