LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference • Paper • 2104.01136 • Published Apr 2, 2021
ResMLP: Feedforward networks for image classification with data-efficient training • Paper • 2105.03404 • Published May 7, 2021
Augmenting Convolutional networks with attention-based aggregation • Paper • 2112.13692 • Published Dec 27, 2021
DINOv2: Learning Robust Visual Features without Supervision • Paper • 2304.07193 • Published Apr 14, 2023
Improving Statistical Fidelity for Neural Image Compression with Implicit Local Likelihood Models • Paper • 2301.11189 • Published Jan 26, 2023
Three things everyone should know about Vision Transformers • Paper • 2203.09795 • Published Mar 18, 2022
Scalable Pre-training of Large Autoregressive Image Models • Paper • 2401.08541 • Published Jan 16, 2024