---
license: cc-by-nc-4.0
---
# Hiera (hiera_base_224)

Hiera is a hierarchical vision transformer that is a much more efficient alternative to previous hierarchical architectures such as ConvNeXt and Swin.
Vanilla transformer architectures (Dosovitskiy et al., 2020) are popular because they are simple and scalable, and they enable pretraining strategies such as MAE (He et al., 2022).
However, because they use the same spatial resolution and number of channels throughout the network, ViTs make inefficient use of their parameters. This
is in contrast to prior “hierarchical” or “multi-scale” models (e.g., Krizhevsky et al. (2012); He et al. (2016)), which use fewer channels but higher spatial resolution in early stages
with simpler features, and more channels but lower spatial resolution later in the model with more complex features.
These hierarchical models, however, add specialized modules and overhead operations to reach state-of-the-art accuracy on ImageNet-1k, which makes them slower.
Hiera addresses this by stripping out those extra components and instead teaching the model spatial biases through MAE pretraining.
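To make the "fewer channels at high resolution early, more channels at low resolution later" idea concrete, here is a minimal sketch of the stage-wise schedule such a hierarchical backbone follows. The specific numbers (224-pixel input, 4× patchify stride, a 96-channel first stage, four stages) are illustrative assumptions in the spirit of a "base" configuration, not the official implementation:

```python
# Illustrative sketch (not the official Hiera code): how a hierarchical
# backbone trades spatial resolution for channel width across stages.
# All configuration values below are assumptions for illustration.

def stage_schedule(img_size=224, patch_stride=4, base_dim=96, num_stages=4):
    """Return a (resolution, channels) pair per stage: resolution halves
    and channel width doubles at each stage transition."""
    res = img_size // patch_stride  # e.g. 224 / 4 = 56 tokens per side
    dim = base_dim
    schedule = []
    for _ in range(num_stages):
        schedule.append((res, dim))
        res //= 2  # coarser spatial grid in later stages
        dim *= 2   # wider features in later stages
    return schedule

print(stage_schedule())  # [(56, 96), (28, 192), (14, 384), (7, 768)]
```

Early stages spend their parameters on many spatial locations with few channels, while late stages do the opposite, which is what makes the hierarchical layout more parameter-efficient than a constant-resolution ViT.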