Update README.md
README.md CHANGED
@@ -18,7 +18,7 @@ In this project, we introduce GiT (Generalist Vision Transformer). GiT has the f
 - 😮 **Minimalist architecture design similar to LLM**: GiT consists solely of a single transformer, without any additional vision encoder or adapter.
 - 🚀 **Covering all types of visual understanding tasks**: GiT addresses a spectrum of visual tasks, including object-level tasks (e.g., object detection), pixel-level tasks (e.g., semantic segmentation), and vision-language tasks (e.g., image captioning).
 - 🤗 **Achieving task synergy by unified language interface**: Similar to LLMs, GiT observes a task synergy effect in multi-task training.
-- 🔥 **
+- 🔥 **Strong performance on zero-shot and few-shot benchmarks**: GiT scales well with model size and data, demonstrating remarkable generalizability across diverse scenarios after training on 27 datasets.