Update README.md
README.md CHANGED
```diff
@@ -8,12 +8,12 @@ base_model:
 - liuhaotian/llava-v1.5-7b
 ---
 
-
+
 # Ada-LLaVA Model Card
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-Ada-LLaVA
+Ada-LLaVA-L-7B is an open-source adaptive inference framework for multimodal Large Language Models (MLLMs) that dynamically adjusts its operations based on available computational resources and latency requirements.
 
 See the paper for more details: [Learning to Inference Adaptively for Multimodal Large Language Models](https://huggingface.co/papers/2503.10905)
 
@@ -55,4 +55,3 @@ AdaLLaVA is based on LLaVA-1.5 and thus follows its license. Llama 2 is licensed
 ## Limitations
 
 While Ada-LLaVA is currently limited to processing one image at a time and only applies adaptive operations in its later half of layers, future work could explore multi-image input support and extend the adaptive mechanisms throughout the entire model architecture, including the vision encoder. These improvements would make the model more versatile and applicable to a broader range of real-world scenarios.
-
```
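The updated summary and the Limitations section describe adaptive inference at a high level: the model keeps its early layers fixed and decides, per request, how much work to do in the later half of its decoder given a latency budget. The snippet below is a minimal toy sketch of that idea only, not the Ada-LLaVA implementation or API; every name in it (`ToyAdaptiveDecoder`, `latency_budget`) is hypothetical, and it assumes nothing beyond PyTorch.

```python
# Illustrative sketch only -- NOT the Ada-LLaVA code or API.
# It mimics the behavior described in the model card: the first half of the
# decoder always runs, while layers in the later half are executed or skipped
# according to a per-request latency budget. All names here are hypothetical.
import torch
import torch.nn as nn


class ToyAdaptiveDecoder(nn.Module):
    def __init__(self, num_layers: int = 8, dim: int = 64):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_layers)]
        )

    def forward(self, x: torch.Tensor, latency_budget: float) -> torch.Tensor:
        half = len(self.layers) // 2
        # First half of the layers: always executed (non-adaptive).
        for layer in self.layers[:half]:
            x = torch.relu(layer(x))
        # Later half: run only as many layers as the budget allows.
        keep = max(0, min(half, int(round(latency_budget * half))))
        for layer in self.layers[half:half + keep]:
            x = torch.relu(layer(x))
        return x


if __name__ == "__main__":
    model = ToyAdaptiveDecoder()
    tokens = torch.randn(1, 16, 64)
    # A tight budget runs fewer of the later layers; a loose one runs them all.
    fast = model(tokens, latency_budget=0.25)
    full = model(tokens, latency_budget=1.0)
    print(fast.shape, full.shape)
```

In the actual model, which operations to keep is presumably learned and input-dependent (see the linked paper) rather than the fixed layer cutoff used in this sketch.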