Update README.md
README.md
CHANGED
@@ -1,37 +1,44 @@
 ---
 base_model: llava-hf/llava-onevision-qwen2-0.5b-ov-hf
 library_name: peft
+license: mit
+language:
+- en
+tags:
+- chemistry
 ---
 
 # Model Card for Model ID
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-
+The model is fine-tuned on images of robot manipulation in chemistry labs.
 
 ## Model Details
 
 ### Model Description
 
-
+The model is based on LLaVA-OneVision 0.5B, fine-tuned for visual inspection and reasoning in laboratory automation tasks. It takes image inputs and generates Boolean inspection results (True/False) with detailed reasoning, enabling error detection and recovery in robotic workflows.
+Fine-tuning was performed with LoRA on both the vision encoder and the projector, optimizing efficiency while maintaining accuracy. The model runs on edge devices (tested on an NVIDIA AGX Orin), making it suitable for real-time decision-making in resource-constrained environments.
+Trained on a curated dataset of laboratory environments, the VLM can detect object misalignment and positioning errors. When an error is detected, it provides natural-language reasoning about the issue, supporting automated corrective actions in robotic workflows.
+This model is particularly useful for scientific automation, self-driving labs (SDLs), and robotic inspection systems, enhancing workflow robustness and efficiency in real-world experimental setups.
 
 
 
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
+- **Developed by:** Zhengxue Zhou
+- **Shared by:** Zhengxue Zhou
 - **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
+- **Language(s) (NLP):** English
+- **License:** MIT
+- **Finetuned from model [optional]:** llava-hf/llava-onevision-qwen2-0.5b-ov-hf
 
 ### Model Sources [optional]
 
 <!-- Provide the basic links for the model. -->
 
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
+- **Repository:** https://github.com/cooper-group-uol-robotics/LIRA.git
+- **Paper:** LIRA: Localization, Inspection, and Reasoning Module for Autonomous Workflows in Self-Driving Labs (under review)
+- **Demo:** Follow the instructions in the repository
 
 ## Uses
 
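The updated description says the model takes an image and returns a True/False inspection result with reasoning. A minimal inference sketch along those lines, using the standard `transformers` + `peft` loading pattern for LLaVA-OneVision; the adapter id `lira-inspection-adapter`, the image path, and the prompt are placeholders, not names from the card, so substitute the checkpoint and examples from the LIRA repository:

```python
# Minimal inference sketch for the fine-tuned inspection model.
# "lira-inspection-adapter" is a placeholder for the released LoRA weights.
import torch
from PIL import Image
from peft import PeftModel
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

base_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
processor = AutoProcessor.from_pretrained(base_id)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
# Attach the LoRA adapter on top of the frozen base model.
model = PeftModel.from_pretrained(model, "lira-inspection-adapter")  # placeholder

image = Image.open("workcell_snapshot.jpg")  # any lab-scene image
conversation = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Is the vial seated correctly in the rack? "
                                 "Answer True or False and explain."},
    ]},
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(images=image, text=prompt, return_tensors="pt").to(
    model.device, torch.float16
)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```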
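The description also notes that LoRA was applied to the vision encoder and projector. A rough sketch of the kind of `peft` configuration that implies; the rank, alpha, dropout, and target module names are assumptions for illustration, not the exact recipe behind this checkpoint:

```python
# Illustrative LoRA setup in the spirit of the card's description.
from peft import LoraConfig, get_peft_model
from transformers import LlavaOnevisionForConditionalGeneration

model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
)
config = LoraConfig(
    r=16,            # assumed rank, not stated in the card
    lora_alpha=32,
    lora_dropout=0.05,
    # peft matches modules by name suffix; which submodules end up adapted
    # (vision tower, projector, language model) is controlled by this list.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```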