HillFir committed on
Commit 08ddfb1 · verified · 1 Parent(s): b12c5b9

Update README.md

Files changed (1)
  1. README.md +17 -8
README.md CHANGED
@@ -7,11 +7,19 @@ language:
 metrics:
 - accuracy
 base_model:
-- Haozhan72/Openvla-oft-SFT-libero-goal-trajall
+- gen-robot/openvla-7b-rlvla-warmup
 pipeline_tag: reinforcement-learning
 model-index:
-- name: RLinf-openvlaoft-maniskill3-ppo
+- name: RLinf-openvla-maniskill3-ppo
   results:
+  - task:
+      type: VLA
+    dataset:
+      type: maniskill-train
+      name: maniskill-train
+    metrics:
+    - type: accuracy
+      value: 96.09
   - task:
       type: VLA
     dataset:
@@ -19,7 +27,7 @@ model-index:
       name: maniskill-vision
     metrics:
     - type: accuracy
-      value: 80.5
+      value: 82.03
   - task:
       type: VLA
     dataset:
@@ -27,7 +35,7 @@ model-index:
       name: maniskill-semantic
     metrics:
     - type: accuracy
-      value: 56.6
+      value: 78.35
   - task:
       type: VLA
     dataset:
@@ -35,8 +43,9 @@ model-index:
       name: maniskill-position
     metrics:
     - type: accuracy
-      value: 56.1
+      value: 85.42
 ---
+
 <div align="center">
 <img src="logo.svg" alt="RLinf-logo" width="500"/>
 </div>
@@ -61,7 +70,7 @@ model-index:
 </div>
 
 ## Model Description
-This openvla-oft model is trained on ``Haozhan72/Openvla-oft-SFT-libero10-trajall`` with an additional lora SFT checkpoint and finetuned by Proximal Policy Optimization (PPO) on the ManiSkill simulator.
+This model is trained on ``gen-robot/openvla-7b-rlvla-warmup`` by Proximal Policy Optimization (PPO) on the ManiSkill simulator.
 
 ## Full OOD Evaluation and Results
 ### Overall Eval Results
@@ -107,11 +116,11 @@ Note: rl4vla refers to the paper VLA-RL-Study: What Can RL Bring to VLA Generali
 | mid-episode object reposition | 0.8828 | 0.4570 | 0.7891 | **0.9212** | 0.8828 |
 
 ## How to Use
-Please integrate the provided model with the [RLinf](https://github.com/RLinf/RLinf) codebase. To do so, modify the following parameters in the configuration file ``examples/embodiment/config/maniskill_ppo_openvlaoft.yaml``:
+Please integrate the provided model with the [RLinf](https://github.com/RLinf/RLinf) codebase. To do so, modify the following parameters in the configuration file ``examples/embodiment/config/maniskill_ppo_openvla.yaml``:
 
 - Set ``actor.checkpoint_load_path``, ``actor.tokenizer.tokenizer_model``, and ``rollout.model_dir`` to the path of the model checkpoint.
 
 Note: If you intend to evaluate the model directly, make sure to set ``actor.model.is_lora`` to ``false``.
 
 ## License
-This code repository and the model weights are licensed under the MIT License.
+This code repository and the model weights are licensed under the MIT License.
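
For reference, a minimal sketch of the overrides described in the updated "How to Use" section, as they might appear in ``examples/embodiment/config/maniskill_ppo_openvla.yaml``. The nesting is inferred from the dotted key names and the checkpoint path is a placeholder, so adjust both to the actual config file and download location:

```yaml
# Sketch only: keys taken from the How to Use notes above; nesting assumed
# from the dotted names (actor.checkpoint_load_path, actor.tokenizer.tokenizer_model,
# actor.model.is_lora, rollout.model_dir). Path is a placeholder.
actor:
  checkpoint_load_path: /path/to/RLinf-openvla-maniskill3-ppo
  tokenizer:
    tokenizer_model: /path/to/RLinf-openvla-maniskill3-ppo
  model:
    is_lora: false   # per the note: required when evaluating the checkpoint directly
rollout:
  model_dir: /path/to/RLinf-openvla-maniskill3-ppo
```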