LZXzju committed on
Commit bc48b53 · verified · 1 Parent(s): 312bb61

Update README.md

Files changed (1)
  1. README.md +16 -14
README.md CHANGED
@@ -14,20 +14,22 @@ Project page: https://github.com/lll6gg/UI-R1
 
 #### Benchmark 1: ScreenSpotV2
 
- | ScreenSpotV2 | Mobile-T | Mobile-I | Desktop-T | Desktop-I | Web-T | Web-I | Avg↑ / Len↓ |
- | ---------------- | -------- | -------- | --------- | --------- | -------- | -------- | ------------- |
- | OS-ATLAS-7B | 95.2 | 75.8 | 90.7 | 63.6 | 90.6 | 77.3 | 84.1 / |
- | UI-TARS-7B | 95.2 | 79.1 | 90.7 | 68.6 | 90.6 | 78.3 | 84.7 / |
- | UI-R1-3B | 96.2 | **84.3** | 92.3 | 63.6 | 89.2 | 75.4 | 85.4 / 67 |
- | GUI-R1-3B | 97.6 | 78.2 | 94.3 | 64.3 | 91.0 | 72.4 | 85.0 / 80 |
- | UI-R1-E-3B | **98.2** | 83.9 | **94.8** | **75.0** | **93.2** | **83.7** | **89.5** / **28** |
+ | ScreenSpotV2 | inference mode | Mobile-T | Mobile-I | Desktop-T | Desktop-I | Web-T | Web-I | Avg↑ / Len↓ |
+ | ------------- | -------------- | -------- | -------- | --------- | --------- | -------- | -------- | ----------------- |
+ | OS-ATLAS-7B | w/o thinking | 95.2 | 75.8 | 90.7 | 63.6 | 90.6 | 77.3 | 84.1 / |
+ | UI-TARS-7B | w/o thinking | 95.2 | 79.1 | 90.7 | 68.6 | 90.6 | 78.3 | 84.7 / |
+ | UI-R1-3B (v1) | w/ thinking | 96.2 | **84.3** | 92.3 | 63.6 | 89.2 | 75.4 | 85.4 / 67 |
+ | GUI-R1-3B | w/ thinking | 97.6 | 78.2 | 94.3 | 64.3 | 91.0 | 72.4 | 85.0 / 80 |
+ | UI-R1-3B (v2) | w/ thinking | 97.6 | 79.6 | 92.3 | 67.9 | 88.9 | 77.8 | 85.8 / 60 |
+ | **UI-R1-E-3B** | w/o thinking | **98.2** | 83.9 | **94.8** | **75.0** | **93.2** | **83.7** | **89.5** / **28** |
 
 #### Benchmark 2: ScreenSpot-Pro
 
- | ScreenSpot-Pro | Average Length↓ | Average Accuracy↑ |
- | ---------------- | -------------- | ---------------- |
- | UGround-7B | - | 16.5 |
- | OS-ATLAS-7B | - | 18.9 |
- | UI-R1-3B | 102 | 17.8 |
- | GUI-R1-3B | 114 | 26.6 |
- | UI-R1-E-3B | **28** | **33.5** |
+ | ScreenSpot-Pro | inference mode | Average Length↓ | Average Accuracy↑ |
+ | -------------- | -------------- | --------------- | ---------------- |
+ | UGround-7B | w/o thinking | - | 16.5 |
+ | OS-ATLAS-7B | w/o thinking | - | 18.9 |
+ | UI-R1-3B (v1) | w/ thinking | 102 | 17.8 |
+ | GUI-R1-3B | w/ thinking | 114 | 26.6 |
+ | UI-R1-3B (v2) | w/ thinking | 129 | 29.8 |
+ | **UI-R1-E-3B** | w/o thinking | **28** | **33.5** |