Update README.md
README.md
CHANGED
@@ -29,8 +29,7 @@ Simple model that was RL FT for 20 steps / epochs after SFT to reverse text usin
 </reversed_text>
 ```

-**
-ti otni degrem saw kcurB ni ytinummoc ehT
-
-**Reward:**
 0.963855421686747
 </reversed_text>
 ```

+**Expected Reward:**
 0.963855421686747
+
+Note: Reward is based on the longest common subsequence
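For readers unfamiliar with the metric named in the note, here is a minimal sketch of a reward based on the longest common subsequence (LCS). The README only says the reward is based on the LCS; the function names `lcs_length` and `lcs_reward`, and the choice to normalize by the length of the longer string, are assumptions made for illustration and may not match the scoring actually used in training.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence that ends before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of skipping one character.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_reward(predicted: str, target: str) -> float:
    """Hypothetical normalization: LCS length divided by the longer length."""
    longest = max(len(predicted), len(target))
    if longest == 0:
        return 1.0  # two empty strings are treated as a perfect match
    return lcs_length(predicted, target) / longest


# The reversed sentence from the README example above.
target = "The community in Bruck was merged into it"[::-1]
print(lcs_reward(target, target))       # an exact reversal scores 1.0
print(lcs_reward(target[:-3], target))  # a truncated output scores lower
```

Under this normalization, an output that drops or alters a few characters still earns partial credit, which is one plausible reason to prefer an LCS-based score over exact match for a text-reversal task.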