Update README.md
README.md CHANGED

@@ -2,7 +2,7 @@
 base_model: meta-llama/Meta-Llama-3-8B-Instruct
 library_name: transformers
 datasets:
--
+- wzhouad/llama3-ultrafeedback-hybrid-v2
 tags:
 - alignment-handbook
 - llama
@@ -22,6 +22,8 @@ In comparison to the preference data construction method in our paper, it employ
 2. When multiple outputs have the same highest score, the one with the shortest length is selected.
 3. When multiple outputs have the same minimum score, the one with the smallest length difference from the chosen output is selected.
 
+The model is trained based on [wzhouad/llama3-ultrafeedback-hybrid-v2](https://huggingface.co/datasets/wzhouad/llama3-ultrafeedback-hybrid-v2).
+
 ### [AlpacaEval Eval Results](https://tatsu-lab.github.io/alpaca_eval/)
 | Model | LC | WR | Avg. Length |
 |-------------------------------------------|:------------:|:--------:|:-----------:|
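For readers who want to see the tie-breaking rules from the README context above in code, here is a minimal Python sketch: the highest-scored output is kept as the chosen response (shortest one on score ties, rule 2) and the lowest-scored output as the rejected response (the one closest in length to the chosen output on score ties, rule 3). The function name, the `(text, score)` tuple layout, and treating the top-scored output as "chosen" at all are assumptions for illustration; they are not part of this commit.

```python
def build_preference_pair(outputs):
    """Pick a (chosen, rejected) pair from a list of (text, score) tuples.

    Assumption: the highest-scored output is chosen and the lowest-scored is
    rejected; only the tie-breaking (rules 2 and 3) is taken from the README.
    """
    # Chosen: highest score; on ties, the shortest output (rule 2).
    best_score = max(score for _, score in outputs)
    chosen = min(
        (text for text, score in outputs if score == best_score),
        key=len,
    )

    # Rejected: lowest score; on ties, the output whose length is closest
    # to the chosen output's length (rule 3).
    worst_score = min(score for _, score in outputs)
    rejected = min(
        (text for text, score in outputs if score == worst_score),
        key=lambda text: abs(len(text) - len(chosen)),
    )
    return chosen, rejected


# Hypothetical usage with toy scored completions:
outputs = [("short answer", 0.9), ("a much longer answer", 0.9), ("bad answer", 0.1)]
chosen, rejected = build_preference_pair(outputs)
```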