Update README.md
This instruction model was built via parameter-efficient QLoRA finetuning of [llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b-hf) on the first 25k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin) (an open-source implementation of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/)). Finetuning was executed on a single H100 (80 GB PCIe) for roughly 17 hours on the [Lambda Labs](https://cloud.lambdalabs.com/instances) platform.
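
For orientation, a minimal sketch of the QLoRA setup is shown below. It is not the exact recipe used for this model — the real hyperparameters, prompt formatting, and training loop are in the finetuning script linked under Helpful Links — and the LoRA rank, target modules, and other values here are illustrative assumptions.

```python
# Minimal QLoRA setup sketch (illustrative values; see the linked finetuning
# script for the actual hyperparameters and training loop).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Llama-2-70b-hf"

# Quantize the frozen base model to 4-bit NF4 (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach small low-rank adapters; only these weights are trained.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# First 25k rows of the dolphin dataset, as described above.
train_data = load_dataset("ehartford/dolphin", split="train[:25000]")
```

Because only the adapter weights are updated while the 70B base stays quantized to 4-bit, the run fits on a single 80 GB H100, as noted above.
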
### Benchmark metrics

| Metric              | Value |
|---------------------|-------|
| MMLU (5-shot)       | 69.18 |
| ARC (25-shot)       | 69.62 |
| HellaSwag (10-shot) | 86.82 |
| TruthfulQA (0-shot) | 57.43 |
| Avg.                | 70.76 |

We ran the benchmark tests above with the state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness), using the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
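
As a rough illustration (not the exact command used for the numbers above), evaluating one row of the table with the harness looks something like the sketch below. Task identifiers and model arguments vary across harness versions, and `<this-model-repo>` is a placeholder for this model's Hugging Face repo id.

```python
# Hedged sketch of running one benchmark with the LM Evaluation Harness.
# Argument names and task ids differ across harness versions; adjust to match
# the version pinned by the Open LLM Leaderboard.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf",
    # Placeholder repo id -- substitute this model's Hugging Face repo.
    model_args="pretrained=<this-model-repo>,dtype=bfloat16",
    tasks=["hellaswag"],
    num_fewshot=10,   # matches the 10-shot HellaSwag row above
    batch_size=8,
)
print(results["results"]["hellaswag"])
```

The remaining rows correspond to the ARC, MMLU, and TruthfulQA tasks with the few-shot counts listed in the table; the exact task identifiers depend on the harness version.
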
### Helpful Links
* Model license: Llama 2 Community License Agreement
* Basic usage: [notebook](assets/basic_inference_llama_2_70b_dolphin.ipynb) (a short inference sketch also follows this list)
* Finetuning code: [script](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/one_gpu/llama-2/dolphin/sft-llama-2-70b-dolphin-peft.py)
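
For a quick sense of what the basic-usage notebook covers, a hedged inference sketch is below. The repo id is a placeholder and the prompt string is illustrative; refer to the notebook for the exact prompt template, and note that if the repo hosts only the LoRA adapter (rather than merged weights), loading via peft's `AutoPeftModelForCausalLM` would be needed instead.

```python
# Hedged inference sketch; see the basic-usage notebook for the exact prompt template.
# "<this-model-repo>" is a placeholder for this model's Hugging Face repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<this-model-repo>"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Illustrative prompt only; the notebook shows the expected instruction format.
prompt = "You are a helpful assistant.\n\nUser: What is a llama?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
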