Update README.md
README.md
CHANGED
@@ -15,12 +15,21 @@ tags:
 
 ## Model Details
 
-**Work in progress!!**
-
 This PEFT adapter has been trained by using [Flower](https://flower.ai/), a friendly federated AI framework.
 
-The adapter and benchmark results
+The adapter and benchmark results have been submitted to the [FlowerTune LLM Code Leaderboard](https://flower.ai/benchmarks/llm-leaderboard/code/).
 
 Please check the following GitHub project for details on how to reproduce training and evaluation steps:
 
-[FlowerTune-LLM-Labs](https://github.com/ethicalabs-ai/FlowerTune-LLM-Labs/blob/main/workspace/models/README.md)
+[FlowerTune-LLM-Labs](https://github.com/ethicalabs-ai/FlowerTune-LLM-Labs/blob/main/workspace/models/README.md)
+
+
+## Evaluation Results (Pass@1 score)
+
+- **HumanEval**: 64.63 %
+- **MBPP**: 54.8 %
+- **MultiPL-E (C++)**: 60.87 %
+- **MultiPL-E (JS)**: 61.49 %
+- **Average**: 60.45 %
+
+The evaluation was conducted on an NVIDIA A40 (48 GB).
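
The reported average can be sanity-checked against the four per-benchmark Pass@1 scores. The following is a minimal sketch, assuming the average is an unweighted mean of the four benchmarks (the README does not state how it was computed); the `scores` dictionary and the `unweighted_mean` helper are illustrative names, not part of the project.

```python
# Pass@1 scores (in percent) as listed in the README's evaluation section.
scores = {
    "HumanEval": 64.63,
    "MBPP": 54.8,
    "MultiPL-E (C++)": 60.87,
    "MultiPL-E (JS)": 61.49,
}


def unweighted_mean(values):
    """Return the arithmetic mean, giving each benchmark equal weight.

    Assumption: equal weighting; the README does not specify the method.
    """
    values = list(values)
    return sum(values) / len(values)


average = unweighted_mean(scores.values())
print(f"Average: {average:.2f} %")
```

Under this assumption the result rounds to the 60.45 % stated in the README, which suggests the benchmarks were weighted equally rather than by number of problems.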