This model was obtained from Llama-3.1-8B by pruning with LLM-Streamline (*Streamlining Redundant Layers to Compress Large Language Models*, ICLR 2025 Spotlight). The entire training process required only 1.3B tokens.
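The model can be loaded like any Hugging Face causal LM. The sketch below is a minimal, assumed usage example with `transformers`; the repository id is a placeholder, not the actual model path.

```python
# Minimal usage sketch (assumption: standard transformers checkpoint).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/Llama-3.1-5.4B"  # placeholder repo id, replace with the real one

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's dtype
    device_map="auto",    # requires `accelerate`
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```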
Below are the evaluation results obtained with lm-eval (a reproduction sketch follows the table):
| Model | arc_c | arc_e | boolq | hellaswag | openbookqa | rte | winogrande | Avg |
|---|---|---|---|---|---|---|---|---|
| Llama-3.1-8B | 50.4 | 80.3 | 81.2 | 60.2 | 34.8 | 67.9 | 73.0 | 64.0 |
| Llama-3.1-5.4B | 42.1 | 72.2 | 78.0 | 54.3 | 27.2 | 62.8 | 71.0 | 58.2 |
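The numbers above can in principle be reproduced with the lm-evaluation-harness. The sketch below assumes the 0.4.x Python API and uses a placeholder model path; exact settings (few-shot, batch size, dtype) used for the reported scores are not stated in this card.

```python
# Sketch of an lm-eval run (assumes lm-evaluation-harness >= 0.4.x).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=path/to/Llama-3.1-5.4B,dtype=bfloat16",  # placeholder path
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag",
           "openbookqa", "rte", "winogrande"],
    batch_size=8,
)

# Print per-task metrics (accuracy, stderr, etc.).
for task, metrics in results["results"].items():
    print(task, metrics)
```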