This model is derived from Llama-3.1-8B by pruning with LLM-Streamline (Streamlining Redundant Layers to Compress Large Language Models, ICLR 2025 Spotlight). The entire training process required only 1.3B tokens.
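
The checkpoint is published as BF16 safetensors, so it should load through the standard `transformers` API. The snippet below is a minimal sketch, assuming the pruned architecture is compatible with the stock Llama classes (pass `trust_remote_code=True` if custom modules turn out to be required):

```python
# Minimal loading sketch (assumption: the pruned model loads with the stock
# AutoModel/Llama classes; add trust_remote_code=True if custom code is needed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XiaodongChen/Llama-3.1-5.4B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are stored in BF16
    device_map="auto",
)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```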

Below are the results of the evaluation using lm-eval:

| Model | arc_c | arc_e | boolq | hellaswag | openbookqa | rte | winogrande | Avg |
|---|---|---|---|---|---|---|---|---|
| Llama-3.1-8B | 50.4 | 80.3 | 81.2 | 60.2 | 34.8 | 67.9 | 73.0 | 64.0 |
| Llama-3.1-5.4B | 42.1 | 72.2 | 78.0 | 54.3 | 27.2 | 62.8 | 71.0 | 58.2 |
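
A comparable evaluation can be run with the lm-evaluation-harness Python API. This is a sketch only; the exact harness version, few-shot settings, and batch size behind the numbers above are not stated, so the settings below are assumptions:

```python
# Evaluation sketch with lm-evaluation-harness (pip install lm-eval).
# Assumptions: zero-shot, default harness settings; task names follow the
# harness convention (arc_challenge / arc_easy correspond to arc_c / arc_e).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=XiaodongChen/Llama-3.1-5.4B,dtype=bfloat16",
    tasks=[
        "arc_challenge", "arc_easy", "boolq", "hellaswag",
        "openbookqa", "rte", "winogrande",
    ],
    batch_size=8,
)

# Print per-task metrics (e.g. acc, acc_norm) as reported by the harness.
for task, metrics in results["results"].items():
    print(task, metrics)
```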