This model was derived from Llama-2-7b-hf by pruning redundant layers with LLM-Streamline (*Streamlining Redundant Layers to Compress Large Language Models*, ICLR 2025 Spotlight). The entire training process required only 0.06B tokens.
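The pruned checkpoint loads like any other Llama-2 checkpoint. A minimal usage sketch with `transformers` (the BF16 dtype matches the published weights; the prompt is just an illustration):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id from this model card; the pruned model keeps the standard Llama architecture.
model_id = "XiaodongChen/Llama-2-4.7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```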

Below are the results of the evaluation using lm-eval:

| Model | arc_c | arc_e | boolq | hellaswag | openbookqa | rte | winogrande | Avg |
|---|---|---|---|---|---|---|---|---|
| Llama-2-7B | 43.3 | 76.4 | 77.7 | 57.2 | 31.4 | 62.8 | 69.1 | 59.7 |
| Llama-2-4.7B | 34.0 | 64.6 | 74.7 | 49.8 | 27.4 | 61.7 | 66.4 | 54.1 |
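A sketch of how these numbers could be reproduced with the lm-evaluation-harness Python API; the task names follow lm-eval conventions, and the batch size and harness version are assumptions, since the card does not state the exact evaluation settings:

```python
import lm_eval

# Evaluate the pruned model on the benchmarks reported in the table above.
# Exact settings used by the authors are not stated in the card.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=XiaodongChen/Llama-2-4.7B,dtype=bfloat16",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag",
           "openbookqa", "rte", "winogrande"],
    batch_size=8,  # assumed; adjust to available GPU memory
)

for task, metrics in results["results"].items():
    print(task, metrics)
```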