kdhole
/

Llama-3.1-8B-tuned-lens


	---
	license: openrail
	datasets:
	- bookcorpus/bookcorpus
	language:
	- en
	base_model:
	- meta-llama/Llama-3.1-8B
	---

	This is a tuned lens for Llama 3.1 8B: the base model stays frozen, and for each layer an affine translator has been trained so that decoding that layer's hidden state minimizes the KL divergence with the model's final-layer output distribution.
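
As a rough illustration of that objective, here is a minimal NumPy sketch. It is not the `tuned-lens` implementation: the toy dimensions, the residual form of the translator, and the omission of the final layer norm are all simplifying assumptions, and the names `lens_kl`, `A`, `b`, and `W_U` are hypothetical.

```python
import numpy as np


def log_softmax(x):
    # Numerically stable log-softmax over the vocabulary axis.
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))


def lens_kl(hidden, final_logits, A, b, W_U):
    """KL(final-layer distribution || lens distribution), averaged over tokens.

    hidden:       (n_tokens, d_model) hidden states from an intermediate layer
    final_logits: (n_tokens, vocab) logits produced by the full model
    A, b:         affine translator for this layer (assumed residual form)
    W_U:          (d_model, vocab) unembedding matrix (layer norm omitted here)
    """
    translated = hidden + hidden @ A + b
    log_q = log_softmax(translated @ W_U)
    log_p = log_softmax(final_logits)
    return float((np.exp(log_p) * (log_p - log_q)).sum(axis=-1).mean())


rng = np.random.default_rng(0)
d_model, vocab, n_tokens = 8, 32, 5  # toy sizes, not Llama's
W_U = rng.normal(size=(d_model, vocab))
final_hidden = rng.normal(size=(n_tokens, d_model))
final_logits = final_hidden @ W_U

# A zero translator leaves the hidden state unchanged.
A0 = np.zeros((d_model, d_model))
b0 = np.zeros(d_model)

# Decoding the final hidden state reproduces the model's own distribution.
kl_same = lens_kl(final_hidden, final_logits, A0, b0, W_U)

# A perturbed "earlier layer" state decodes to a different distribution.
early_hidden = final_hidden + rng.normal(size=final_hidden.shape)
kl_early = lens_kl(early_hidden, final_logits, A0, b0, W_U)
```

In this sketch, training would adjust `A` and `b` for each layer to push `kl_early` toward zero while the base model's weights stay fixed.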

	```bash
	pip install tuned-lens
	```

	To train the lens:

	```bash
	python -m tuned_lens train \
	--model.name meta-llama/Llama-3.1-8B \
	--data.name bookcorpus/bookcorpus \
	--per_gpu_batch_size=1 \
	--output my_lenses/meta-llama/Llama-3.1-8B
	```

	To evaluate the trained lens:

	```bash
	python -m tuned_lens eval \
	--data.name bookcorpus/bookcorpus \
	--model.name meta-llama/Llama-3.1-8B \
	--tokens 16400000 \
	--lens_name my_lenses/meta-llama/Llama-3.1-8B \
	--output evaluation/meta-llama/Llama-3.1-8B
	```
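
The `--lens_name` path passed to `eval` is the same directory that `train` writes to via `--output`. A small sketch of keeping the two in sync from Python follows. The helper `tuned_lens_commands` and its parameters are hypothetical; it only assembles the argument lists shown above and does not run them.

```python
import shlex


def tuned_lens_commands(model_id, dataset, tokens=16_400_000,
                        lens_root="my_lenses", eval_root="evaluation"):
    """Build the train and eval argument lists for one model (hypothetical helper)."""
    lens_dir = f"{lens_root}/{model_id}"  # shared by train --output and eval --lens_name
    train = [
        "python", "-m", "tuned_lens", "train",
        "--model.name", model_id,
        "--data.name", dataset,
        "--per_gpu_batch_size=1",
        "--output", lens_dir,
    ]
    evaluate = [
        "python", "-m", "tuned_lens", "eval",
        "--data.name", dataset,
        "--model.name", model_id,
        "--tokens", str(tokens),
        "--lens_name", lens_dir,
        "--output", f"{eval_root}/{model_id}",
    ]
    return train, evaluate


train_cmd, eval_cmd = tuned_lens_commands(
    "meta-llama/Llama-3.1-8B", "bookcorpus/bookcorpus"
)

# Print shell-ready command lines; pass the lists to subprocess.run to execute them.
print(shlex.join(train_cmd))
print(shlex.join(eval_cmd))
```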