README.md · solidrust/OrpoLlama-3-8B-AWQ at 374889687312b7cfbe352783913606aeca87bd1c

OrpoLlama-3-8B-AWQ / README.md

Added base_model tag in README.md

3748896 verified about 1 year ago

1.21 kB

	---
	base_model: mlabonne/OrpoLlama-3-8B
	language:
	- en
	license: other
	library_name: transformers
	datasets:
	- mlabonne/orpo-dpo-mix-40k
	tags:
	- 4-bit
	- AWQ
	- text-generation
	- autotrain_compatible
	- endpoints_compatible
	- orpo
	- llama 3
	- rlhf
	- sft
	pipeline_tag: text-generation
	inference: false
	quantized_by: Suparious
	---
	# mlabonne/OrpoLlama-3-8B AWQ

	- Model creator: [mlabonne](https://huggingface.co/mlabonne)
	- Original model: [OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B)

	![](https://i.imgur.com/ZHwzQvI.png)

	## Model Summary

	This is an ORPO fine-tune of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on 1k samples of [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k) created for [this article](https://huggingface.co/blog/mlabonne/orpo-llama-3).

	It's a successful fine-tune that follows the ChatML template!

	Try the demo: https://huggingface.co/spaces/mlabonne/OrpoLlama-3-8B

	## 🔎 Application

	This model uses a context window of 8k. It was trained with the ChatML template.

	## 🏆 Evaluation

	### Nous

	OrpoLlama-4-8B outperforms Llama-3-8B-Instruct on the GPT4All and TruthfulQA datasets.