Lambent/danube3.1-4b-Reasoning-1Epoch

	---
	base_model: Lambent/danube3.1-4b-Reasoning-Light
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- llama
	- trl
	license: apache-2.0
	language:
	- en
	---

	Warning: Training went fine until about 2.4k steps, then quality declined and the model started glitching. It is either overcooked or still in an experimental phase.

	# Uploaded model

	- Developed by: Lambent
	- License: apache-2.0
	- Finetuned from model: Lambent/danube3.1-4b-Reasoning-Light

	This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
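
Since the card is tagged `transformers` and `text-generation-inference`, a minimal loading sketch may be useful. It assumes the repository holds full merged weights loadable through `AutoModelForCausalLM` (not only a LoRA adapter) and that a plain-text prompt is acceptable; the card does not specify a prompt or chat format. The helper name `generate` and its defaults are hypothetical, chosen only for illustration.

```python
# Repo id taken from this model card.
MODEL_ID = "Lambent/danube3.1-4b-Reasoning-1Epoch"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Hypothetical helper: load the model and complete a plain-text prompt.

    Assumes the repo contains full weights; if it only holds an adapter,
    loading would need a different path (for example via the peft library).
    """
    # Imported lazily so defining the helper does not require the
    # transformers/torch packages to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the full sequence (prompt plus completion) back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Explain step by step why 17 is prime.")` downloads the weights on first use. Given the warning above, outputs may be unstable, so a small `max_new_tokens` is a reasonable starting point when probing the model.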

	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)