Hide101111001111000/llm-jp-3-13b-it_lora-DPO-ja

text-generation-inference


Hide101111001111000 committed on Dec 17, 2024

Commit a34b788 · verified · 1 Parent(s): aee1d18

Update README.md

Files changed (1)

README.md +5 -0


@@ -20,3 +20,8 @@ language:
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+I performed DPO based on the already fine-tuned Hide101111001111000/llm-jp-3-13b-it_lora_3.
+--python--