ikedachin
/

llm-jp-3-13b-october-news-250311-merged

Text Generation

text-generation-inference

ikedachin committed on Mar 11

Commit f3f9ce1 · verified · 1 Parent(s): 6a32981

Update README.md

Files changed (1)

README.md +11 -0

@@ -21,3 +21,14 @@ language:
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

+Continued pre-training (CPT) on news data from September and October 2024.
+epoch:3
+r:128
+lora_alpha:512
+lr:3e-4
+embedding_lr: 3e-5
+Aim: use large r and lora_alpha so that the knowledge embedded into the base model by CPT has a stronger effect and survives the forgetting caused by SFT.
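
To make the effect of the large `r` and `lora_alpha` values concrete, here is a minimal sketch, not the author's training script. It assumes standard LoRA scaling, where the low-rank update is multiplied by `lora_alpha / r`, and it does not assume any particular library. The class `CptHyperparams` and its property names are hypothetical; only the numeric values come from the README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CptHyperparams:
    """Hypothetical container; the default values are copied from the README."""

    epochs: int = 3
    r: int = 128              # LoRA rank
    lora_alpha: int = 512     # LoRA scaling numerator
    lr: float = 3e-4          # learning rate for the LoRA weights
    embedding_lr: float = 3e-5  # separate, smaller learning rate for embeddings

    @property
    def lora_scale(self) -> float:
        # Assumption: standard LoRA, where the update B @ A is scaled by lora_alpha / r.
        return self.lora_alpha / self.r

    @property
    def embedding_lr_ratio(self) -> float:
        # How much smaller the embedding learning rate is than the main one.
        return self.embedding_lr / self.lr


hp = CptHyperparams()
print(f"LoRA scale (lora_alpha / r): {hp.lora_scale}")
print(f"embedding_lr / lr: {hp.embedding_lr_ratio:.2f}")
```

Under that assumption the adapter update is scaled by 4, and the embeddings are trained at roughly one tenth of the main learning rate.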