shareAI/llama3-8b-Chinese-Instruct-DPO-beta0.5


Baicai003 committed on May 4, 2024

Commit f1c3aed · verified · 1 Parent(s): a0ee275

Update README.md

Files changed (1)

README.md +38 -3

README.md CHANGED

@@ -1,3 +1,38 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- zh
+library_name: transformers
+tags:
+- llama
+- llama3
+---
+---
+frameworks:
+- Pytorch
+license: Apache License 2.0
+tasks:
+- chatbot
+language:
+- cn
+tags:
+- RL-tuned
+tools:
+- vllm
+---
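+GitHub: https://github.com/CrazyBoyM/llama3-Chinese-chat
+Training recipe details, shared here for others' reference:
+DPO (beta 0.5) + LoRA (rank 128, alpha 256), with the "lm_head", "input_layernorm", "post_attention_layernorm", and "norm" layers also opened up for training.
+The model prefers Chinese and emoji in its responses, without degrading the capabilities of the original instruct model.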
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/631f5b422225f12fc0f2c838/2xlWxZvN0gahckA2EPmlE.png)
+Download via Git
+```
+# Download the model with Git
+git clone https://www.modelscope.cn/baicai003/Llama3-Chinese-instruct-DPO-beta0.5.git
+```
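
Reviewer note: the recipe line in this README is terse, so here is a minimal sketch of what its numbers imply. The README does not say which training framework was used, so nothing below is the author's code. The constants come from the README; the function names, the name-matching rule, and the example parameter names (typical Llama-style names) are assumptions made for illustration. The sketch shows the standard LoRA scaling factor for rank 128 and alpha 256, which parameters the four listed module names would select, and the standard DPO loss for one preference pair at beta 0.5.

```python
import math

# Values taken from the README; every identifier here is a hypothetical label.
DPO_BETA = 0.5
LORA_RANK = 128
LORA_ALPHA = 256
FULLY_TRAINED_MODULES = ("lm_head", "input_layernorm", "post_attention_layernorm", "norm")


def lora_scaling(alpha: int, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / rank."""
    return alpha / rank


def is_fully_trained(param_name: str) -> bool:
    """True if any dotted component of the parameter name is a listed module.

    Assumed matching rule, for illustration only.
    """
    return any(part in FULLY_TRAINED_MODULES for part in param_name.split("."))


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = DPO_BETA) -> float:
    """DPO loss for one preference pair, given summed sequence log-probabilities."""
    # How much more the policy favors the chosen answer than the reference does.
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    x = beta * margin
    # -log(sigmoid(x)), computed stably as softplus(-x).
    return math.log1p(math.exp(-x)) if x >= 0 else -x + math.log1p(math.exp(x))


if __name__ == "__main__":
    print("LoRA scaling:", lora_scaling(LORA_ALPHA, LORA_RANK))
    for name in ("lm_head.weight",
                 "model.norm.weight",
                 "model.layers.0.input_layernorm.weight",
                 "model.layers.0.self_attn.q_proj.weight"):
        print(name, "->", "full training" if is_fully_trained(name) else "not in the list")
    print("DPO loss at zero margin:", dpo_loss(0.0, 0.0, 0.0, 0.0))
```

If the Hugging Face PEFT library were used, these values would most likely map to the `r`, `lora_alpha`, and `modules_to_save` fields of `LoraConfig`, but that mapping is an assumption rather than something the README states. Which layers receive the LoRA adapters themselves is also not specified in the recipe.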