dnotitia/Llama-DNA-1.0-8B-Instruct-GGUF

likejazz committed on Jan 21

Commit 007265f · verified · 1 Parent(s): 56a1103

Update README.md

Files changed (1)

README.md +0 -2

@@ -25,8 +25,6 @@ pipeline_tag: text-generation
 **DNA 1.0 8B Instruct** is a <u>state-of-the-art (**SOTA**)</u> bilingual language model based on Llama architecture, specifically optimized for Korean language understanding and generation, while also maintaining strong English capabilities. The model was developed through a sophisticated process involving model merging via spherical linear interpolation (**SLERP**) with Llama 3.1 8B Instruct, and underwent knowledge distillation (**KD**) using Llama 3.1 405B as the teacher model. It was extensively trained through continual pre-training (**CPT**) with a high-quality Korean dataset. The training pipeline was completed with supervised fine-tuning (**SFT**) and direct preference optimization (**DPO**) to align with human preferences and enhance instruction-following abilities.
-DNA 1.0 8B Instruct was fine-tuned on approximately 7B tokens of carefully curated data and has undergone extensive instruction tuning to enhance its ability to follow complex instructions and engage in natural conversations.
 <p align="center">
 <img src="assets/training-procedure.png" width="600" style="margin: 40px auto;">
 </p>
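
The README paragraph in this hunk names spherical linear interpolation (SLERP) as the merging step. As a rough illustration only, the sketch below interpolates two flat weight vectors along the arc between them. The function name `slerp`, the `eps` threshold, and the list-of-floats representation are assumptions made for this example; the commit does not show how the authors actually performed the merge, and real merges typically operate per tensor on full checkpoints.

```python
import math


def slerp(v0, v1, t, eps=1e-8):
    """Spherically interpolate between two flat weight vectors.

    Hypothetical helper for illustration; not taken from the DNA 1.0 pipeline.
    t = 0 returns v0, t = 1 returns v1.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    def lerp():
        # Plain linear interpolation, used when the angle is undefined or tiny.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # A zero vector has no direction, so fall back to linear interpolation.
    if norm0 < eps or norm1 < eps:
        return lerp()

    # Cosine of the angle between the two vectors, clamped for acos safety.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)

    # Nearly parallel (or antiparallel) vectors: SLERP weights are unstable.
    if abs(sin_omega) < eps:
        return lerp()

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]
```

Calling `slerp(weights_a, weights_b, 0.5)` on two orthogonal unit vectors yields a point halfway along the arc that still has unit length, which is the property that distinguishes SLERP from a plain average of the weights.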
