---
language:
- en
- ko
license: cc-by-nc-4.0
tags:
- dnotitia
- nlp
- llm
- slm
- conversation
- chat
- gguf
base_model:
- dnotitia/Llama-DNA-1.0-8B-Instruct
library_name: transformers
pipeline_tag: text-generation
---

# DNA 1.0 8B Instruct

**DNA 1.0 8B Instruct** is a state-of-the-art (**SOTA**) bilingual language model based on the Llama architecture, specifically optimized for Korean language understanding and generation while also maintaining strong English capabilities. The model was developed through a sophisticated process involving model merging via spherical linear interpolation (**SLERP**) with Llama 3.1 8B Instruct, and underwent knowledge distillation (**KD**) using Llama 3.1 405B as the teacher model. It was extensively trained through continual pre-training (**CPT**) with a high-quality Korean dataset. The training pipeline was completed with supervised fine-tuning (**SFT**) and direct preference optimization (**DPO**) to align with human preferences and enhance instruction-following abilities.

DNA 1.0 8B Instruct was fine-tuned on approximately 10B tokens of carefully curated data and has undergone extensive instruction tuning to enhance its ability to follow complex instructions and engage in natural conversations.

- **Developed by:** Dnotitia Inc.
- **Supported Languages:** Korean, English
- **Model Release Date:** Dec 10, 2024
- **Vocab Size:** 128,256
- **Context Length:** 131,072 tokens (128k)
- **License:** CC BY-NC 4.0
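The SLERP merge mentioned above interpolates along the great circle between two models' weight vectors instead of averaging them linearly, which better preserves weight norms. The following is a minimal illustrative sketch of SLERP on flat NumPy vectors; it is not the actual merge recipe used for this model, and the function name, `eps` threshold, and linear fallback are our own choices.

```python
import numpy as np

def slerp(w0, w1, t, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    Illustrative sketch only: real merge tools apply this per tensor
    (often with a per-layer interpolation factor t).
    """
    w0 = np.asarray(w0, dtype=np.float64)
    w1 = np.asarray(w1, dtype=np.float64)
    n0, n1 = np.linalg.norm(w0), np.linalg.norm(w1)
    # Angle between the two vectors, clipped for numerical safety
    cos_theta = np.clip(np.dot(w0, w1) / (n0 * n1 + eps), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < eps:
        # Nearly parallel vectors: fall back to linear interpolation
        return (1.0 - t) * w0 + t * w1
    s = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / s) * w0 + (np.sin(t * theta) / s) * w1
```

At `t = 0` this returns the first model's weights, at `t = 1` the second's; intermediate values trace the arc between them rather than the chord.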

## Quickstart

We offer weights in `F32` and `F16` formats, and quantized weights in `Q8_0`, `Q6_K`, `Q5_K`, `Q4_K`, `Q3_K` and `Q2_K` formats.

You can run the GGUF weights with `llama.cpp` as follows:

1. Install `llama.cpp`. Please refer to the [llama.cpp repository](https://github.com/ggerganov/llama.cpp) for more details.

2. Download the DNA 1.0 8B Instruct model in GGUF format.

   ```bash
   # Install huggingface_hub if not already installed
   $ pip install "huggingface_hub[cli]"

   # Download the GGUF weights
   $ huggingface-cli download dnotitia/Llama-DNA-1.0-8B-Instruct-GGUF \
       --include "Llama-DNA-1.0-8B-Instruct-Q8_0.gguf" \
       --local-dir .
   ```

3. Run the model with `llama.cpp` in conversational mode.

   ```bash
   $ llama-cli -cnv -m ./Llama-DNA-1.0-8B-Instruct-Q8_0.gguf \
       -p "You are a helpful assistant, Dnotitia DNA."
   ```

## Run Locally

For end users, we introduce two ways to run the DNA 1.0 8B Instruct model locally.

> **Note**
>
> We recommend using a repetition penalty not exceeding 1.0 for better generation quality.

### llama.cpp

You can run the DNA 1.0 8B Instruct model with `llama.cpp` as follows:

1. Install `llama.cpp`. Please refer to the [llama.cpp repository](https://github.com/ggerganov/llama.cpp) for more details.

2. Download the DNA 1.0 8B Instruct model in GGUF format.

   ```bash
   huggingface-cli download dnotitia/Llama-DNA-1.0-8B-Instruct-GGUF \
       --include "DNA-1.0-8B-Instruct-BF16*.gguf" \
       --local-dir .
   ```

3. Run the model with `llama.cpp` in conversational mode.

   ```bash
   llama-cli -cnv -m ./DNA-1.0-8B-Instruct-BF16.gguf \
       -p "You are a helpful assistant, Dnotitia DNA."
   ```

### Ollama

The DNA 1.0 8B Instruct model is compatible with Ollama. You can use it as follows:

1. Install Ollama. Please refer to the [Ollama repository](https://github.com/ollama/ollama) for more details.

2. Create a `Modelfile` for DNA 1.0 8B Instruct.
   ```text
   # Model path (choose appropriate GGUF weights)
   FROM ./DNA-1.0-8B-Instruct-BF16.gguf

   # Parameter values
   PARAMETER stop "<|endoftext|>"
   PARAMETER repeat_penalty 1.0
   # PARAMETER num_ctx 131072  # if you need a long context

   # Chat template
   TEMPLATE """{{- range $i, $_ := .Messages }}
   {{- $last := eq (len (slice $.Messages $i)) 1 -}}
   {{ if eq .Role "system" }}[|system|]{{ .Content }}[|endoftext|]
   {{ continue }}
   {{ else if eq .Role "user" }}[|user|]{{ .Content }}
   {{ else if eq .Role "assistant" }}[|assistant|]{{ .Content }}[|endoftext|]
   {{ end }}
   {{- if and (ne .Role "assistant") $last }}[|assistant|]{{ end }}
   {{- end -}}"""

   # System prompt
   SYSTEM """You are a helpful assistant, Dnotitia DNA."""

   # License
   LICENSE """CC BY-NC 4.0"""
   ```

3. Build the Ollama model from the `Modelfile`.

   ```bash
   ollama create dna -f Modelfile
   ```

4. Run the model with Ollama.

   ```bash
   ollama run dna
   ```
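For clients that build prompts by hand, the chat template in the `Modelfile` above can be mirrored in plain Python. This is an unofficial sketch (the function name is ours) that renders the same `[|system|]` / `[|user|]` / `[|assistant|]` turn markers and `[|endoftext|]` stop token; exact whitespace may differ from the tokenizer's canonical template, so verify against `tokenizer.apply_chat_template` before relying on it.

```python
def render_dna_prompt(messages):
    """Render a chat history with the [|role|] markers from the
    Modelfile template above (unofficial sketch, whitespace may
    differ from the canonical chat template)."""
    out = []
    for msg in messages:
        role, content = msg["role"], msg["content"]
        if role == "system":
            out.append(f"[|system|]{content}[|endoftext|]\n")
        elif role == "user":
            out.append(f"[|user|]{content}\n")
        elif role == "assistant":
            out.append(f"[|assistant|]{content}[|endoftext|]\n")
    # Mirror the template: when the last turn is not from the
    # assistant, append the assistant cue so the model starts a reply.
    if messages and messages[-1]["role"] != "assistant":
        out.append("[|assistant|]")
    return "".join(out)
```

For example, a system message plus one user turn renders as `[|system|]...[|endoftext|]`, the user turn, then a trailing `[|assistant|]` cue.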
## Limitations

While DNA 1.0 8B Instruct demonstrates strong performance, users should be aware of the following limitations:

- The model may occasionally generate biased or inappropriate content.
- Responses are based on training data and may not reflect current information.
- The model may sometimes produce factually incorrect or inconsistent answers.
- Performance may vary depending on the complexity and domain of the task.
- Generated content should be reviewed for accuracy and appropriateness.
## License

The model is released under the [CC BY-NC 4.0 license](./LICENSE). For commercial use inquiries, please [contact us](https://www.dnotitia.com/contact/post-form).
## Citation

If you use or discuss this model in your academic research, please cite the project to help spread awareness:

```bibtex
@misc{dnotitiadna2024,
  title   = {Dnotitia DNA 1.0 8B Instruct},
  author  = {Jungyup Lee and Jemin Kim and Sang Park and Seungjae Lee},
  year    = {2024},
  url     = {https://huggingface.co/dnotitia/DNA-1.0-8B-Instruct},
  version = {1.0},
}
```
## Contact

For technical support and inquiries: [Contact us](https://www.dnotitia.com/contact/post-form)