---
datasets:
- argilla/10k_prompts_ranked_mistral_large_responses
- OdiaGenAI/odia_domain_context_train_v1
- OdiaGenAI/dolly-odia-15k
- OdiaGenAI/OdiEnCorp_translation_instructions_25k
- OdiaGenAI/Odia_Alpaca_instructions_52k
- OdiaGenAI/hardcode_odia_qa_105
base_model:
- OdiaGenAI-LLM/qwen_1.5_odia_7b
---
# Qwen1.5 7B Odia Instruct

The original base model is [Qwen/Qwen1.5-7B](https://huggingface.co/Qwen/Qwen1.5-7B).
Its pretraining was then continued on Odia-language data by the [OdiaGenAI](https://huggingface.co/OdiaGenAI) organization.
Finally, the model was instruction-finetuned for 6 epochs on five Odia-language instruction datasets translated or produced by OdiaGenAI,
plus one English instruction dataset.
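
The datasets listed in the metadata above can be inspected directly with the Hugging Face `datasets` library. The sketch below loads one of them as an example; the split and column names are assumptions and may differ between the individual datasets.

```python
# A minimal sketch for inspecting one of the instruction datasets listed above.
# Assumes the dataset exposes a "train" split; column names vary per dataset.
from datasets import load_dataset

ds = load_dataset("OdiaGenAI/dolly-odia-15k", split="train")
print(ds[0])  # one instruction/response record
```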

The instruction tuning stage is documented in detail in [this tutorial](https://silogen.github.io/ai-workloads/docs/tutorials/tutorial-02-language-extension-finetune/).
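
## Usage

Below is a minimal inference sketch using the standard 🤗 Transformers causal-LM API. The repository id is a hypothetical placeholder, and the exact prompt template used during fine-tuning is described in the tutorial linked above, so treat this as an illustration rather than the canonical usage.

```python
# A minimal inference sketch, assuming the standard Transformers causal-LM API.
# MODEL_ID is a hypothetical placeholder: replace it with this repository's id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "OdiaGenAI-LLM/qwen_1.5_odia_7b_instruct"  # placeholder, not confirmed

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # assumes a bfloat16-capable GPU
    device_map="auto",
)

# An example Odia prompt ("What is the capital of Odisha?"); the prompt
# template used during fine-tuning is covered in the tutorial linked above.
prompt = "ଓଡ଼ିଶାର ରାଜଧାନୀ କ'ଣ?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```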