---
datasets:
- argilla/10k_prompts_ranked_mistral_large_responses
- OdiaGenAI/odia_domain_context_train_v1
- OdiaGenAI/dolly-odia-15k
- OdiaGenAI/OdiEnCorp_translation_instructions_25k
- OdiaGenAI/Odia_Alpaca_instructions_52k
- OdiaGenAI/hardcode_odia_qa_105
base_model:
- OdiaGenAI-LLM/qwen_1.5_odia_7b
---
# Qwen1.5 7B Odia Instruct
The base model is Qwen/Qwen1.5-7B. Its pretraining was continued on Odia-language data by the OdiaGenAI organization. Finally, the model was instruct-finetuned for 6 epochs on five Odia-language instruction datasets translated or produced by OdiaGenAI, plus one English instruction dataset.
The instruction tuning stage is documented in detail in this tutorial.
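Since the card itself carries no usage snippet, the following is a minimal sketch of how the instruct model could be queried with the Hugging Face `transformers` library. It assumes the model retains Qwen1.5's ChatML chat format; the repo id is the one listed under `base_model` above, and the prompt helper should be checked against the tokenizer's actual chat template before relying on it.

```python
def build_chatml_prompt(user_message: str) -> str:
    """Wrap a single user turn in Qwen1.5's ChatML format (assumed)."""
    return (
        "<|im_start|>user\n" + user_message + "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def generate(user_message: str, max_new_tokens: int = 128) -> str:
    # Imported lazily so the prompt helper stays usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "OdiaGenAI-LLM/qwen_1.5_odia_7b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(
        build_chatml_prompt(user_message), return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # Odia for "What is the capital of Odisha?"
    print(generate("ଓଡ଼ିଶାର ରାଜଧାନୀ କ'ଣ?"))
```

If the tokenizer ships a chat template, `tokenizer.apply_chat_template` is the safer way to build the prompt instead of hand-writing the ChatML markers.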