---
datasets:
- argilla/10k_prompts_ranked_mistral_large_responses
- OdiaGenAI/odia_domain_context_train_v1
- OdiaGenAI/dolly-odia-15k
- OdiaGenAI/OdiEnCorp_translation_instructions_25k
- OdiaGenAI/Odia_Alpaca_instructions_52k
- OdiaGenAI/hardcode_odia_qa_105
base_model:
- OdiaGenAI-LLM/qwen_1.5_odia_7b
---
# Qwen1.5 7B Odia Instruct
The base model is Qwen/Qwen1.5-7B. Its pretraining was continued on Odia-language data by the OdiaGenAI organization. Finally, the model was instruct-finetuned for 6 epochs on five Odia-language instruction datasets translated or produced by OdiaGenAI, plus one English instruction dataset.
The instruction tuning stage is documented in detail in this tutorial.
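Since the card itself carries no usage snippet, the following is a minimal sketch of how the instruct model could be queried with the Hugging Face `transformers` library. It assumes the model retains Qwen1.5's ChatML chat format; the repo id is the one listed under `base_model` above, and the prompt helper should be checked against the tokenizer's actual chat template before relying on it.

```python
def build_chatml_prompt(user_message: str) -> str:
    """Wrap a single user turn in Qwen1.5's ChatML format (assumed)."""
    return (
        "<|im_start|>user\n" + user_message + "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def generate(user_message: str, max_new_tokens: int = 128) -> str:
    # Imported lazily so the prompt helper stays usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "OdiaGenAI-LLM/qwen_1.5_odia_7b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(
        build_chatml_prompt(user_message), return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # Odia for "What is the capital of Odisha?"
    print(generate("ଓଡ଼ିଶାର ରାଜଧାନୀ କ'ଣ?"))
```

If the tokenizer ships a chat template, `tokenizer.apply_chat_template` is the safer way to build the prompt instead of hand-writing the ChatML markers.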