--- language: en tags: - llama - fine-tuning - lora - education - question-answering license: apache-2.0 models: - ibrahimBlyc/LA_Llama datasets: - ibrahimBlyc/LA_dataset_blyc library_name: transformers pipeline_tag: text-generation model_creator: ibrahimBlyc model_type: llama --- # Model Card: Fine-tuned LLaMA 3.2 Model ## Model Description This model is a fine-tuned version of LLaMA 3.2, designed specifically for tasks in the domain of **learning analytics** and **education systems improvement**. It has been trained on a carefully curated dataset that includes question-answer pairs and dialogue data, ensuring high-quality responses tailored to educational and analytical contexts. ### Key Features: - **Base Model**: LLaMA 3.2 - **Fine-tuning Approach**: Supervised fine-tuning with a question-answer structured dataset. - **Domains Covered**: Education systems, learning analytics, review/meta-analysis literature, and strategies for academic success. --- ## Training Data The fine-tuning dataset was crafted with precision to ensure the quality and relevance of the model's responses. The dataset contains thousands of entries with two primary formats: 1. **ShareGPT-style dialogues**: - Full discussions between a human and another actor (e.g., an AI) structured as interactive conversations. 2. **Alpaca-style question-answer pairs**: - Data structured with concise input and output information in a Q&A format. ### Dataset Creation Process: #### **1. Literature-Based Question-Answer Pairs:** - **Lens.org Collection**: - Papers filtered using keywords such as "review" and "meta-analysis". - Abstract sections were extracted for concise summaries of objectives, methods, and conclusions. - A Python program utilizing the Gemini API was used to generate relevant questions for each abstract. - **Data Size**: 14,000 question-answer pairs. - **Scopus.com Collection**: - Focused on the keyword "learning analytics." - An additional **8,000 question-answer pairs** were generated using the same methodology. #### **2. ChatGPT Recommendations for Education System Improvements:** - High-quality recommendations generated by ChatGPT on topics such as: - Reducing dropout rates. - Combating academic failure. - Supporting student success. - **Data Size**: 544 question-answer pairs. #### Example of Dataset: ```json [ { "instruction": "What are the key factors influencing student success?", "output": "Key factors include teacher effectiveness, parental involvement, and access to educational resources." }, { "instruction": "How can dropout rates be reduced?", "output": "Dropout rates can be reduced by implementing early intervention programs, providing mentorship opportunities, and addressing socio-economic barriers." } ] ``` ### Dataset Highlights: - Over **22,500 entries** spanning multiple sub-domains within education and learning analytics. - Data curated to ensure clarity, relevance, and high-quality question-answer pairs. --- ## Model Performance ### **Intended Use Cases** - **Education Research**: Assisting researchers and educators in analyzing learning trends and strategies. - **Learning Analytics**: Providing insights into educational systems, success factors, and intervention strategies. - **Academic Assistance**: Answering domain-specific questions in education. ### **Limitations** - The model is fine-tuned for education and learning analytics; its performance in unrelated domains may vary. - Limited coverage of topics outside the dataset's scope. --- ## Ethical Considerations - The model may reflect biases present in the training data, such as those inherent in academic literature or AI-generated content. - Users should verify critical outputs, especially in high-stakes scenarios such as policy-making or educational interventions. --- ## Citation If you use this model in your research or applications, please cite: ``` @misc{llama3_finetuned_education, title={Fine-tuned LLaMA 3.2 for Learning Analytics}, author={Ibrahim Belayachi}, year={2025}, howpublished={\url{https://huggingface.co/ibrahimBlyc/Llama_be_LA_}}, note={Fine-tuned on education and learning analytics datasets} } ``` --- ## Contact For questions or feedback, please contact Ibrahim Belayachi at ibrahim.belayachi@etu.utc.fr.