---
library_name: transformers
base_model: dbmdz/bert-base-german-uncased
license: mit
language:
- de
model-index:
- name: LernnaviBERT
results: []
---
# LernnaviBERT Model Card
LernnaviBERT is a fine-tuning of [German BERT](https://huggingface.co/dbmdz/bert-base-german-uncased) on educational text data from the Lernnavi intelligent tutoring system (ITS). It was trained with masked language modeling (MLM), following the original BERT training scheme.
### Model Sources
- **Repository:** [https://github.com/epfl-ml4ed/answer-forecasting](https://github.com/epfl-ml4ed/answer-forecasting)
- **Paper:** [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079)
### Direct Use
As a fine-tuning of a base BERT model, LernnaviBERT is suitable for all the tasks a standard BERT model supports, especially in the German-language educational domain.
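As a masked-language model, it can be probed directly with the standard `fill-mask` pipeline. A minimal sketch, assuming the checkpoint is published under the org's usual Hub ID `epfl-ml4ed/LernnaviBERT` (the example sentence is illustrative):

```python
from transformers import pipeline

# Hub ID assumed from the org's naming pattern; adjust if the repo differs.
fill_mask = pipeline("fill-mask", model="epfl-ml4ed/LernnaviBERT")

# A German educational sentence with one masked token.
text = "Die Schüler lösen eine [MASK] im Mathematikunterricht."

# Print the top-5 candidate tokens with their probabilities.
for pred in fill_mask(text, top_k=5):
    print(f"{pred['token_str']:>15}  {pred['score']:.3f}")
```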
### Downstream Use
LernnaviBERT has been further fine-tuned for [MCQ answering](https://huggingface.co/epfl-ml4ed/MCQBert) and Student Answer Forecasting (e.g. [MCQStudentBertCat](https://huggingface.co/epfl-ml4ed/MCQStudentBertCat) and [MCQStudentBertSum](https://huggingface.co/epfl-ml4ed/MCQStudentBertSum)), as described in [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079).
## Training Details
The model was trained on text data from Lernnavi, a real-world ITS: roughly 40k text pieces for 3 epochs with a batch size of 16, going from an initial perplexity of 1.21 on Lernnavi data to a final perplexity of 1.01.
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.0385 | 1.0 | 2405 | 0.0137 |
| 0.0142 | 2.0 | 4810 | 0.0084 |
| 0.0096 | 3.0 | 7215 | 0.0072 |
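The reported perplexities follow directly from the losses, since an MLM's perplexity is the exponential of its cross-entropy loss. A quick check against the final row of the table above:

```python
import math

# Perplexity = exp(cross-entropy loss); final validation loss from the table.
final_val_loss = 0.0072
perplexity = math.exp(final_val_loss)
print(round(perplexity, 2))  # → 1.01, matching the reported final perplexity
```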
## Citation
If you find this useful in your work, please cite our paper:
```
@misc{gado2024student,
  title={Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning},
  author={Elena Grazia Gado and Tommaso Martorella and Luca Zunino and Paola Mejia-Domenzain and Vinitra Swamy and Jibril Frej and Tanja Käser},
  year={2024},
  eprint={2405.20079},
  archivePrefix={arXiv},
}
```
```
Gado, E., Martorella, T., Zunino, L., Mejia-Domenzain, P., Swamy, V., Frej, J., Käser, T. (2024).
Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning.
In: Proceedings of the Conference on Educational Data Mining (EDM 2024).
```
### Framework versions
- Transformers 4.37.1
- Pytorch 2.2.0
- Datasets 2.2.1
- Tokenizers 0.15.1