datasets:
- samirmsallem/argumentative_student_peer_reviews
language:
- de
metrics:
- accuracy
base_model:
- deepset/gbert-base
pipeline_tag: text-classification
library_name: transformers
model-index:
- name: checkpoints
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: samirmsallem/argumentative_student_peer_reviews
      type: samirmsallem/argumentative_student_peer_reviews
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.7708649468892261
Text classification model for claim/premise detection in essay feedback
gbert-base-claim_premise is a German text classification model for the non-scientific domain, fine-tuned from gbert-base. It was trained on an annotated dataset of claim and premise sentences from essay feedback. The dataset was created by T. Wambsganss, C. Niklaus, M. Söllner, S. Handschuh and J. M. Leimeister and is available here.
Training
The model was fine-tuned for 10 epochs. This repository contains the checkpoint from the second epoch, which achieved the best accuracy:
Epoch | Eval Loss | Accuracy |
---|---|---|
1.0 | 0.4946 | 0.7621 |
2.0 | 0.5074 | 0.7709 |
3.0 | 0.8148 | 0.7627 |
4.0 | 1.1393 | 0.7560 |
5.0 | 1.3645 | 0.7551 |
6.0 | 1.5397 | 0.7560 |
7.0 | 1.8195 | 0.7548 |
8.0 | 2.0723 | 0.7536 |
9.0 | 2.0844 | 0.7566 |
10.0 | 2.1382 | 0.7563 |
On this dataset, the model learns to distinguish between the two classes, claim and premise, reaching an accuracy of 0.7709 at epoch 2. After that point the evaluation loss rises steadily (from 0.5074 at epoch 2 to 2.1382 at epoch 10) while accuracy stays between 0.7536 and 0.7627. This rapid onset of overfitting may indicate that the dataset is imbalanced or noisy, although the numbers here do not confirm either cause. Future work should train the model on more robust data to improve the evaluation results.
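The checkpoint choice can be reproduced from the table with a few lines of Python. This is a minimal sketch, not the original training code: the values are copied from the table above, and the helper names (`best_by_accuracy`, `best_by_loss`, `loss_growth_since`) are made up for illustration.

```python
# Evaluation results copied from the epoch table: (epoch, eval_loss, accuracy).
RESULTS = [
    (1, 0.4946, 0.7621),
    (2, 0.5074, 0.7709),
    (3, 0.8148, 0.7627),
    (4, 1.1393, 0.7560),
    (5, 1.3645, 0.7551),
    (6, 1.5397, 0.7560),
    (7, 1.8195, 0.7548),
    (8, 2.0723, 0.7536),
    (9, 2.0844, 0.7566),
    (10, 2.1382, 0.7563),
]


def best_by_accuracy(results):
    """Return the row with the highest evaluation accuracy."""
    return max(results, key=lambda row: row[2])


def best_by_loss(results):
    """Return the row with the lowest evaluation loss."""
    return min(results, key=lambda row: row[1])


def loss_growth_since(results, epoch):
    """Return final-epoch eval loss divided by the eval loss at `epoch`."""
    loss_by_epoch = {e: loss for e, loss, _ in results}
    final_epoch = max(loss_by_epoch)
    return loss_by_epoch[final_epoch] / loss_by_epoch[epoch]


if __name__ == "__main__":
    print("best accuracy:", best_by_accuracy(RESULTS))
    print("lowest eval loss:", best_by_loss(RESULTS))
    print("loss growth since epoch 2:", round(loss_growth_since(RESULTS, 2), 2))
```

Selecting by accuracy picks epoch 2, the checkpoint published here, while selecting by evaluation loss would pick epoch 1. The two criteria disagree by one epoch, so the choice of selection metric matters for this model.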
Text Classification Tags
Text Classification Tag | Text Classification Label |
---|---|
0 | CLAIM |
1 | PREMISE |
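The tag table above can be applied to model outputs as follows. This is a hedged sketch that assumes the checkpoint loads with the standard `transformers` sequence-classification classes and that PyTorch is installed. The helper names `decode` and `classify` are illustrative, and `model_id` must be supplied by the caller because this card does not state the repository id.

```python
# Tag-to-label mapping taken from the table above.
ID2LABEL = {0: "CLAIM", 1: "PREMISE"}


def decode(logits):
    """Map one row of class scores to its label by taking the argmax."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(logits)}")
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return ID2LABEL[best_index]


def classify(text, model_id):
    """Classify one German sentence as CLAIM or PREMISE.

    `model_id` is the Hub repository id or a local path of this checkpoint;
    it is an input here, not a value taken from the model card.
    """
    # Imported lazily so that `decode` works without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode(logits)
```

`decode` relies on the table rather than on the `id2label` entry in the checkpoint's config, which may or may not carry the same names. Calling `classify("...", model_id)` with a sentence from essay feedback returns either `CLAIM` or `PREMISE`.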