---
datasets:
- samirmsallem/wiki_def_de_multitask
language:
- de
base_model:
- FacebookAI/xlm-roberta-base
library_name: transformers
tags:
- science
- ner
- def_extraction
- definitions
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: checkpoints
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: samirmsallem/wiki_def_de_multitask
      type: samirmsallem/wiki_def_de_multitask
    metrics:
    - name: F1
      type: f1
      value: 0.8262004492199356
    - name: Precision
      type: precision
      value: 0.8189914550487424
    - name: Recall
      type: recall
      value: 0.8335374816266536
    - name: Loss
      type: loss
      value: 0.312337189912796
---


## NER model for definition component recognition in German scientific texts

**xlm-roberta-base-definitions_ner** is a German token classification (NER) model for the scientific domain, fine-tuned from [xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base).
It was trained on a custom annotated dataset of around 10,000 training and 2,000 test examples, containing definition and non-definition sentences drawn from German Wikipedia articles.

The model is specifically designed to recognize and classify components of definitions, using the following entity labels:
- **DF**: Definiendum (the term being defined)
- **VF**: Definitor (the verb or phrase introducing the definition)
- **GF**: Definiens (the explanation or meaning)

Training was conducted using a standard NER objective. The model achieves an F1 score of approximately 83% on the evaluation set.
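A minimal inference sketch using the `transformers` token-classification pipeline. The example sentence is illustrative only, and the exact form of the emitted labels (plain `DF`/`VF`/`GF` versus BIO-prefixed variants) depends on the label names stored in the model config:

```python
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub.
ner = pipeline(
    "token-classification",
    model="samirmsallem/xlm-roberta-base-definitions_ner",
    aggregation_strategy="simple",  # merge subword pieces into word-level spans
)

# Illustrative German definition sentence (not taken from the training data).
sentence = "Ein Algorithmus ist eine eindeutige Handlungsvorschrift zur Lösung eines Problems."

for span in ner(sentence):
    # Each span carries the predicted label, the covered text and a confidence score.
    print(f"{span['entity_group']:<4} {span['word']!r} ({span['score']:.2f})")
```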

Here are the overall final metrics on the test dataset after 5 epochs of training:
  - **f1**: 0.8262004492199356
  - **precision**: 0.8189914550487424
  - **recall**: 0.8335374816266536
  - **loss**: 0.312337189912796
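The card does not state the exact evaluation script, but span-level precision, recall and F1 for token classification are typically computed with `seqeval` over BIO-tagged sequences; a small sketch with made-up gold and predicted tags:

```python
from seqeval.metrics import f1_score, precision_score, recall_score

# Hypothetical gold and predicted tag sequences in BIO format; the reported
# numbers above come from the held-out test split, not from this toy example.
y_true = [["B-DF", "I-DF", "B-VF", "B-GF", "I-GF", "O"]]
y_pred = [["B-DF", "I-DF", "B-VF", "B-GF", "O", "O"]]

print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
print("f1:       ", f1_score(y_true, y_pred))
```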


## Model Performance Comparison on wiki_definitions_de_multitask

| Model | Precision | Recall | F1 Score | Eval Samples per Second | Epoch |
| --- | --- | --- | --- | --- | --- |
| [distilbert-base-multilingual-cased-definitions_ner](https://huggingface.co/samirmsallem/distilbert-base-multilingual-cased-definitions_ner/) | 80.76 | 81.74 | 81.25 | **457.53** | 5.0 |
| [scibert_scivocab_cased-definitions_ner](https://huggingface.co/samirmsallem/scibert_scivocab_cased-definitions_ner) | 80.54 | 82.11 | 81.32 | 236.61 | 4.0 |
| [GottBERT_base_best-definitions_ner](https://huggingface.co/samirmsallem/GottBERT_base_best-definitions_ner) | **82.98** | 82.81 | 82.90 | 272.26 | 5.0 |
| [xlm-roberta-base-definitions_ner](https://huggingface.co/samirmsallem/xlm-roberta-base-definitions_ner) | 81.90 | 83.35 | 82.62 | 241.21 | 5.0 |
| [gbert-base-definitions_ner](https://huggingface.co/samirmsallem/gbert-base-definitions_ner) | 82.73 | **83.56** | **83.14** | 278.87 | 5.0 |
| [gbert-large-definitions_ner](https://huggingface.co/samirmsallem/gbert-large-definitions_ner) | 80.67 | 83.36 | 81.99 | 109.83 | 2.0 |