yqnis committed on
Commit
bc678b4
·
verified ·
1 Parent(s): d18d34e

Update README.md

Files changed (1)
  1. README.md +16 -7
README.md CHANGED
@@ -16,8 +16,8 @@ The model was trained to extract entities from French biomedical sentences using
 
  ## Dataset
 
- The original dataset is Quaero French Medical Corpus.
- It was converted to a JSON format for generative instruction-style training.
 
  ```json
  {
@@ -26,13 +26,22 @@ It was converted to a JSON format for generative instruction-style training.
  }
  ```
 
  ## Evaluation
 
  Evaluation was performed on the test split comparing 2 sets, the predicted one and the ground truth one:
 
- Metric | Score
- ----------------------
- Precision | 0.6482
- Recall | 0.6951
- F1 Score | 0.6709
 
 
  ## Dataset
 
+ The original dataset is the QUAERO French Medical Corpus, which I converted to a JSON format for generative instruction-style training.
+ I used "<>" as a separator, and the output format is: 'TAG_1 entity_1 <> TAG_2 entity_2 <> ... <> TAG_n entity_n'.
 
  ```json
  {
 
  }
  ```
 
+ The QUAERO French Medical Corpus features **overlapping entity spans**, including nested structures, for instance:
+ ```json
+ {
+ "input": "Cancer du pancréas",
+ "output": "DISO Cancer <> DISO Cancer du pancréas <> ANAT pancréas"
+ }
+ ```
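For readers who want to consume the `output` string, here is a minimal sketch of a parser for the `TAG entity <> TAG entity` format described above. The function name `parse_output` is hypothetical and not part of the repository; the sketch assumes the tag is always the first whitespace-delimited token of each segment and that `<>` never occurs inside an entity.

```python
def parse_output(output: str, sep: str = "<>") -> list[tuple[str, str]]:
    """Split a generated string into (tag, entity) pairs.

    Assumption: each segment looks like "TAG entity text", with the tag
    as the first token. Empty segments are skipped.
    """
    pairs = []
    for chunk in output.split(sep):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split once on the first space: left part is the tag, rest is the entity.
        tag, _, entity = chunk.partition(" ")
        pairs.append((tag, entity.strip()))
    return pairs


example = "DISO Cancer <> DISO Cancer du pancréas <> ANAT pancréas"
print(parse_output(example))
# [('DISO', 'Cancer'), ('DISO', 'Cancer du pancréas'), ('ANAT', 'pancréas')]
```

Because nested spans such as `Cancer` and `Cancer du pancréas` are kept as separate pairs, overlapping entities survive the round trip.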
+
+
  ## Evaluation
 
  Evaluation was performed on the test split comparing 2 sets, the predicted one and the ground truth one:
 
+ | Metric | Score |
+ | --------- | ------ |
+ | Precision | 0.6482 |
+ | Recall | 0.6951 |
+ | F1 Score | 0.6709 |
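The set comparison mentioned in the Evaluation section can be sketched as follows. This is an illustration only: the helper name `set_prf` is hypothetical, entities are assumed to be represented as `(tag, entity)` pairs, and the README does not say how scores were aggregated over the test split, so the sketch scores a single predicted set against a single ground-truth set.

```python
def set_prf(predicted: set, gold: set) -> tuple[float, float, float]:
    """Precision, recall and F1 between a predicted set and a gold set."""
    true_positives = len(predicted & gold)  # items present in both sets
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, not taken from the actual test split.
gold = {("DISO", "Cancer"), ("DISO", "Cancer du pancréas"), ("ANAT", "pancréas")}
predicted = {("DISO", "Cancer du pancréas"), ("ANAT", "pancréas"), ("ANAT", "Cancer")}

p, r, f1 = set_prf(predicted, gold)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

Using sets means a pair counts as correct only on an exact match of both tag and entity text, and duplicate predictions are counted once.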