yqnis/llama3-8b-quaero


yqnis committed on Apr 2 · Commit bc678b4 (verified) · 1 Parent(s): d18d34e

Update README.md

Files changed (1)

README.md +16 -7


@@ -16,8 +16,8 @@ The model was trained to extract entities from French biomedical sentences using
 ## Dataset
-The original dataset is Quaero French Medical Corpus.
-It was converted to a JSON format for generative instruction-style training.
 ```json
 {
@@ -26,13 +26,22 @@ It was converted to a JSON format for generative instruction-style training.
 }
 ```
 ## Evaluation
 Evaluation was performed on the test split comparing 2 sets, the predicted one and the ground truth one :
-Metric     |    Score
-----------------------
-Precision  |   0.6482
-Recall     |   0.6951
-F1 Score   |   0.6709

 ## Dataset
+The original dataset is the Quaero French Medical Corpus, which I converted to a JSON format for generative instruction-style training.
+I used "<>" as a separator, so each output has the form: 'TAG_1 entity_1 <> TAG_2 entity_2 <> ... <> TAG_n entity_n'.
 ```json
 {
 }
 ```
+The Quaero French Medical Corpus features **overlapping entity spans**, including nested structures. For instance:
+```json
+{
+  "input": "Cancer du pancréas",
+  "output": "DISO Cancer <> DISO Cancer du pancréas <> ANAT pancréas"
+}
+```
 ## Evaluation
 Evaluation was performed on the test split comparing 2 sets, the predicted one and the ground truth one :
+| Metric    | Score  |
+| --------- | ------ |
+| Precision | 0.6482 |
+| Recall    | 0.6951 |
+| F1 Score  | 0.6709 |
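
For reviewers who want to sanity-check the output format and the set-based scoring described in the README, here is a minimal Python sketch. It is not the author's evaluation script: the function names `parse_entities` and `precision_recall_f1` are hypothetical, the predicted string is made up for illustration, and the README does not say whether scores are computed per sentence or over the whole test split, so the sketch scores a single example only.

```python
from typing import Set, Tuple

SEPARATOR = "<>"  # separator stated in the README


def parse_entities(output: str) -> Set[Tuple[str, str]]:
    """Parse 'TAG_1 entity_1 <> TAG_2 entity_2' into a set of (tag, entity) pairs."""
    entities = set()
    for chunk in output.split(SEPARATOR):
        chunk = chunk.strip()
        if not chunk:
            continue
        # The first whitespace-delimited token is the tag; the rest is the entity text.
        tag, _, text = chunk.partition(" ")
        text = text.strip()
        if text:  # skip malformed chunks that carry a tag but no entity
            entities.add((tag, text))
    return entities


def precision_recall_f1(
    predicted: Set[Tuple[str, str]], gold: Set[Tuple[str, str]]
) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 between two sets of (tag, entity) pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Ground truth taken from the README example; the prediction is hypothetical.
gold = parse_entities("DISO Cancer <> DISO Cancer du pancréas <> ANAT pancréas")
pred = parse_entities("DISO Cancer du pancréas <> ANAT pancréas <> ANAT Cancer")
precision, recall, f1 = precision_recall_f1(pred, gold)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

Treating entities as a set of (tag, entity) pairs handles nested spans such as "Cancer" inside "Cancer du pancréas" naturally, since each pair is compared independently. It does collapse repeated mentions of the same entity within one sentence, which may or may not match the original evaluation.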