Update README.md
README.md
CHANGED
@@ -14,12 +14,29 @@ This model is a 16-bit merged version of [unsloth/Meta-Llama-3.1-8B-Instruct](ht
The model was trained to extract entities from French biomedical sentences (medlines) using a structured, prompt-based format.
## Dataset
The original dataset is the Quaero French Medical Corpus, which I converted to a JSON format for generative instruction-style training.
-I used `<>` as a separator and the format is : `'TAG_1 entity_1 <> TAG_2 entity_2 <> ... <> TAG_n entity_n'`.
-
The model was trained to extract entities from French biomedical sentences (medlines) using a structured, prompt-based format.
+| Tag    | Description                                                 |
+| ------ | ----------------------------------------------------------- |
+| `DISO` | **Diseases** or health-related conditions                   |
+| `ANAT` | **Anatomical parts** (organs, tissues, body regions, etc.)  |
+| `PROC` | **Medical or surgical procedures**                          |
+| `DEVI` | **Medical devices or instruments**                          |
+| `CHEM` | **Chemical substances or medications**                      |
+| `LIVB` | **Living beings** (e.g. humans, animals, bacteria, viruses) |
+| `GEOG` | **Geographical locations** (e.g. countries, regions)        |
+| `OBJC` | **Physical objects** not covered by other categories        |
+| `PHEN` | **Biological processes** (e.g. inflammation, mutation)      |
+| `PHYS` | **Physiological functions** (e.g. respiration, vision)      |
+
+I used `<>` as a separator and the output format is:
+
+```
+TAG_1 entity_1 <> TAG_2 entity_2 <> ... <> TAG_n entity_n
+```
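
As a rough illustration of the `<>`-separated output format, the Python sketch below splits an output string into `(tag, entity)` pairs. It is not part of the model's tooling: the function name `parse_entities` and the sample string are hypothetical, and the sketch assumes each chunk starts with one of the tags from the table, followed by a space and the entity text.

```python
from typing import List, Tuple

# Tags taken from the README's tag table.
KNOWN_TAGS = {"DISO", "ANAT", "PROC", "DEVI", "CHEM",
              "LIVB", "GEOG", "OBJC", "PHEN", "PHYS"}


def parse_entities(output: str) -> List[Tuple[str, str]]:
    """Split 'TAG_1 entity_1 <> TAG_2 entity_2 ...' into (tag, entity) pairs."""
    pairs = []
    for chunk in output.split("<>"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip empty chunks, e.g. from a trailing separator
        # The first whitespace-delimited token is the tag; the rest is the entity.
        tag, _, entity = chunk.partition(" ")
        entity = entity.strip()
        # Assumption: chunks with an unknown tag or no entity text are dropped.
        if tag in KNOWN_TAGS and entity:
            pairs.append((tag, entity))
    return pairs


# Hypothetical output string, made up for illustration (not a real prediction).
sample = "CHEM prazosine <> LIVB patients <> DISO hypertendus"
print(parse_entities(sample))
```

Dropping malformed chunks silently is one possible design choice; a stricter variant could raise an error or collect them for inspection, since generative models can emit tags outside the expected set.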
## Dataset
The original dataset is the Quaero French Medical Corpus, which I converted to a JSON format for generative instruction-style training.
```json
{
  "input": "Etude de l'efficacité et de la tolérance de la prazosine à libération prolongée chez des patients hypertendus et diabétiques non insulinodépendants.",