yqnis/llama3-8b-quaero-lora

@@ -1,22 +1,83 @@
 ---
-base_model: unsloth/meta-llama-3.1-8b-instruct-unsloth-bnb-4bit
 tags:
 - text-generation-inference
 - transformers
 - unsloth
 - llama
 - trl
 license: apache-2.0
 language:
-- en
 ---
-# Uploaded  model
-- **Developed by:** yqnis
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/meta-llama-3.1-8b-instruct-unsloth-bnb-4bit
-This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
+base_model: unsloth/Meta-Llama-3.1-8B-Instruct
 tags:
 - text-generation-inference
 - transformers
 - unsloth
 - llama
 - trl
+- qlora
 license: apache-2.0
 language:
+- fr
 ---
+# Llama 3.1 8B fine-tuned on QUAERO for Named Entity Recognition (Generative)
+This is a **LoRA adapter** for [unsloth/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/unsloth/Meta-Llama-3.1-8B-Instruct), fine-tuned on the [QUAERO French Medical Corpus](https://quaerofrenchmed.limsi.fr/) using a **generative approach to Named Entity Recognition (NER)**.
+## Task
+The model was trained to extract entities from French biomedical sentences (MEDLINE titles) using a structured, prompt-based format. It recognizes the following ten entity types:
+| Tag    | Description                                                 |
+| ------ | ----------------------------------------------------------- |
+| `DISO` | **Diseases** or health-related conditions                   |
+| `ANAT` | **Anatomical parts** (organs, tissues, body regions, etc.)  |
+| `PROC` | **Medical or surgical procedures**                          |
+| `DEVI` | **Medical devices or instruments**                          |
+| `CHEM` | **Chemical substances or medications**                      |
+| `LIVB` | **Living beings** (e.g. humans, animals, bacteria, viruses) |
+| `GEOG` | **Geographical locations** (e.g. countries, regions)        |
+| `OBJC` | **Physical objects** not covered by other categories        |
+| `PHEN` | **Biological processes** (e.g. inflammation, mutation)      |
+| `PHYS` | **Physiological functions** (e.g. respiration, vision)      |
+Each entity is written as its tag followed by its text. I used `<>` as the separator between entities, so the output format is:
+```
+TAG_1 entity_1 <> TAG_2 entity_2 <> ... <> TAG_n entity_n
+```
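Downstream code usually needs the generated string turned back into structured pairs. The following is a minimal sketch of such a parser, not the code used for this model: the function name `parse_output` is hypothetical, and silently skipping chunks with an unknown tag or an empty entity is an assumption about how malformed generations should be handled.

```python
SEPARATOR = "<>"

# The ten tags listed in the table above.
TAGS = {"DISO", "ANAT", "PROC", "DEVI", "CHEM",
        "LIVB", "GEOG", "OBJC", "PHEN", "PHYS"}


def parse_output(text):
    """Split a generated string into a list of (tag, entity) pairs.

    Assumption: chunks with an unknown tag or no entity text are dropped.
    """
    pairs = []
    for chunk in text.split(SEPARATOR):
        chunk = chunk.strip()
        if not chunk:
            continue
        # The tag is the first whitespace-delimited token; the rest is the entity.
        tag, _, entity = chunk.partition(" ")
        entity = entity.strip()
        if tag in TAGS and entity:
            pairs.append((tag, entity))
    return pairs


if __name__ == "__main__":
    print(parse_output("CHEM prazosine <> LIVB patients <> DISO hypertendus"))
```

Keeping the result as a list preserves generation order; converting it to a `set` gives the form needed for exact-match scoring.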
+## Dataset
+The original dataset is the QUAERO French Medical Corpus, which I converted to JSON for generative, instruction-style training. Each record pairs an input sentence with its target output string:
+```json
+{
+  "input": "Etude de l'efficacité et de la tolérance de la prazosine à libération prolongée chez des patients hypertendus et diabétiques non insulinodépendants.",
+  "output": "DISO tolérance <> CHEM prazosine <> LIVB patients <> DISO hypertendus <> DISO diabétiques non insulinodépendants"
+}
+```
+The QUAERO French Medical Corpus features **overlapping entity spans**, including nested structures. For instance:
+```json
+{
+  "input": "Cancer du pancréas",
+  "output": "DISO Cancer <> DISO Cancer du pancréas <> ANAT pancréas"
+}
+```
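One way to produce such records from span annotations is sketched below. This is an illustration under stated assumptions rather than the conversion script actually used: the helpers `find_span` and `to_record` are hypothetical, spans are assumed to be contiguous character offsets with an exclusive end, and ordering entities by `(start, end)` is an assumption that happens to reproduce the nested example above.

```python
import json


def find_span(sentence, text, tag):
    """Build a (start, end, tag) span for the first occurrence of `text`."""
    start = sentence.index(text)
    return (start, start + len(text), tag)


def to_record(sentence, spans):
    """Turn character-offset spans into an {"input", "output"} record.

    Assumption: entities are ordered by start offset, then by end offset,
    so an enclosing span follows a shorter span that starts at the same place.
    """
    ordered = sorted(spans, key=lambda s: (s[0], s[1]))
    output = " <> ".join(f"{tag} {sentence[start:end]}"
                         for start, end, tag in ordered)
    return {"input": sentence, "output": output}


sentence = "Cancer du pancréas"
spans = [
    find_span(sentence, "pancréas", "ANAT"),
    find_span(sentence, "Cancer du pancréas", "DISO"),
    find_span(sentence, "Cancer", "DISO"),
]
record = to_record(sentence, spans)

if __name__ == "__main__":
    print(json.dumps(record, ensure_ascii=False, indent=2))
```

Because each entity is serialized independently, overlapping and nested spans need no special handling in this format.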
+## Evaluation
+Evaluation was performed on the test split by comparing the predicted entity set against the ground truth annotations using exact (type, entity) matching.
+| Metric    | Score  |
+| --------- | ------ |
+| Precision | 0.6827 |
+| Recall    | 0.7121 |
+| F1 Score  | 0.6971 |
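For reference, exact (type, entity) matching over entity sets can be computed as in the sketch below. The function name `prf1` is hypothetical, and pooling counts across all sentences (micro-averaging) is an assumption; the scores above may have been aggregated differently.

```python
def prf1(gold, pred):
    """Micro-averaged precision, recall and F1.

    gold, pred: parallel lists with one set of (tag, entity) pairs per sentence.
    A prediction counts as correct only if both tag and entity text match exactly.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # predicted and annotated
        fp += len(p - g)   # predicted but not annotated
        fn += len(g - p)   # annotated but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


gold = [{("DISO", "Cancer"), ("DISO", "Cancer du pancréas"), ("ANAT", "pancréas")}]
pred = [{("DISO", "Cancer du pancréas"), ("ANAT", "pancréas"), ("CHEM", "pancréas")}]

if __name__ == "__main__":
    print(prf1(gold, pred))
```

In this toy case two of the three predictions are correct and one gold entity is missed, so precision, recall and F1 all equal 2/3. Note that a correct entity text with the wrong tag counts as both a false positive and a false negative.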
+## Other formats
+This model is also available in the following formats:
+- **16-bit**
+  → [yqnis/llama3-8b-quaero](https://huggingface.co/yqnis/llama3-8b-quaero)
+- **GGUF Q8_0**
+  → [yqnis/llama3-8b-quaero-gguf](https://huggingface.co/yqnis/llama3-8b-quaero-gguf)
+This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.