---
language: es
datasets:
- squad_es
- hackathon-pln-es/biomed_squad_es_v2
metrics:
- "f1"
---
# roberta-base-biomedical-clinical-es for QA
This model was trained as part of the "Extractive QA Biomedicine" project developed during the 2022 [Hackathon](https://somosnlp.org/hackathon) organized by SOMOS NLP.
## Motivation
Recent research has made available Spanish language models trained on biomedical corpora. This project explores using these models to build extractive question answering (QA) models for biomedicine and compares their effectiveness with that of general-purpose masked language models.
The models trained during the [Hackathon](https://somosnlp.org/hackathon) were:
- [hackathon-pln-es/roberta-base-bne-squad2-es](https://huggingface.co/hackathon-pln-es/roberta-base-bne-squad2-es)
- [hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es](https://huggingface.co/hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es)
- [hackathon-pln-es/roberta-base-biomedical-es-squad2-es](https://huggingface.co/hackathon-pln-es/roberta-base-biomedical-es-squad2-es)
- [hackathon-pln-es/biomedtra-small-es-squad2-es](https://huggingface.co/hackathon-pln-es/biomedtra-small-es-squad2-es)
## Description
This model is a fine-tuned version of [PlanTL-GOB-ES/roberta-base-biomedical-clinical-es](https://huggingface.co/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es) on the [squad_es (v2)](https://huggingface.co/datasets/squad_es) training dataset.
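
A minimal usage sketch follows, assuming this card describes the checkpoint published as `hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es` and that the `transformers` library is installed. The helper names `load_qa` and `answer` are hypothetical. Because the training data follows the SQuAD v2 format, which includes unanswerable questions, the sketch passes `handle_impossible_answer=True` so the pipeline may return an empty answer.

```python
# Assumption: this model ID is the checkpoint the card describes.
MODEL_ID = "hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es"


def load_qa(model_id=MODEL_ID):
    """Build a question-answering pipeline (downloads the model on first use)."""
    # Imported lazily so that answer() can be used with any compatible callable.
    from transformers import pipeline

    return pipeline("question-answering", model=model_id, tokenizer=model_id)


def answer(qa, question, context):
    """Return the extracted span, or None when the model predicts 'no answer'."""
    result = qa(question=question, context=context, handle_impossible_answer=True)
    # With handle_impossible_answer=True the pipeline reports an empty string
    # when the best-scoring prediction is "no answer".
    return result["answer"] or None


# Example call (the context and question are made up for illustration):
# qa = load_qa()
# context = "La aspirina se utiliza para reducir la fiebre y aliviar el dolor leve."
# print(answer(qa, "¿Para qué se utiliza la aspirina?", context))
```

Keeping the pipeline as an argument of `answer` makes it easy to reuse one loaded model across many questions.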
## Hyperparameters
The hyperparameters were chosen based on those used in [PlanTL-GOB-ES/roberta-base-bne-sqac](https://huggingface.co/PlanTL-GOB-ES/roberta-base-bne-sqac), a Spanish QA model trained on a dataset in SQuAD v1 format.
```
--num_train_epochs 2
--learning_rate 3e-5
--weight_decay 0.01
--max_seq_length 386
--doc_stride 128
```
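
As a rough illustration of how `--max_seq_length` and `--doc_stride` interact, the sketch below computes the overlapping windows a long context would be split into. It rests on assumptions not stated in this card: that the flags were consumed by a script in the style of the Hugging Face `run_qa.py` example, where `doc_stride` is the number of context tokens shared by consecutive windows, and that a RoBERTa question/context pair uses 4 special tokens. `window_spans` and its parameters are hypothetical names, and the model is simplified (it ignores tokenizer details such as padding).

```python
def window_spans(n_context_tokens, n_question_tokens,
                 max_seq_length=386, doc_stride=128, n_special_tokens=4):
    """Return (start, end) context-token spans, one per overlapping window."""
    # Room left for context tokens once the question and special tokens are placed.
    capacity = max_seq_length - n_question_tokens - n_special_tokens
    if capacity <= doc_stride:
        raise ValueError("window capacity must exceed doc_stride")

    # Each new window starts so that it repeats the last doc_stride tokens
    # of the previous window.
    step = capacity - doc_stride
    spans = []
    start = 0
    while True:
        end = min(start + capacity, n_context_tokens)
        spans.append((start, end))
        if end == n_context_tokens:
            break
        start += step
    return spans


# A 1000-token context with a 22-token question, using the values listed above.
for span in window_spans(n_context_tokens=1000, n_question_tokens=22):
    print(span)
```

Under these assumptions, an answer that straddles a window boundary still appears whole in at least one window as long as it is shorter than the overlap.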
## Performance
All four models were evaluated on the [hackathon-pln-es/biomed_squad_es_v2](https://huggingface.co/datasets/hackathon-pln-es/biomed_squad_es_v2) dev set.
|Model |Base Model Domain|exact |f1 |HasAns_exact|HasAns_f1|NoAns_exact|NoAns_f1|
|--------------------------------------------------------------|-----------------|-------|-------|------------|---------|-----------|--------|
|hackathon-pln-es/roberta-base-bne-squad2-es |General |67.6341|75.6988|53.7367 |70.0526 |81.2174 |81.2174 |
|hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es|Biomedical |66.8426|75.2346|53.0249 |70.0031 |80.3478 |80.3478 |
|hackathon-pln-es/roberta-base-biomedical-es-squad2-es |Biomedical |67.6341|74.5612|47.6868 |61.7012 |87.1304 |87.1304 |
|hackathon-pln-es/biomedtra-small-es-squad2-es |Biomedical |34.4767|44.3294|45.3737 |65.307 |23.8261 |23.8261 |
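
The columns are related. Under the SQuAD v2 evaluation convention, the overall `exact` and `f1` scores are averages over all questions, so each is a weighted mean of its HasAns and NoAns counterparts. `NoAns_exact` and `NoAns_f1` coincide because an unanswerable question scores either 1 (empty prediction) or 0 on both metrics. The sketch below uses only values copied from the table to recover the implied share of answerable questions in each row and to rebuild the overall F1 from it. `has_answer_share` and `weighted_mean` are hypothetical helpers.

```python
# (exact, f1, HasAns_exact, HasAns_f1, NoAns_exact, NoAns_f1), copied from the table.
ROWS = {
    "roberta-base-bne-squad2-es": (67.6341, 75.6988, 53.7367, 70.0526, 81.2174, 81.2174),
    "roberta-base-biomedical-clinical-es-squad2-es": (66.8426, 75.2346, 53.0249, 70.0031, 80.3478, 80.3478),
    "roberta-base-biomedical-es-squad2-es": (67.6341, 74.5612, 47.6868, 61.7012, 87.1304, 87.1304),
    "biomedtra-small-es-squad2-es": (34.4767, 44.3294, 45.3737, 65.307, 23.8261, 23.8261),
}


def has_answer_share(overall, has_ans, no_ans):
    """Solve overall = p * has_ans + (1 - p) * no_ans for p."""
    return (no_ans - overall) / (no_ans - has_ans)


def weighted_mean(p, has_ans, no_ans):
    """Combine the two subset scores using the share p of answerable questions."""
    return p * has_ans + (1 - p) * no_ans


for name, (exact, f1, has_exact, has_f1, no_exact, no_f1) in ROWS.items():
    p = has_answer_share(exact, has_exact, no_exact)
    rebuilt_f1 = weighted_mean(p, has_f1, no_f1)
    print(f"{name}: share={p:.4f}, f1 in table={f1}, f1 rebuilt={rebuilt_f1:.4f}")
```

If the rows are internally consistent, the share should be nearly identical across models, since all were scored on the same dev set.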
## Team
Santiago Maximo: [smaximo](https://huggingface.co/smaximo)