Medgemma-pt-finetuned-multiclinsum
Model Description:
This model is a fine-tuned version of Google's MedGemma, specialized for abstractive summarization of clinical case reports in Portuguese. It was developed as part of our submission to the MultiClinSum 2025 shared task (Portuguese track), organized under the BioASQ Lab at CLEF.
Despite its compact size (4B parameters), the model achieved strong semantic alignment with expert-generated summaries, as measured by BERTScore, and overall competitive results when compared to larger instruction-tuned models evaluated in zero-shot settings.
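A minimal inference sketch is shown below. It assumes the repository hosts a LoRA adapter loadable with PEFT (suggested by the framework version listed under Training Details); if the weights are instead merged into a full checkpoint, transformers' AutoModelForCausalLM can be used in place of AutoPeftModelForCausalLM. The Portuguese prompt wording is illustrative, not the exact instruction used for the shared-task submission.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model_id = "pucpr-br/medgemma-pt-finetuned-multiclinsum"

# Load the LoRA adapter together with its MedGemma base model
# (assumes an adapter-style repository; see the note above).
model = AutoPeftModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)  # the repo is assumed to ship its tokenizer files

case_report = "Paciente do sexo masculino, 62 anos, admitido com dor torácica ..."  # full case report text

# Illustrative prompt; the instruction used during fine-tuning is not documented on this card.
messages = [{"role": "user", "content": f"Resuma o seguinte relato de caso clínico:\n\n{case_report}"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(input_ids=inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```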
Training Details:
- Base model: https://huggingface.co/unsloth/medgemma-4b-it
- Dataset: Subset of the MultiClinSum Portuguese gold dataset (542 examples for training, 50 for validation)
- Framework: Transformers + PEFT + LoRA via Unsloth (an illustrative fine-tuning sketch follows this list)
- PEFT version: 0.15.2
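The exact fine-tuning configuration is not reported here, so the sketch below shows one plausible Unsloth + LoRA setup for the listed base model. The LoRA rank, learning rate, epoch count, sequence length, and the chat-style formatting of the training examples are assumptions, not the values used for the submission, and the SFTTrainer argument names may vary slightly across trl versions.

```python
from unsloth import FastLanguageModel
from datasets import Dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit and attach LoRA adapters (all hyperparameters are illustrative).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/medgemma-4b-it",
    max_seq_length=4096,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)

# Placeholder for the 542 MultiClinSum gold training pairs, rendered in Gemma's chat format
# (the actual prompt/response template used for training is not documented on this card).
train_dataset = Dataset.from_list([
    {
        "text": "<start_of_turn>user\nResuma o relato de caso clínico:\n...<end_of_turn>\n"
                "<start_of_turn>model\n...resumo de referência...<end_of_turn>"
    }
])

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=3,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```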
Use Cases:
- Clinical case summarization (Portuguese)
- Biomedical NLP research
- Low-resource summarization studies (an evaluation sketch follows this list)
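For the research-oriented use cases above, generated summaries can be compared against reference summaries with BERTScore, the metric cited in the model description. A minimal sketch with the bert-score package follows; the backbone model and configuration used in the official shared-task evaluation are not specified here.

```python
from bert_score import score

candidates = ["Resumo gerado pelo modelo para o relato de caso ..."]
references = ["Resumo de referência escrito por especialistas ..."]

# lang="pt" selects the package's default multilingual backbone for Portuguese.
P, R, F1 = score(candidates, references, lang="pt")
print(f"BERTScore F1: {F1.mean().item():.4f}")
```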
Limitations:
- Performance may vary outside the clinical case report domain
- Output quality is sensitive to prompt design
- Trained on a small dataset (542 gold examples), which may limit generalization
License:
This model is fine-tuned from MedGemma, which is released under a non-commercial, research-only license by Google DeepMind.
As such, this model inherits the same licensing restrictions: it is intended for research purposes only and must not be used for commercial applications.
How to cite:
Citation information will be added soon.