SPECTER2-base Multilabel Horizon Clusters Classifier
This model is based on SPECTER2-base, fine-tuned for multilabel classification of scientific publications into Horizon Europe clusters.
Model Description
- Base model: allenai/specter2_base
- Task: Multilabel classification (assigns one or more clusters per document)
- Labels: 6 Horizon Europe clusters (see below)
- Languages: English
- Input: Title and abstract concatenated
Training Details
- Training framework: Hugging Face Transformers (Trainer)
- Batch size: 4
- Learning rate: 2e-5
- Epochs: 6
- Optimizer: AdamW with weight decay 0.01
- Loss: Binary Cross-Entropy with Logits
- Best model selection: F1-score on validation set
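The settings above could be wired into the Trainer roughly as follows. This is a minimal sketch, not the author's training script: `output_dir`, the per-epoch evaluation and save strategies, and the `build_trainer` helper are assumptions, and the datasets are assumed to be already tokenized, padded, and labelled with float multi-hot vectors. Setting `problem_type="multi_label_classification"` is what makes the BERT sequence-classification head use Binary Cross-Entropy with Logits.

```python
# Hyperparameters stated in this card, plus clearly marked assumptions.
TRAINING_KWARGS = {
    "per_device_train_batch_size": 4,
    "learning_rate": 2e-5,
    "num_train_epochs": 6,
    "weight_decay": 0.01,            # Trainer defaults to an AdamW optimizer
    "load_best_model_at_end": True,
    "metric_for_best_model": "f1",   # compute_metrics must return an "f1" key
    "greater_is_better": True,
    # Assumptions below: not stated in the card.
    "output_dir": "horizon-clusters-classifier",
    "eval_strategy": "epoch",        # named evaluation_strategy in older releases
    "save_strategy": "epoch",
}

MODEL_KWARGS = {
    "num_labels": 6,
    # Selects BCEWithLogitsLoss inside the sequence-classification head.
    "problem_type": "multi_label_classification",
}


def build_trainer(train_dataset, eval_dataset, compute_metrics):
    """Hypothetical helper: assemble a Trainer from the settings above."""
    # Imported lazily so the dictionaries can be inspected without transformers.
    from transformers import (
        AutoModelForSequenceClassification,
        Trainer,
        TrainingArguments,
    )

    model = AutoModelForSequenceClassification.from_pretrained(
        "allenai/specter2_base", **MODEL_KWARGS
    )
    args = TrainingArguments(**TRAINING_KWARGS)
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        compute_metrics=compute_metrics,
    )
```

Calling `build_trainer(train_ds, val_ds, compute_metrics).train()` would then run the six epochs and reload the checkpoint with the best validation F1.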
Clusters (Labels)
- Civil Security for Society
- Climate, Energy and Mobility
- Culture, Creativity and Inclusive Society
- Digital, Industry and Space
- Food, Bioeconomy, Natural Resources, Agriculture and Environment
- Health
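A possible inference flow, given the input format and labels described above, is sketched here. Several details are assumptions rather than facts from this card: the separator between title and abstract (the SPECTER2 convention of joining them with the tokenizer's separator token is used), the 0.5 decision threshold, the label order in `LABELS`, and that the checkpoint loads with `AutoModelForSequenceClassification`. The repository id is taken from the model tree on this page.

```python
import math

# Assumed label order, copied from the list in this card. Prefer
# model.config.id2label whenever the checkpoint provides it.
LABELS = [
    "Civil Security for Society",
    "Climate, Energy and Mobility",
    "Culture, Creativity and Inclusive Society",
    "Digital, Industry and Space",
    "Food, Bioeconomy, Natural Resources, Agriculture and Environment",
    "Health",
]


def build_input(title, abstract, sep_token="[SEP]"):
    """Join title and abstract with a separator token (assumed convention)."""
    return f"{title.strip()}{sep_token}{(abstract or '').strip()}"


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def decode_logits(logits, labels=LABELS, threshold=0.5):
    """Score each label independently and keep those at or above threshold."""
    probs = [sigmoid(z) for z in logits]
    return [(lab, round(p, 3)) for lab, p in zip(labels, probs) if p >= threshold]


def predict(title, abstract, model_id="nicolauduran45/horizon-clusters-classifier"):
    """Run the classifier on one document (downloads the checkpoint)."""
    # Imported lazily so the helpers above work without torch or transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    text = build_input(title, abstract, tokenizer.sep_token)
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    labels = [model.config.id2label[i] for i in range(len(logits))]
    return decode_logits(logits, labels)
```

Because the task is multilabel, each label gets its own sigmoid rather than a shared softmax, so a document can receive several clusters or none. `predict("Some title", "Some abstract")` would return a list of `(label, probability)` pairs.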
Evaluation Metrics
Epoch | Training Loss | Validation Loss | F1 | ROC AUC | Accuracy |
---|---|---|---|---|---|
1 | No log | 0.1774 | 0.910 | 0.9368 | 0.766 |
2 | 0.0606 | 0.1849 | 0.921 | 0.9454 | 0.787 |
3 | 0.0351 | 0.2071 | 0.919 | 0.9434 | 0.787 |
4 | 0.0180 | 0.2191 | 0.921 | 0.9451 | 0.793 |
5 | 0.0093 | 0.2295 | 0.921 | 0.9451 | 0.793 |
6 | 0.0060 | 0.2307 | 0.921 | 0.9451 | 0.793 |
Best epoch: 6 (F1 and accuracy tied with epochs 4 and 5; last improvement at epoch 4)
- Final validation loss: 0.2307
- Final F1: 0.9212
- Final ROC AUC: 0.9451
- Final Accuracy: 0.7927
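The gap between F1 (about 0.92) and accuracy (about 0.79) is what one would expect if F1 is micro-averaged over individual label decisions while accuracy counts a document as correct only when its whole label set matches. The card does not state which averaging was used, so the sketch below is an illustration under that assumption, with made-up toy data and hypothetical helper names.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over every (document, label) decision."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def subset_accuracy(y_true, y_pred):
    """Fraction of documents whose full label vector is predicted exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


# Toy multi-hot data (not from the model): one label is missed in the second row.
Y_TRUE = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
Y_PRED = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]

print(f"micro F1:        {micro_f1(Y_TRUE, Y_PRED):.3f}")
print(f"subset accuracy: {subset_accuracy(Y_TRUE, Y_PRED):.3f}")
```

A single missed label costs one false negative in micro F1 but marks the entire document as wrong for subset accuracy, which is why the second number is lower.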
Per-Category Classification Report
Label | Precision | Recall | F1-score | Support |
---|---|---|---|---|
Civil Security for Society | 0.97 | 0.79 | 0.87 | 39 |
Climate, Energy and Mobility | 0.94 | 0.91 | 0.93 | 91 |
Culture, Creativity and Inclusive Society | 0.89 | 0.88 | 0.88 | 96 |
Digital, Industry and Space | 0.93 | 0.92 | 0.93 | 214 |
Food, Bioeconomy, Natural Resources, Agriculture and Environment | 0.89 | 0.97 | 0.93 | 75 |
Health | 0.96 | 0.96 | 0.96 | 73 |
micro avg | 0.93 | 0.91 | 0.92 | 588 |
macro avg | 0.93 | 0.91 | 0.92 | 588 |
weighted avg | 0.93 | 0.91 | 0.92 | 588 |
samples avg | 0.91 | 0.92 | 0.90 | 588 |
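As a sanity check, the macro and weighted averages can be recomputed from the per-label rows. The sketch below uses only the rounded values printed in the table, so results may differ from the reported averages in the last digit; the helper names are illustrative.

```python
# (precision, recall, f1, support) per label, copied from the table above.
REPORT = {
    "Civil Security for Society": (0.97, 0.79, 0.87, 39),
    "Climate, Energy and Mobility": (0.94, 0.91, 0.93, 91),
    "Culture, Creativity and Inclusive Society": (0.89, 0.88, 0.88, 96),
    "Digital, Industry and Space": (0.93, 0.92, 0.93, 214),
    "Food, Bioeconomy, Natural Resources, Agriculture and Environment": (0.89, 0.97, 0.93, 75),
    "Health": (0.96, 0.96, 0.96, 73),
}


def macro_avg(report, col):
    """Unweighted mean of one metric column across labels."""
    values = [row[col] for row in report.values()]
    return sum(values) / len(values)


def weighted_avg(report, col):
    """Mean of one metric column, weighted by each label's support."""
    total = sum(row[3] for row in report.values())
    return sum(row[col] * row[3] for row in report.values()) / total


total_support = sum(row[3] for row in REPORT.values())
print("total support:", total_support)
for name, col in (("precision", 0), ("recall", 1), ("f1", 2)):
    print(f"{name}: macro={macro_avg(REPORT, col):.3f} "
          f"weighted={weighted_avg(REPORT, col):.3f}")
```

The macro average treats the 39-document Civil Security cluster the same as the 214-document Digital, Industry and Space cluster, while the weighted average lets the larger clusters dominate.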
License
This model is licensed under the Apache License 2.0.
Model tree for nicolauduran45/horizon-clusters-classifier: base model allenai/specter2_base