odsbahia/odsbahia-ptbr

This model (ODSBahia-PTBR) classifies texts according to the 17 United Nations Sustainable Development Goals (SDGs). It also covers the three new SDGs introduced in the document “2030 Agenda Guide: integrating SDGs, Education and Society”, prepared in 2020 in partnership between the University of Brasília (UnB) and the State University of São Paulo (UNESP): "Racial Equality (SDG 18)", "Art, Culture and Communication (SDG 19)", and "Rights of Original Peoples and Traditional Communities (SDG 20)".

Image sources: https://gtagenda2030.org.br/ and https://raizesds.com.br/pt/novos-ods/

Model Description

This text classification model was developed by fine-tuning the neuralmind/bert-base-portuguese-cased pre-trained model on training data collected specifically for this task. It was built as part of academic research at the Federal University of Southern Bahia in Brazil, with the goal of creating a Transformer-based text classifier for the 20 SDGs that anyone can use. The main model details are highlighted below:

  • Model type: Text classification
  • Language(s) (NLP): Portuguese
  • License: MIT
  • Fine-tuned from model: neuralmind/bert-base-portuguese-cased

Direct Use

This is a fine-tuned model and therefore does not require additional training.

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("odsbahia/odsbahia-ptbr")
model = AutoModelForSequenceClassification.from_pretrained("odsbahia/odsbahia-ptbr")
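
Once the tokenizer and model are loaded, a prediction comes from tokenizing a text, running a forward pass, and ranking the output logits. The following is a minimal sketch of that flow, not the authors' own inference code. It assumes a single-label classification head (softmax over classes); if the checkpoint was trained as multi-label, a per-class sigmoid would be the appropriate choice instead. The helper names `softmax`, `top_k`, and `classify` are hypothetical, and the label names are read from `model.config.id2label`, whose contents are not documented in this card.

```python
import math


def softmax(logits):
    """Turn raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def classify(text, model_id="odsbahia/odsbahia-ptbr", k=3):
    """Run one text through the model and rank the predicted classes."""
    # Heavy dependencies are imported here so the helpers above stay usable alone.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # max_length=128 mirrors MAX_LEN from the training hyperparameters.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Label names come from the checkpoint's config (assumed to be populated).
    return top_k(logits, model.config.id2label, k)
```

A call such as `classify("Acesso à água potável e saneamento para comunidades rurais.")` (an illustrative sentence, not taken from the training data) would return the three highest-scoring classes with their probabilities.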

Training Data

The training data includes texts from a wide variety of sources and academic research fields, so this fine-tuned model is not tailored to any specific sector.

Training Hyperparameters

  • SEED = 42
  • MAX_LEN = 128
  • BATCH_SIZE = 16
  • NUM_EPOCHS = 34
  • LEARNING_RATE = 9.46e-05
  • WEIGHT_DECAY_RATE = 1.21e-05
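
The card does not say which training loop produced these values, so the following is only a sketch of how they could be collected in one place and, under the assumption that the Hugging Face `Trainer` is used, mapped onto `TrainingArguments` keyword names. The class `FineTuneConfig` and its methods are hypothetical, and the dataset size in the usage note is invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters exactly as listed in the model card."""

    seed: int = 42
    max_len: int = 128  # applied at tokenization time, not by the trainer
    batch_size: int = 16
    num_epochs: int = 34
    learning_rate: float = 9.46e-05
    weight_decay_rate: float = 1.21e-05

    def trainer_kwargs(self):
        """Map the values onto TrainingArguments keyword names (assumed mapping)."""
        return {
            "seed": self.seed,
            "per_device_train_batch_size": self.batch_size,
            "num_train_epochs": self.num_epochs,
            "learning_rate": self.learning_rate,
            "weight_decay": self.weight_decay_rate,
        }

    def total_steps(self, num_examples):
        """Optimizer steps for num_examples texts, without gradient accumulation."""
        steps_per_epoch = math.ceil(num_examples / self.batch_size)
        return steps_per_epoch * self.num_epochs


config = FineTuneConfig()
```

For a hypothetical training set of 1,000 texts, `config.total_steps(1000)` gives 63 steps per epoch over 34 epochs, that is 2,142 optimizer steps.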

Evaluation

Metrics

  • Accuracy = 0.98 on the OSDG-CD dataset
  • Precision = 0.89
  • Recall = 0.70
  • F1-Score = 0.79
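
The card does not state how precision, recall, and F1 were averaged across the SDG classes. As one possible reading, the sketch below computes accuracy and macro-averaged scores for single-label predictions in plain Python; the function names and the toy labels are hypothetical and are not the authors' evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def macro_scores(y_true, y_pred):
    """Macro-averaged (precision, recall, F1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example: integers stand in for SDG numbers (illustrative only).
gold = [1, 1, 1, 2, 2, 3]
pred = [1, 1, 2, 2, 2, 3]
macro_p, macro_r, macro_f1 = macro_scores(gold, pred)
```

In this toy example macro precision and macro recall are both 8/9, while macro F1 is about 0.867, so a macro F1 need not equal the harmonic mean of the macro precision and recall reported next to it.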

Citation

Santos, Êmeris S., & Moraes, L. E. (2024). ODSBAHIA-PTBR: A Natural Language Processing Model to Support Sustainable Development Goals. Revista De Gestão Social E Ambiental, 18(12), e010230. https://doi.org/10.24857/rgsa.v18n12-039

Model Card Contact

[email protected]

Model size: 109M params (Safetensors, tensor type F32)