SetFit with ppsingh/TAPP-multilabel-mpnet

This is a SetFit model that can be used for Text Classification. This SetFit model uses ppsingh/TAPP-multilabel-mpnet as the Sentence Transformer embedding model. A SetFitHead instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.

Model Details

Model Description

Model Sources

Model Labels

Label Examples
NEGATIVE
  • '(p 70-1).Antigua and Barbuda’s 2021 update to the first Nationally Determined Contribution the most vulnerable in society have been predominantly focused on adaptation measures like building resilience to flooding and hurricanes. The updated NDC ambition provides an opportunity to focus more intently on enabling access to energy efficiency and renewable energy for the most vulnerable, particularly women who are most affected when electricity is not available since the grid is down after an extreme weather event. Nationally, Antigua and Barbuda intends to utilize the SIRF Fund as a mechanism primarily to catalyse and leverage investment in the transition for NGOs, MSMEs and informal sectors that normally cannot access traditional local commercial financing due to perceived high risks.'
  • 'The transport system cost will be increased by 16.2% compared to the BAU level. Electric trucks and electric pick-ups will account for the highest share of investment followed by electric buses and trucks. In the manufacturing industries, the energy efficiency improvement in the heating and the motor systems and the deployment of CCS require the highest investment in the non-metallic and the chemical industries in 2050. The manufacturing industries system cost will be increased by 15.3% compared to the BAU level.'
  • 'Figure 1-9: Total GHG emissions by sector (excluding LULUCF) 2000 and 2016 1.2.2 Greenhouse Gas Emission by Sector • Energy Total direct GHG emissions from the Energy sector in 2016 were estimated to be 253,895.61 eq. The majority of GHG emissions in the Energy sector were generated by fuel combustion, consisting mostly of grid-connected electricity and heat production at around eq (42.84%). GHG emissions from Transport, Manufacturing Industries and Construction, and other sectors were 68,260.17 GgCO2 eq eq (6.10%), respectively. Fugitive Emissions from fuel eq or a little over 4.33% of total GHG emissions from the Energy sector. Details of GHG emissions in the Energy sector by gas type and source in 2016 are presented in Figure 1-10. Source: Thailand Third Biennial Update Report, UNFCCC 2020.'
TARGET
  • 'DNPM, NFA,. Cocoa. Board,. Spice Board,. Provincial. gov-ernments. in the. Momase. region. Ongoing -. 2025. 340. European Union. Support committed. Priority Sector: Health. By 2030, 100% of the population benefit from introduced health measures to respond to malaria and other climate-sensitive diseases in PNG. Action or Activity. Indicator. Status. Lead. Implementing. Agencies. Supporting. Agencies. Time Frame. Budget (USD). Funding Source. (Existing/Potential). Other Support. Improve vector control. measures, with a priority. of all households having. access to a long-lasting. insecticidal net (LLIN).'
  • 'Conditionality: With national effort it is intended to increase the attention to vulnerable groups in case of disasters and/or emergencies up to 50% of the target and 100% of the target with international cooperation. Description: In this goal, it is projected to increase coverage from 33% to 50% (211,000 families) of agricultural insurance in attention to the number of families, whose crops were affected by various adverse weather events (flood, drought, frost, hailstorm, among others), in addition to the implementation of comprehensive actions for risk management and adaptation to Climate Change.'
  • 'By 2030, upgrade watershed health and vitality in at least 20 districts to a higher condition category. By 2030, create an inventory of wetlands in Nepal and sustainably manage vulnerable wetlands. By 2025, enhance the sink capacity of the landuse sector by instituting the Forest Development Fund (FDF) for compensation of plantations and forest restoration. Increase growing stock including Mean Annual Increment in Tarai, Hills and Mountains. Afforest/reforest viable public and private lands, including agroforestry.'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("ppsingh/iki_target_setfit")
# Run inference
preds = model("In the oil sector, the country has benefited from 372 million dollars for the reduction of gas flaring at the initiative (GGFR - \"Global Gas Flaring Reduction\") of the World Bank after having adopted in November 2015 a national reduction plan flaring and associated gas upgrading. In the electricity sector, the NDC highlights the development of hydroelectricity which should make it possible to cover 80% of production in 2025, the remaining 20% ​​being covered by gas and other renewable energies.")

Training Details

Training Set Metrics

Training set Min Median Max
Word count 58 116.6632 508
Label Training Sample Count
NEGATIVE 51
TARGET 44

Training Hyperparameters

  • batch_size: (8, 2)
  • num_epochs: (1, 0)
  • max_steps: -1
  • sampling_strategy: undersampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.0018 1 0.3343 -
0.1783 100 0.0026 0.1965
0.3565 200 0.0001 0.1995
0.5348 300 0.0001 0.2105
0.7130 400 0.0001 0.2153
0.8913 500 0.0 0.1927

Training Results Classifier

  • Classes Representation in Test Data: Target: 9, Negative: 8
  • F1-score: 87.8%
  • Accuracy: 88.2%

Environmental Impact

Carbon emissions were measured using CodeCarbon.

  • Carbon Emitted: 0.006 kg of CO2
  • Hours Used: 0.185 hours

Training Hardware

  • On Cloud: No
  • GPU Model: 1 x Tesla T4
  • CPU Model: Intel(R) Xeon(R) CPU @ 2.00GHz
  • RAM Size: 12.67 GB

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.3.1
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.3.0
  • Tokenizers: 0.15.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Downloads last month
9
Safetensors
Model size
109M params
Tensor type
F32
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for ppsingh/iki_target_setfit

Finetuned
(1)
this model