---
pipeline_tag: text-classification
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- generated_from_trainer
metrics:
- accuracy
datasets:
- dair-ai/emotion
language:
- en
model-index:
- name: results
  results: []
---
|
|
# results

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [dair-ai/emotion](https://huggingface.co/datasets/dair-ai/emotion) dataset. It achieves the following results on the evaluation set:

- Loss: 0.1699
- Accuracy: 0.941
|

## Model Description

This model is a fine-tuned version of [DistilBERT-base-uncased](https://huggingface.co/distilbert-base-uncased), tailored for emotion recognition in text. It classifies input text into one of six emotion categories: sadness, joy, love, anger, fear, and surprise. Fine-tuning was performed on the [dair-ai/emotion](https://huggingface.co/datasets/dair-ai/emotion) dataset, which contains 20,000 labeled text-emotion pairs. Because DistilBERT is a smaller, faster variant of BERT, the model remains efficient while delivering strong performance on emotion classification tasks.

- **Model Type**: Text Classification
- **Base Model**: [DistilBERT-base-uncased](https://huggingface.co/distilbert-base-uncased)
- **Fine-Tuning Task**: Emotion Recognition (6 classes)
- **Languages**: English
- **License**: [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
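
A minimal usage sketch is shown below. The repo id `YonasMersha/results` is an assumption inferred from the demo Space owner and the model name; substitute this model's actual Hub path. Label names in the output depend on the `id2label` mapping saved with the checkpoint.

```python
from transformers import pipeline

# Hypothetical repo id -- replace with this model's actual Hub path.
classifier = pipeline("text-classification", model="YonasMersha/results")

print(classifier("I can't stop smiling today!"))
# e.g. [{'label': 'joy', 'score': 0.99}] -- label strings depend on the saved id2label mapping
```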
|

## Intended Uses & Limitations

### Intended Uses

- **Emotion Classification**: Classify text into one of six emotions: sadness, joy, love, anger, fear, or surprise.
- **Sentiment Analysis**: Infer sentiment (e.g., joy as positive, anger as negative) from predicted emotions, though the model was not explicitly trained for this purpose.
- **Chatbots and Virtual Assistants**: Enhance conversational AI by detecting user emotions for empathetic responses.
- **Content Moderation**: Identify content carrying strong emotions, such as anger or fear, for moderation purposes.

### Limitations

- **Emotion Granularity**: Restricted to six emotions, potentially missing nuanced or complex emotional states.
- **Contextual Understanding**: May struggle with sarcasm, irony, or emotions requiring deeper context.
- **Language**: Trained on English text only, with limited performance on other languages.
- **Dataset Bias**: Performance may reflect biases in the training data, such as underrepresentation of certain emotional expressions.
- **Short Texts**: Suboptimal performance on very short inputs (e.g., single words) due to limited context.
|

## Training and Evaluation Data

The model was fine-tuned on the [dair-ai/emotion](https://huggingface.co/datasets/dair-ai/emotion) dataset, comprising 20,000 English text samples labeled with one of six emotions:

- **0**: sadness
- **1**: joy
- **2**: love
- **3**: anger
- **4**: fear
- **5**: surprise

The dataset is divided as follows:

- **Training Set**: 16,000 samples
- **Validation Set**: 2,000 samples
- **Test Set**: 2,000 samples

Note that the class distribution is imbalanced: joy and sadness are far more frequent than love and surprise, so per-class performance may vary accordingly.
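
For reference, the dataset and its label names can be inspected with the `datasets` library:

```python
from datasets import load_dataset

# The default config provides the 16k/2k/2k train/validation/test split.
dataset = load_dataset("dair-ai/emotion")

print(dataset)  # DatasetDict with train/validation/test splits
print(dataset["train"].features["label"].names)
# ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']
```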
|

## Demo

Try the model in action [here](https://huggingface.co/spaces/YonasMersha/emotion-classifier).

## Training Procedure

### Preprocessing

- **Tokenization**: Text was tokenized with the DistilBERT tokenizer, using a maximum sequence length of 512 tokens. Padding and truncation ensured uniform input sizes (see the sketch after this list).
- **Data Formatting**: Converted to PyTorch tensors for training compatibility.
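
A minimal preprocessing sketch consistent with the description above, continuing from the `dataset` loaded earlier. Padding everything to the full 512 tokens is one option; dynamic padding via a data collator would also work:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Pad/truncate each example to a fixed 512-token length.
    return tokenizer(batch["text"], padding="max_length", truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)
# Expose the columns the Trainer needs as PyTorch tensors.
tokenized.set_format("torch", columns=["input_ids", "attention_mask", "label"])
```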
|

### Training Hyperparameters

Fine-tuning was conducted using the Hugging Face `Trainer` API with:

- **Epochs**: 3
- **Batch Size**: 16 (training), 64 (evaluation)
- **Learning Rate**: 5e-05
- **Optimizer**: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08
- **Weight Decay**: 0.01
- **LR Scheduler**: linear, with 500 warmup steps
- **Seed**: 42
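
A sketch of how these settings map onto `TrainingArguments`. The `output_dir` and the per-epoch evaluation strategy are assumptions; the card does not state them:

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6
)

args = TrainingArguments(
    output_dir="results",  # assumed; matches the model name
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    num_train_epochs=3,
    weight_decay=0.01,
    warmup_steps=500,
    lr_scheduler_type="linear",
    logging_steps=10,
    eval_strategy="epoch",  # assumed evaluation cadence
    seed=42,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    compute_metrics=compute_metrics,  # defined in the sketch below
)
trainer.train()
```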
|

### Training Process

- **Loss Function**: Cross-entropy loss for multi-class classification.
- **Evaluation Metric**: Accuracy on the validation set (a sketch of the metric function follows this list).
- **Training Duration**: 3 epochs, with logging every 10 steps.
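
A sketch of an accuracy metric function for the `Trainer`. Using the `evaluate` library here is an assumption; the card does not state how accuracy was computed:

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) tuple produced by the Trainer.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```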
|
|

### Training results

The run's final evaluation metrics are the ones reported at the top of this card:

| Eval Loss | Eval Accuracy |
|:---------:|:-------------:|
| 0.1699 | 0.941 |
|

### Framework versions

- Transformers 4.51.3
- PyTorch 2.6.0+cu124
- Datasets 3.5.1
- Tokenizers 0.21.1