# Turkish Sentiment ModernBERT-large

This is ModernBERT-large fine-tuned for Turkish sentiment analysis. It was trained on the winvoker/turkish-sentiment-analysis-dataset and classifies Turkish text as positive, negative, or neutral.

## Model Overview

  • Model Type: ModernBERT-large (encoder-only Transformer in the BERT family)
  • Task: Sentiment Analysis
  • Language: Turkish
  • Dataset: winvoker/turkish-sentiment-analysis-dataset
  • Labels: Positive, Negative, Neutral
  • Fine-Tuning: Fine-tuned from ModernBERT-large for three-class sentiment classification.

## Performance Metrics

The model was trained for 4 epochs with the following results:

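| Epoch | Training Loss | Validation Loss | Accuracy | F1 Score |
|-------|---------------|-----------------|----------|----------|
| 1     | 0.2884        | 0.1133          | 95.72%   | 92.18%   |
| 2     | 0.1759        | 0.1050          | 96.24%   | 93.33%   |
| 3     | 0.0633        | 0.1233          | 96.14%   | 93.19%   |
| 4     | 0.0623        | 0.1213          | 96.14%   | 93.19%   |
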
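  • Training Loss: How well the model fits the training data (lower is better).
  • Validation Loss: The same loss on held-out data, indicating how well the model generalizes.
  • Accuracy: Percentage of correct predictions over all examples.
  • F1 Score: The harmonic mean of precision and recall, so it penalizes both false positives and false negatives.

Validation loss is lowest at epoch 2 (0.1050), which is also where accuracy and F1 peak. Training loss keeps falling in epochs 3 and 4 while validation loss stays above its epoch 2 value, which suggests mild overfitting after epoch 2.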
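To make the accuracy and F1 definitions concrete, here is a minimal sketch in plain Python. It assumes macro averaging across the three classes, which this card does not specify (the reported F1 may use a different averaging mode), and the toy labels are hypothetical rather than taken from the evaluation set.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (harmonic mean of precision and recall)."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # missed the true class t
    scores = []
    for c in classes:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# Hypothetical toy labels, not taken from the model's evaluation set
y_true = ["Positive", "Negative", "Neutral", "Positive", "Negative", "Neutral"]
y_pred = ["Positive", "Negative", "Positive", "Positive", "Neutral", "Neutral"]
print(f"accuracy={accuracy(y_true, y_pred):.4f}  macro_f1={macro_f1(y_true, y_pred):.4f}")
```

Accuracy and F1 can diverge when classes are imbalanced, which may explain why the F1 column in the table sits a few points below accuracy.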

## Model Inference Example

The model can be loaded with the transformers library and run on a batch of Turkish sentences:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the fine-tuned model and its tokenizer
model_name = "bayrameker/Turkish-sentiment-ModernBERT-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example texts for prediction
texts = ["bu ürün çok iyi", "bu ürün berbat"]

# Tokenize the inputs as one padded batch
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class for each text
predictions = torch.argmax(logits, dim=-1)
# Assumed index order; verify against model.config.id2label and adjust if needed
labels = ["Negative", "Neutral", "Positive"]
for text, pred in zip(texts, predictions):
    print(f"Text: {text} -> Sentiment: {labels[pred.item()]}")
```

Example Output:

```text
Text: bu ürün çok iyi -> Sentiment: Positive
Text: bu ürün berbat -> Sentiment: Negative
```
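The label names in the inference example depend on the index order stored with the checkpoint. A possible way to avoid hard-coding that order is to decode predictions through an `id2label` mapping and report a softmax confidence alongside each label. The sketch below is plain Python so it runs without downloading the model; the mapping and logits shown are hypothetical placeholders, and `decode` and `softmax` are illustrative helper names.

```python
import math

def softmax(row):
    """Convert one row of logits into probabilities (numerically stable)."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logit_rows, id2label):
    """Return (label, confidence) for each row of logits."""
    results = []
    for row in logit_rows:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((id2label[best], probs[best]))
    return results

# Hypothetical mapping and logits for illustration only
id2label = {0: "Negative", 1: "Neutral", 2: "Positive"}
logit_rows = [[-2.0, 0.0, 3.0], [2.5, 0.5, -1.0]]
for label, confidence in decode(logit_rows, id2label):
    print(f"{label} ({confidence:.2%})")
```

With the real model, `id2label` would come from `model.config.id2label` and `logit_rows` from `logits.tolist()`. If the checkpoint only stores generic names such as `LABEL_0`, the mapping still has to be supplied by hand.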

## Installation

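To use this model, install the following dependencies. The inference example needs only transformers and torch; datasets is needed only if you also want to load the training data.

```bash
pip install transformers
pip install torch
pip install datasets
```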
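ModernBERT is a recent architecture, so an older transformers install may not recognize it. The following sketch checks the installed version against a minimum. The threshold of 4.48.0 is an assumption (to my knowledge the first release with ModernBERT support) and should be confirmed against the transformers release notes; `version_tuple` and `meets_minimum` are illustrative helper names.

```python
from importlib import metadata

# Assumed minimum release with ModernBERT support; confirm against the release notes
MIN_TRANSFORMERS = "4.48.0"

def version_tuple(version):
    """Turn '4.48.0' or '4.49.0.dev0' into a comparable tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'dev0' or 'rc1'
        parts.append(int(piece))
    return tuple(parts)

def meets_minimum(installed, minimum=MIN_TRANSFORMERS):
    """True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)

try:
    installed = metadata.version("transformers")
    status = "ok" if meets_minimum(installed) else f"older than {MIN_TRANSFORMERS}, consider upgrading"
    print(f"transformers {installed}: {status}")
except metadata.PackageNotFoundError:
    print("transformers is not installed")
```

If the check reports an older version, `pip install --upgrade transformers` should bring in a release that can load the model.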

## Model Card

  • Model Name: Turkish-sentiment-ModernBERT-large
  • Hugging Face Repo: bayrameker/Turkish-sentiment-ModernBERT-large
  • License: MIT
  • Author: Bayram Eker
  • Date: 2024-12-21

## Training Details

  • Model: ModernBERT-large
  • Framework: PyTorch
  • Training Time: Approximately 50 minutes (4 epochs)
  • Batch Size: 64
  • Learning Rate: 8e-5
  • Optimizer: AdamW
  • Mixed Precision: bf16 (trained on an A100 GPU)
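The hyperparameters above can be written down as a configuration. The sketch below expresses them as keyword arguments for the Hugging Face `TrainingArguments` class, assuming the Trainer API was used, which this card does not confirm. The output directory is a hypothetical name, and treating the batch size of 64 as per-device is also an assumption.

```python
# Hyperparameters from the list above, written as keyword arguments that could
# be passed to transformers.TrainingArguments(**training_kwargs).
training_kwargs = {
    "output_dir": "turkish-sentiment-modernbert-large",  # hypothetical path
    "num_train_epochs": 4,
    "per_device_train_batch_size": 64,  # assumes the batch size of 64 is per device
    "learning_rate": 8e-5,
    "optim": "adamw_torch",  # assumes PyTorch's AdamW implementation
    "bf16": True,  # needs hardware with bfloat16 support, such as the A100
}

for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```

Keeping the values in a plain dictionary makes them easy to log or compare between runs before they are handed to the trainer.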

## Acknowledgments

  • The model was trained on the winvoker/turkish-sentiment-analysis-dataset.
  • Special thanks to the Hugging Face community and the contributors to the transformers library.
  • Thanks to all contributors of the dataset and pretrained models.

## Future Work

  • Extend the model to finer-grained sentiment tasks (e.g., additional sentiment classes, aspect-based sentiment analysis).
  • Fine-tune the model on a larger, more diverse dataset for better generalization across various domains.

## License

This model is licensed under the MIT License. See the LICENSE file for more details.
