🧠 Sentiment Analysis with Logistic Regression

This model performs multi-class sentiment analysis on tweets, classifying them into the following categories:

  • Positive
  • Negative
  • Neutral
  • Irrelevant

It uses a pipeline with three stages, two for feature extraction and one for classification:

  • CountVectorizer
  • TF-IDF transformation
  • Logistic Regression classifier (max_iter=1000)

🏗️ Model Architecture

  • CountVectorizer: Converts tweets into token count vectors.
  • TfidfTransformer: Reweights token counts by inverse document frequency, so tokens that appear in many tweets count for less.
  • LogisticRegression: Interpretable and robust classification baseline.
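
The three stages listed above can be chained with scikit-learn's `Pipeline`. The following is a minimal sketch, not the author's training script: the step names (`counts`, `tfidf`, `clf`), the `build_pipeline` helper, and the toy tweets are assumptions made for illustration. Only `max_iter=1000` is taken from this card.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_pipeline():
    """Chain counting, TF-IDF reweighting, and classification (hypothetical helper)."""
    return Pipeline([
        ("counts", CountVectorizer()),               # raw text -> token counts
        ("tfidf", TfidfTransformer()),               # counts -> TF-IDF weights
        ("clf", LogisticRegression(max_iter=1000)),  # max_iter taken from this card
    ])


# Toy examples, made up for illustration; they are not the training data.
texts = [
    "I love this game so much",
    "best update ever, great work",
    "this patch is terrible and broken",
    "worst release, I hate it",
    "the servers restart at noon",
    "patch notes are posted on the site",
    "my cat slept all day",
    "looking for a pizza place nearby",
]
labels = [
    "Positive", "Positive",
    "Negative", "Negative",
    "Neutral", "Neutral",
    "Irrelevant", "Irrelevant",
]

pipeline = build_pipeline()
pipeline.fit(texts, labels)

# The fitted pipeline accepts raw strings, just like the saved model in Usage.
prediction = pipeline.predict(["great update, love it"])
print(prediction[0])
```

Because the vectorizer lives inside the pipeline, the same object handles both tokenization and prediction, which is why a raw string can be passed straight to `predict`.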

🧪 Evaluation

Evaluated on a separate validation set of 999 tweets:

| Class      | Precision | Recall | F1-score |
|------------|-----------|--------|----------|
| Irrelevant | 0.88      | 0.85   | 0.87     |
| Negative   | 0.87      | 0.94   | 0.91     |
| Neutral    | 0.97      | 0.86   | 0.91     |
| Positive   | 0.89      | 0.94   | 0.91     |

Overall accuracy: 0.90
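
For readers who want to see how figures of this kind are derived, here is a small sketch that computes per-class precision, recall, and F1 plus overall accuracy from two label lists. The helper names and the toy labels are assumptions for illustration; they are not the validation data, and the card does not say which tool produced the numbers above. scikit-learn's `classification_report` reports the same quantities.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1   # predicted this label, but it was wrong
            fn[truth] += 1   # missed the true label
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels, made up for illustration.
y_true = ["Positive", "Negative", "Neutral", "Positive", "Irrelevant", "Negative"]
y_pred = ["Positive", "Negative", "Positive", "Positive", "Irrelevant", "Neutral"]

metrics = per_class_metrics(y_true, y_pred)
for label, (p, r, f1) in metrics.items():
    print(f"{label:<11} {p:.2f} {r:.2f} {f1:.2f}")
print(f"accuracy    {accuracy(y_true, y_pred):.2f}")
```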

📦 Usage

```python
import joblib

model = joblib.load("sentiment_model_lr.pkl")
user_input = "This update is surprisingly good!"

prediction = model.predict([user_input])
print(prediction[0])  # → Positive, Negative, etc.
```

> ⚠️ Requires scikit-learn 1.6.1+ to avoid version mismatch warnings.
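
One way to check the installed version before loading the pickle is sketched below. It uses only the standard library; the `parse_release` and `meets_minimum` helpers are hypothetical names introduced here, and the `1.6.1` floor is simply the value stated in the note above.

```python
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (1, 6, 1)  # the floor stated in this card


def parse_release(version_string):
    """Turn '1.6.1' or '1.7.0rc1' into a 3-tuple of leading integers."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version_string, minimum=MINIMUM):
    """True if the release part of version_string is at least `minimum`."""
    return parse_release(version_string) >= minimum


try:
    installed = version("scikit-learn")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("scikit-learn is not installed")
elif not meets_minimum(installed):
    print(f"scikit-learn {installed} is older than 1.6.1; loading the model may warn")
else:
    print(f"scikit-learn {installed} satisfies the stated requirement")
```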


📚 Dataset

Tweets were preprocessed using a clean_text routine and labeled into
the four sentiment categories. If you'd like to experiment or re-train, contact
the author or fork this repo.
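
The card does not show what `clean_text` does, so the following is only a guess at a typical tweet-cleaning routine, written for readers who want a starting point when re-training. Every step in it (lowercasing, removing URLs and @mentions, dropping punctuation, collapsing whitespace) is an assumption, not the author's implementation.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(text):
    """Hypothetical cleaner: the real routine may differ in every step."""
    text = text.lower()
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @mentions
    text = NON_ALNUM_RE.sub(" ", text)  # drop punctuation, emoji, '#' signs
    return WHITESPACE_RE.sub(" ", text).strip()


cleaned = clean_text("Loving the NEW update!! @devteam https://t.co/abc #gaming")
print(cleaned)
```

If the saved model was trained on cleaned text, applying the same cleaning to new inputs before calling `predict` would keep training and inference consistent. Whether that step is needed here depends on how the author built the pipeline.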

🧑‍💻 Author

  • Built by: @arshvir
  • Model version: 1.0
  • License: MIT
