Sentiment Analysis with Logistic Regression
This model performs multi-class sentiment analysis on tweets, classifying them into the following categories:
- Positive
- Negative
- Neutral
- Irrelevant
It uses a custom preprocessing pipeline with:
- CountVectorizer
- TF-IDF transformation
- Logistic Regression classifier (`max_iter=1000`)
Model Architecture
- CountVectorizer: Converts tweets into token count vectors.
- TfidfTransformer: Reweights tokens by importance.
- LogisticRegression: Interpretable and robust classification baseline.
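The three stages above can be chained with scikit-learn's `Pipeline`. The following is a minimal sketch, not the author's training script: the step names, the toy tweets, and every setting other than `max_iter=1000` are assumptions made for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Step names ("counts", "tfidf", "clf") are illustrative, not taken from the model.
pipeline = Pipeline([
    ("counts", CountVectorizer()),                # tweets -> token count vectors
    ("tfidf", TfidfTransformer()),                # reweight counts by TF-IDF
    ("clf", LogisticRegression(max_iter=1000)),   # multi-class classifier
])

# Tiny made-up training set, only to show the fit/predict flow.
toy_tweets = [
    "love this game so much",
    "this update is terrible",
    "the patch releases on friday",
    "anyone know a good pizza place",
]
toy_labels = ["Positive", "Negative", "Neutral", "Irrelevant"]

pipeline.fit(toy_tweets, toy_labels)
print(pipeline.predict(["love this update"]))
```

Because the vectorizer and classifier live in one object, a single `predict` call on raw strings runs the whole chain, which is why the saved model can be used directly on text.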
Evaluation
Evaluated on a separate validation set of 999 tweets:
| Class | Precision | Recall | F1-score |
|---|---|---|---|
| Irrelevant | 0.88 | 0.85 | 0.87 |
| Negative | 0.87 | 0.94 | 0.91 |
| Neutral | 0.97 | 0.86 | 0.91 |
| Positive | 0.89 | 0.94 | 0.91 |

Overall accuracy: 0.90
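For reference, the columns above follow the standard per-class definitions: precision is TP / (TP + FP), recall is TP / (TP + FN), and F1 is their harmonic mean. The sketch below computes them for a small set of hypothetical labels; the data is made up and unrelated to the 999-tweet validation set. In practice, `sklearn.metrics.classification_report` produces the same kind of table.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical labels, for illustration only.
y_true = ["Positive", "Positive", "Negative", "Neutral", "Irrelevant", "Negative"]
y_pred = ["Positive", "Negative", "Negative", "Neutral", "Irrelevant", "Negative"]

for label in sorted(set(y_true)):
    prec, rec, f1 = per_class_metrics(y_true, y_pred, label)
    print(f"{label:<10} {prec:.2f} {rec:.2f} {f1:.2f}")

# Accuracy is the share of tweets whose predicted label matches the true one.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy   {accuracy:.2f}")
```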
Usage
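```python
import joblib

# Load the serialized scikit-learn pipeline (vectorizer + classifier)
model = joblib.load("sentiment_model_lr.pkl")

user_input = "This update is surprisingly good!"
prediction = model.predict([user_input])  # predict expects a list of strings
print(prediction[0])  # one of: Positive, Negative, Neutral, Irrelevant
```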
> Note: requires scikit-learn 1.6.1+ to avoid version mismatch warnings when loading the pickled model.
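Since the note above ties the saved model to a scikit-learn version, it can be worth checking the installed version before loading. This is a minimal sketch: the helper names `parse_version` and `meets_minimum` are hypothetical, and the parsing only handles dotted numeric versions such as `1.6.1`.

```python
import re
from importlib.metadata import PackageNotFoundError, version

def parse_version(text):
    """Turn "1.6.1" into (1, 6, 1); suffixes such as "rc1" are ignored."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    return tuple(numbers)

def meets_minimum(installed, minimum="1.6.1"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)

try:
    installed = version("scikit-learn")
    status = "OK" if meets_minimum(installed) else "older than 1.6.1"
    print(f"scikit-learn {installed}: {status}")
except PackageNotFoundError:
    print("scikit-learn is not installed")
```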
Dataset
Tweets were preprocessed with a `clean_text` routine and each was labeled with one of
the four sentiment categories. If you'd like to experiment or re-train, contact
the author or fork this repo.
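The `clean_text` routine itself is not included in this card. Purely as an illustration of what tweet cleaning often involves, here is a hypothetical stand-in. The function name `clean_text_sketch` and every step in it (lowercasing, removing URLs, mentions, and non-letter characters) are assumptions, not the author's implementation.

```python
import re

def clean_text_sketch(text: str) -> str:
    """Hypothetical tweet cleaner; NOT the routine used to train this model."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"@\w+", " ", text)           # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)       # keep letters and whitespace only
    return " ".join(text.split())               # collapse repeated whitespace

print(clean_text_sketch("@dev This UPDATE is great!!! https://example.com #win"))
# prints: this update is great win
```

If the model is re-trained, whatever cleaning is applied at training time should also be applied to text at inference time, otherwise the vectorizer sees tokens it never learned.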
Author
- Built by @arshvir
- Model version: 1.0
- License: MIT