BERTopic Model for CNN News Articles

This is a BERTopic model trained on CNN news articles. It uses the sentence transformer model "all-MiniLM-L6-v2" to embed the documents and UMAP for dimensionality reduction.

Usage

First, install the required packages:

pip install sentence-transformers umap-learn bertopic

Then, load the model and encode your documents:

```python
from sentence_transformers import SentenceTransformer
from umap import UMAP
from bertopic import BERTopic

# Load the sentence transformer model
sentence_model = SentenceTransformer("all-MiniLM-L6-v2")

# Fix the UMAP random state so results are reproducible when fitting a new model
umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric='cosine', random_state=42)

# Load the BERTopic model
my_model = BERTopic.load("from/path/model.bin")

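# Encode your documents (replace the placeholder list with your own texts)
documents = ["first article text", "second article text"]
document_embeddings = sentence_model.encode(documents)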

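# Predict the topic of a new sentence
sentence = "my sentence"
embeddings = sentence_model.encode([sentence])

# transform returns one topic ID per input document, plus probabilities
topics, _ = my_model.transform([sentence], embeddings)
```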
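The prediction step returns numeric topic IDs, which are hard to read on their own. Below is a minimal sketch of a helper that turns an ID into a short label. The names `describe_topic` and `OUTLIER_TOPIC` are hypothetical and not part of BERTopic, and the example words and scores are made up. It assumes the keyword list has the shape that `get_topic` returns in BERTopic, a list of (word, score) pairs, and that the ID -1 marks outliers.

```python
from typing import Optional, Sequence, Tuple

OUTLIER_TOPIC = -1  # BERTopic uses -1 for documents that fit no topic


def describe_topic(
    topic_id: int,
    topic_words: Optional[Sequence[Tuple[str, float]]],
    top_n: int = 5,
) -> str:
    """Build a short readable label from a topic ID and its (word, score) pairs."""
    if topic_id == OUTLIER_TOPIC:
        return "outlier (no topic assigned)"
    if not topic_words:
        # Nothing usable was returned for this ID (empty list, None, or False)
        return f"topic {topic_id} (no keywords available)"
    keywords = [word for word, _score in topic_words[:top_n]]
    return f"topic {topic_id}: " + ", ".join(keywords)


# Stand-in for the output of my_model.get_topic(3); words and scores are made up.
example_words = [("election", 0.08), ("vote", 0.06), ("senate", 0.05)]
print(describe_topic(3, example_words, top_n=2))
print(describe_topic(-1, example_words))
```

With the loaded model, the ID would be the first element of the topic list returned by `my_model.transform`, and the word list would come from `my_model.get_topic(topic_id)`.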

For more information on how to use the BERTopic model, see the [BERTopic documentation](https://maartengr.github.io/BERTopic/index.html).


Dataset used to train AyoubChLin/bertopic_cnn_news