---
datasets:
- bakhitovd/data_science_arxiv
metrics:
- rouge
license: cc0-1.0
pipeline_tag: summarization
---
# Fine-tuned Longformer for Summarization of Machine Learning Articles
## Model Details
- GitHub: https://github.com/Bakhitovd/led-base-7168-ml
- Model name: bakhitovd/led-base-7168-ml
- Model type: Longformer Encoder-Decoder (LED), fine-tuned from allenai/led-base-16384
- Model description: This Longformer model has been fine-tuned on a focused subset of the arXiv part of the scientific papers dataset, specifically targeting articles about Machine Learning. It aims to generate accurate and consistent summaries of machine learning research papers.
## Intended Use
This model is intended to be used for text summarization tasks, specifically for summarizing machine learning research papers.
## How to Use
```python
import torch
from transformers import LEDTokenizer, LEDForConditionalGeneration

tokenizer = LEDTokenizer.from_pretrained("bakhitovd/led-base-7168-ml")
model = LEDForConditionalGeneration.from_pretrained("bakhitovd/led-base-7168-ml")
```
## Use the Model for Summarization
```python
article = "... long document ..."

# Call the tokenizer directly to get both input_ids and attention_mask
# (tokenizer.encode returns only the input_ids tensor).
inputs_dict = tokenizer(article, padding="max_length", max_length=16384, return_tensors="pt", truncation=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
input_ids = inputs_dict.input_ids.to(device)
attention_mask = inputs_dict.attention_mask.to(device)

# LED expects a global attention mask; global attention on the first token
# is the standard setting for summarization.
global_attention_mask = torch.zeros_like(attention_mask)
global_attention_mask[:, 0] = 1

predicted_abstract_ids = model.generate(input_ids, attention_mask=attention_mask, global_attention_mask=global_attention_mask, max_length=512)
summary = tokenizer.decode(predicted_abstract_ids[0], skip_special_tokens=True)
print(summary)
```
## Training Data
Dataset name: bakhitovd/data_science_arxiv\
This dataset is a subset of the 'scientific papers' dataset containing the articles semantically and structurally closest to machine learning papers. The subset was selected by K-means clustering of SciBERT embeddings of the articles.
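The exact selection pipeline is not published in this card; the sketch below only illustrates the general approach under stated assumptions: SciBERT (`allenai/scibert_scivocab_uncased`) embeds each abstract (mean-pooling the last hidden states is one common choice), scikit-learn's `KMeans` clusters the embeddings, and the cluster closest to machine learning is kept. The number of clusters and the chosen cluster index are placeholders.

```python
# Illustrative sketch of the subset selection, not the exact pipeline.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
encoder = AutoModel.from_pretrained("allenai/scibert_scivocab_uncased")

def embed(texts):
    # Mean-pool the last hidden states into one vector per document.
    batch = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state  # (batch, seq, dim)
    mask = batch.attention_mask.unsqueeze(-1)        # ignore padding positions
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

abstracts = ["We propose a neural summarization model ...", "..."]  # arXiv abstracts
embeddings = embed(abstracts)

# Cluster the embeddings; the cluster count and chosen index are hypothetical.
kmeans = KMeans(n_clusters=10, random_state=0, n_init=10).fit(embeddings)
ml_cluster = 3  # hypothetical: the cluster nearest to machine-learning papers
subset = [a for a, label in zip(abstracts, kmeans.labels_) if label == ml_cluster]
```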
## Evaluation Results
The model was evaluated using ROUGE metrics and showed improved performance over the baseline models.
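The card does not reproduce the scores. As an illustration only (the `evaluate` library here is an assumed tool, not necessarily what was used), ROUGE can be computed as follows:

```python
# Minimal ROUGE evaluation sketch using the `evaluate` library (assumed tooling).
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the generated summary ..."]       # model outputs
references = ["the ground-truth abstract ..."]    # target abstracts
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # e.g. {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```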