Update README.md
---
datasets:
- bakhitovd/data_science_arxiv
metrics:
- rouge
---
# Fine-tuned Longformer for Summarization of Machine Learning Articles

## Model Details
- GitHub: https://github.com/Bakhitovd/MS_in_Data_Science_Capstone
- Model name: bakhitovd/led-base-16384-data-science
- Model type: Longformer (allenai/led-base-16384)
- Model description: This Longformer model has been fine-tuned on a focused subset of the arXiv part of the scientific papers dataset, specifically targeting articles about machine learning. It aims to generate accurate and consistent summaries of machine learning research papers.
## Intended Use
This model is intended for text summarization, specifically for summarizing machine learning research papers.
## How to Use
~~~
import torch
from transformers import LEDTokenizer, LEDForConditionalGeneration

tokenizer = LEDTokenizer.from_pretrained("bakhitovd/led-base-16384-data-science")
model = LEDForConditionalGeneration.from_pretrained("bakhitovd/led-base-16384-data-science")
~~~
## Use the model for summarization
~~~
article = "... long document ..."

# Tokenize the full article (LED accepts inputs of up to 16384 tokens)
inputs_dict = tokenizer(article, padding="max_length", max_length=16384, return_tensors="pt", truncation=True)

# This example assumes a CUDA device is available; move the model and inputs to it
model = model.to("cuda")
input_ids = inputs_dict.input_ids.to("cuda")
attention_mask = inputs_dict.attention_mask.to("cuda")

# Global attention on the first token is the usual choice for LED summarization
global_attention_mask = torch.zeros_like(attention_mask)
global_attention_mask[:, 0] = 1

predicted_abstract_ids = model.generate(input_ids, attention_mask=attention_mask, global_attention_mask=global_attention_mask, max_length=512)

# generate() returns a batch of sequences; decode the first one
summary = tokenizer.decode(predicted_abstract_ids[0], skip_special_tokens=True)
print(summary)
~~~
## Training Data
Dataset name: bakhitovd/data_science_arxiv
This dataset is a subset of the 'Scientific papers' dataset, containing the articles that are semantically and structurally closest to articles describing machine learning. The subset was obtained by applying K-means clustering to embeddings generated by SciBERT.
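
The subset can be loaded directly from the Hugging Face Hub with the `datasets` library. A minimal sketch, shown only to inspect the available splits and columns:

~~~
from datasets import load_dataset

# Load the machine-learning-focused arXiv subset from the Hugging Face Hub
dataset = load_dataset("bakhitovd/data_science_arxiv")

# Inspect the available splits and columns before use
print(dataset)
~~~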
## Evaluation Results
The model was evaluated using ROUGE metrics and showed improved performance over the baseline models.
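
ROUGE scores of this kind can be computed with the Hugging Face `evaluate` library. The snippet below is a minimal illustrative sketch with toy strings, not the exact evaluation setup used for this model:

~~~
import evaluate

# Compute ROUGE between generated summaries and reference abstracts
rouge = evaluate.load("rouge")

predictions = ["the model produces summaries of machine learning papers"]    # generated summaries
references = ["the model generates summaries of machine learning articles"]  # reference abstracts
print(rouge.compute(predictions=predictions, references=references))
~~~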
