Flan-T5 News Article Generator

This project fine-tunes a Flan-T5 model to generate Dhivehi-language news articles from their titles.

Evaluation

The model was evaluated using ROUGE metrics on a validation set. Here are the final evaluation results:

  • Loss: 0.154
  • ROUGE-1: 0.170
  • ROUGE-2: 0.135
  • ROUGE-L: 0.169
  • ROUGE-Lsum: 0.169
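
For reference, these scores can be reproduced with any standard ROUGE implementation; below is a minimal sketch using the Hugging Face evaluate library (a tooling assumption — the card does not state how the scores were computed), with placeholder prediction/reference lists:

import evaluate

# Load the ROUGE metric (requires: pip install evaluate rouge_score)
rouge = evaluate.load("rouge")

# Hypothetical placeholders: decoded model outputs and reference articles
predictions = ["generated article text ..."]
references = ["reference article text ..."]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}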

Training metrics logged at epoch 4.35:

  • Training loss: 0.857
  • Gradient norm: 0.202
  • Learning rate: 0.000287

The ROUGE scores indicate moderate overlap between the generated and reference texts. The validation loss (0.154) is well below the training loss logged above (0.857), which suggests the model is not overfitting; part of the gap is likely explained by dropout being active during training but disabled during evaluation.
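
For context, here is a minimal sketch of a Seq2SeqTrainer setup consistent with these logs. Everything other than the logged values above is an assumption — the base checkpoint, epoch count, batch size, and dataset preprocessing are not documented in this card:

from transformers import (
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    T5ForConditionalGeneration,
    T5Tokenizer,
)

# Assumed base checkpoint; the card only says "Flan-T5"
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-corpora-mixed",
    learning_rate=3e-4,             # assumed peak; the log shows ~2.87e-4 mid-schedule
    num_train_epochs=5,             # assumed; metrics above were logged at epoch 4.35
    per_device_train_batch_size=8,  # assumed
    evaluation_strategy="epoch",
    predict_with_generate=True,     # generate text at eval time so ROUGE can be computed
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,  # hypothetical tokenized training Dataset (input_ids/labels)
    eval_dataset=val_ds,     # hypothetical tokenized validation Dataset
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()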

Usage


from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load your finetuned model
model = T5ForConditionalGeneration.from_pretrained("alakxender/flan-t5-corpora-mixed")
tokenizer = T5Tokenizer.from_pretrained("alakxender/flan-t5-corpora-mixed")

def generate_text(prompt, max_new_tokens=150, num_beams=1, repetition_penalty=1.2, no_repeat_ngram_size=1, do_sample=True):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
        repetition_penalty=repetition_penalty,
        no_repeat_ngram_size=no_repeat_ngram_size,
        do_sample=do_sample,
        early_stopping=num_beams > 1  # early stopping only applies to beam search; avoids a warning when num_beams == 1
    )
    output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Trim to the last period
    if '.' in output_text:
        last_period = output_text.rfind('.')
        output_text = output_text[:last_period+1]
    return output_text

# Dhivehi news title (roughly: "Foreign cashiers still in some shops!")
prompt = "ބައެއް ފިހާރަތަކުގައި އަދިވެސް ބިދޭސީ ސޭޓުން!"
output = generate_text(f"Create an article about: {prompt}")
print(output)
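
With the defaults above, generation is sampled (num_beams=1, do_sample=True). Passing num_beams > 1 switches to beam search, which is more deterministic and is where the early_stopping flag takes effect:

# Beam search variant: deterministic decoding, early stopping enabled
output = generate_text(
    f"Create an article about: {prompt}",
    num_beams=4,
    do_sample=False,
)
print(output)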