News_analyzer_Qwen2.5_1.5B_Finetuning

This model is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct on the News dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2032

Model description

This fine-tuned version of Qwen2.5-1.5B-Instruct automatically extracts and summarizes key information from Arabic text inputs, such as news articles, producing structured JSON-like outputs.
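Since the model emits JSON-like text rather than guaranteed-valid JSON, downstream code typically has to locate and parse the object in the raw response. A minimal sketch of such a parser, assuming the output uses the field names implied by the pseudo-label schema described under "Training procedure" (the exact keys emitted by the model are not documented here, so treat `EXPECTED_FIELDS` as an assumption):

```python
import json
import re

# Assumed field names, inferred from the pseudo-label description
# (title, keywords, summary, category, entities) -- not confirmed by the card.
EXPECTED_FIELDS = {"News_title", "Keywords", "Summary", "Category", "Entities"}

def parse_output(raw: str) -> dict:
    """Extract the first {...} object from a model response and check its fields."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    record = json.loads(match.group(0))
    missing = EXPECTED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"output missing fields: {sorted(missing)}")
    return record
```

Failing loudly on missing fields makes it easy to spot malformed generations when evaluating the model on a new domain.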

Training and evaluation data

Fine-Tuning Dataset: 2001 samples of Arabic technology-related text, used to adapt the model for structured extraction tasks.

Evaluation Dataset: 100 samples of Arabic sports-related text, used to assess performance on a different domain.

Average similarity scores on the evaluation data

  • The similarity measure is applied to a single field of the output JSON, "News_title", and the evaluation data comes from a different domain than the one the model was fine-tuned on.
  • Mean similarity: 0.6872
  • Note: This average similarity is reasonable given that the model was fine-tuned on technology articles while the entire test set is sports-related; evaluating on data closer to the technology domain would likely yield higher scores.
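The card does not specify which similarity metric was used. A minimal sketch of one plausible choice, a character-level ratio from Python's standard `difflib` (the actual evaluation may well use an embedding-based cosine similarity instead):

```python
from difflib import SequenceMatcher

def title_similarity(predicted: str, reference: str) -> float:
    """Character-level similarity in [0, 1] between two titles."""
    return SequenceMatcher(None, predicted, reference).ratio()

def mean_title_similarity(pairs) -> float:
    """Average similarity over (predicted, reference) title pairs,
    mirroring the single-field ("News_title") evaluation described above."""
    scores = [title_similarity(p, r) for p, r in pairs]
    return sum(scores) / len(scores)
```

For Arabic text, an embedding-based metric would be more robust to paraphrases; the string-ratio version is shown only because it needs no extra dependencies.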

Training procedure

Since the dataset was not labeled, Llama 4 Scout was employed as a teacher model in a knowledge distillation framework to generate pseudo-labels or guide the training of Qwen2.5-1.5B-Instruct (the student model). Knowledge distillation transfers knowledge from a larger, more capable model (Llama 4 Scout) to a smaller, efficient model (Qwen2.5-1.5B-Instruct).

  • Role of Llama 4 Scout (teacher model): Llama 4 Scout, a powerful language model, processed the 2001 unlabeled technology samples and generated high-quality structured outputs (pseudo-labels). For each input text, it produced:
      ◦ Story title: a concise headline summarizing the main event.
      ◦ Keywords: relevant terms extracted based on contextual understanding.
      ◦ Summary: a set of key sentences or abstractive summary points.
      ◦ Category: a predicted category (e.g., "technology" for the training data).
      ◦ Entities: identified entities with types (e.g., person, organization), using its advanced NER capabilities.

  • Tool: LLaMA-Factory was used for streamlined fine-tuning, with support for LoRA (Low-Rank Adaptation).
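The teacher's pseudo-labels have to be serialized into a dataset file before fine-tuning. A minimal sketch, assuming LLaMA-Factory's Alpaca-style schema (`instruction` / `input` / `output` keys); the instruction wording and the label field names are illustrative assumptions, not the card's actual prompt:

```python
import json

def to_alpaca_record(article_text: str, teacher_label: dict) -> dict:
    """Wrap one teacher-labelled sample in an Alpaca-style record.

    The instruction text below is a hypothetical example, not the
    prompt actually used during fine-tuning.
    """
    return {
        "instruction": (
            "Extract the title, keywords, summary, category, and entities "
            "from the following Arabic news text as JSON."
        ),
        "input": article_text,
        # ensure_ascii=False keeps Arabic text readable in the dataset file
        "output": json.dumps(teacher_label, ensure_ascii=False),
    }

def write_dataset(samples, path: str) -> None:
    """samples: iterable of (article_text, teacher_label) pairs."""
    records = [to_alpaca_record(text, label) for text, label in samples]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
```

The resulting JSON file can then be registered in LLaMA-Factory's dataset configuration and referenced from the training run.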

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3.0
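The interaction of these hyperparameters can be sketched numerically: the effective batch size is the per-device batch size times the gradient-accumulation steps, and the learning rate follows a cosine decay after a linear warmup over the first 10% of steps. The schedule below mirrors the standard cosine-with-warmup shape (the total step count passed in is illustrative):

```python
import math

LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.1

# Effective (total) train batch size: per-device batch * accumulation steps.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

def lr_at_step(step: int, total_steps: int) -> float:
    """Cosine schedule with linear warmup over the first WARMUP_RATIO of steps."""
    warmup_steps = int(WARMUP_RATIO * total_steps)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * LEARNING_RATE * (1.0 + math.cos(math.pi * progress))
```

So the learning rate ramps from 0 to 1e-4 over the warmup window and then decays smoothly to 0 by the final step.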

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 0.2435        | 0.2061 | 100  | 0.2322          |
| 0.2341        | 0.4122 | 200  | 0.2187          |
| 0.2136        | 0.6182 | 300  | 0.2057          |
| 0.2021        | 0.8243 | 400  | 0.1994          |
| 0.1384        | 1.0309 | 500  | 0.1992          |
| 0.1487        | 1.2370 | 600  | 0.1972          |
| 0.1437        | 1.4431 | 700  | 0.1935          |
| 0.1371        | 1.6491 | 800  | 0.1927          |
| 0.1470        | 1.8552 | 900  | 0.1883          |
| 0.0668        | 2.0618 | 1000 | 0.1961          |
| 0.0770        | 2.2679 | 1100 | 0.2072          |
| 0.0707        | 2.4740 | 1200 | 0.2032          |
| 0.0590        | 2.6801 | 1300 | 0.2037          |
| 0.0657        | 2.8861 | 1400 | 0.2032          |

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
Model tree for Alawy21/News_analyzer_Qwen2.5_1.5B_Finetuning

Base model: Qwen/Qwen2.5-1.5B

This model is a LoRA adapter on the base model.