Built with Axolotl

Qwen2.5-0.5B Fine-tuned on Synthia v1.5-I

This model is a fine-tuned version of Qwen/Qwen2.5-0.5B, trained on the Synthia v1.5-I dataset of about 20.7k instruction-following examples.

Model Description

Qwen2.5-0.5B belongs to the Qwen2.5 series of large language models. Compared with Qwen2, the series brings improvements in:

  • Instruction following and long-text generation
  • Understanding structured data and generating structured outputs
  • Multilingual support, covering over 29 languages
  • Long-context handling, up to 32,768 tokens for this model

This fine-tuned version enhances the base model's instruction-following capabilities through training on the Synthia v1.5-I dataset.

Model Architecture

  • Type: Causal Language Model
  • Parameters: 0.49B (0.36B non-embedding)
  • Layers: 24
  • Attention Heads: 14 for Q and 2 for KV (GQA)
  • Context Length: 32,768 tokens
  • Training Framework: Transformers 4.45.0.dev0
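
To make the grouped-query attention (GQA) figures above concrete, the following sketch estimates the key/value cache size for one sequence at full context. The layer count, head counts, and context length come from the list above; the head dimension of 64 and the 2-byte cache entries are assumptions not stated in this card, so treat the result as a rough estimate rather than a measured number.

```python
# Figures taken from the architecture list above.
NUM_LAYERS = 24
NUM_Q_HEADS = 14
NUM_KV_HEADS = 2
MAX_CONTEXT = 32_768

# Assumptions (not stated in this card): a head dimension of 64 and
# 2-byte (bf16/fp16) cache entries.
HEAD_DIM = 64
BYTES_PER_VALUE = 2


def kv_cache_bytes(num_tokens, num_kv_heads=NUM_KV_HEADS):
    """Bytes needed to cache keys and values for one sequence."""
    # The leading factor 2 covers the separate key and value tensors.
    per_token = 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


# With GQA, several query heads share a single key/value head.
queries_per_kv_head = NUM_Q_HEADS // NUM_KV_HEADS

gqa_mib = kv_cache_bytes(MAX_CONTEXT) / 2**20
mha_mib = kv_cache_bytes(MAX_CONTEXT, num_kv_heads=NUM_Q_HEADS) / 2**20

print(f"query heads per KV head: {queries_per_kv_head}")
print(f"KV cache at full context with GQA: {gqa_mib:.0f} MiB")
print(f"KV cache if every query head had its own KV head: {mha_mib:.0f} MiB")
```

Under these assumptions, sharing 2 KV heads across 14 query heads shrinks the cache by a factor of 7 compared with giving each query head its own KV head.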

Intended Uses & Limitations

This model is intended for:

  • Instruction following and task completion
  • Text generation and completion
  • Conversational AI applications

The model inherits the multilingual capabilities and long context support of the base Qwen2.5-0.5B model, while being specifically tuned for instruction following.
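
A minimal sketch of how the model might be loaded for text generation with the Transformers library. The repository id is the one shown on this model's page. The prompt format used during fine-tuning is not documented in this card, so the sketch passes the instruction as plain text; that choice, the helper name `generate_reply`, and the `max_new_tokens` default are illustrative assumptions.

```python
# Repository id as shown on the model page for this card.
MODEL_ID = "artificialguybr/QWEN-2.5-0.5B-Synthia-I"


def generate_reply(instruction, model_id=MODEL_ID, max_new_tokens=256):
    """Generate a continuation for a plain-text instruction.

    Assumption: the prompt is passed as raw text, because the template
    used during fine-tuning is not documented in this card.
    """
    # Imported lazily so this file can be loaded without transformers/torch.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(instruction, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply("Explain gradient accumulation in two sentences.")` downloads the weights on first use and returns the generated text. If a specific prompt template turns out to matter for output quality, it would replace the raw instruction string here.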

Training Procedure

Training Data

The model was fine-tuned on the Synthia v1.5-I dataset containing 20.7k instruction-following examples.

Training Hyperparameters

The following hyperparameters were used during training:

  • Learning rate: 1e-05
  • Train batch size: 5
  • Eval batch size: 5
  • Seed: 42
  • Gradient accumulation steps: 8
  • Total train batch size: 40
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • LR scheduler type: cosine
  • LR scheduler warmup steps: 100
  • Number of epochs: 3
  • Sequence length: 4096
  • Sample packing: enabled
  • Pad to sequence length: enabled
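
The relationship between these values can be sketched in a few lines: the total train batch size is the per-device batch size times the gradient accumulation steps, and the learning rate follows a linear warmup and then a cosine decay. The single-device setting is inferred from 5 × 8 = 40, and `TOTAL_STEPS` is a made-up value, since the card does not report the number of optimizer steps. The schedule mirrors the usual warmup-plus-cosine formula and is not taken from the training code itself.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-5
MICRO_BATCH_SIZE = 5
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 100
SEQUENCE_LENGTH = 4096

# Assumptions: a single device (5 * 8 already equals the reported 40)
# and a hypothetical total step count, which the card does not report.
NUM_DEVICES = 1
TOTAL_STEPS = 1000  # hypothetical

# Sequences consumed per optimizer step.
effective_batch = MICRO_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# With packing and padding to the sequence length, every sequence
# occupies 4096 token positions.
tokens_per_step = effective_batch * SEQUENCE_LENGTH


def lr_at(step, total_steps=TOTAL_STEPS):
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print(f"effective batch size: {effective_batch}")
print(f"token positions per optimizer step: {tokens_per_step}")
for step in (0, 50, 100, 550, 1000):
    print(f"step {step:4d}: lr = {lr_at(step):.2e}")
```

The learning rate peaks at 1e-05 once warmup ends at step 100 and decays toward zero by the final step; only the position of that final step depends on the assumed `TOTAL_STEPS`.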

Framework Versions

  • Transformers 4.45.0.dev0
  • PyTorch 2.3.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
  • Axolotl 0.4.1


Model repository: artificialguybr/QWEN-2.5-0.5B-Synthia-I (fine-tuned from the base model Qwen/Qwen2.5-0.5B)