---
base_model: unsloth/qwen3-14b-unsloth
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - qwen3
  - trl
  - reasoning
  - math
  - code-generation
license: apache-2.0
language:
  - en
datasets:
  - open-thoughts/OpenThoughts2-1M
library_name: transformers
---


# Qwen3-14B-Griffon

- **Developed by:** Daemontatox
- **License:** Apache-2.0
- **Finetuned from:** unsloth/qwen3-14b-unsloth

## Model Overview

This is a fine-tuned version of the Qwen3-14B model trained on the high-quality OpenThoughts2-1M dataset. Fine-tuned with Unsloth's TRL-compatible training stack and LoRA adapters for memory-efficient training, the model is optimized for advanced reasoning tasks, especially math, logic puzzles, code generation, and step-by-step problem solving.

## Training Dataset

- **Dataset:** OpenThoughts2-1M
- **Source:** A synthetic dataset curated and expanded by the OpenThoughts team
- **Volume:** ~1.1M high-quality examples
- **Content type:** Multi-turn reasoning, math proofs, algorithmic code generation, logical deduction, and structured conversations
- **Tools used:** Curator Viewer

This dataset builds upon OpenThoughts-114k and integrates strong reasoning-centric data sources like OpenR1-Math and KodCode.
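
To inspect the data locally, the dataset can be streamed from the Hugging Face Hub. A minimal sketch using the `datasets` library (the field names printed will depend on the dataset's actual schema):

```python
from datasets import load_dataset

# Stream the dataset so the ~1.1M examples are not downloaded up front.
ds = load_dataset("open-thoughts/OpenThoughts2-1M", split="train", streaming=True)

# Peek at a single record to see the conversation/reasoning fields.
example = next(iter(ds))
print(example.keys())
```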

## Intended Use

This model is particularly suited for:

- Chain-of-thought and step-by-step reasoning (see the prompt sketch after this list)
- Code generation with logical structure
- Educational tools for math and programming
- AI agents requiring multi-turn problem solving
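
For example, a step-by-step reasoning prompt can be passed through the same `text-generation` pipeline shown in Example Usage below; the prompt and generation length here are purely illustrative:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="Daemontatox/Qwen3_14B_Griffon")

# Ask explicitly for intermediate steps before the final answer.
messages = [
    {"role": "user", "content": "Solve step by step: a train covers 180 km in 2.5 hours. What is its average speed in km/h?"},
]
print(pipe(messages, max_new_tokens=512)[0]["generated_text"])
```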

## Limitations

- English-only focus; it does not generalize well to other languages
- May hallucinate factual content despite its reasoning depth
- Inherits possible biases from the synthetic fine-tuning data

## Example Usage

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="Daemontatox/Qwen3_14B_Griffon")
pipe(messages)
```
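
For finer control over decoding (and over Qwen3's optional thinking mode), the model can also be loaded directly. A sketch, assuming the fine-tune preserves the base Qwen3 chat template and its `enable_thinking` switch:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Daemontatox/Qwen3_14B_Griffon"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Prove that the sum of two even integers is even."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=True,  # Qwen3 template flag: emit a reasoning block before the answer
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```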

## Training Details

- **Framework:** TRL + LoRA with Unsloth acceleration
- **Epochs/steps:** Custom fine-tuning on ~1M samples
- **Hardware:** Single-node A100 80GB (or a comparable high-VRAM setup)
- **Objective:** Enhance multi-domain reasoning under compute-efficient constraints
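
A minimal sketch of this kind of setup, assuming Unsloth's `FastLanguageModel` API together with TRL's `SFTTrainer`; the LoRA ranks, batch sizes, and data handling shown are illustrative, not the exact values used for this checkpoint:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

# Load the base model with Unsloth and attach LoRA adapters (illustrative hyperparameters).
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/qwen3-14b-unsloth", max_seq_length=4096, load_in_4bit=True
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)

# Conversion of the raw conversations into chat-formatted text is omitted for brevity.
dataset = load_dataset("open-thoughts/OpenThoughts2-1M", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=1,
        output_dir="qwen3-14b-griffon",
    ),
)
trainer.train()
```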