Uploaded Model

Overview

This model is a Llama 3.2 3B Instruct variant that has been specifically fine-tuned using reasoning data from Claude 3.7 Sonnet. The goal was to leverage Claude's renowned reasoning capabilities within a more accessible, open-source architecture like Llama.

Technical Details

  • Developed by: reedmayhew
  • Base Model: unsloth/llama-3.2-3b-instruct
  • Finetuning Method: Supervised Fine-Tuning (SFT) using LoRA
  • Training Speed Enhancement: Trained 2x faster with Unsloth and Huggingface's TRL library

Training Data

The model was fine-tuned on a dataset derived from:

  • reedmayhew/claude-3.7-sonnet-reasoning

This allows the model to potentially exhibit improved logical thinking, problem-solving abilities, and complex reasoning compared to the base Llama 3.2 model while remaining open-source.

Usage Notes

While this model inherits some of Claude's strenghs in reasoning, it is still a derivative work built on Llama architecture. Users should evaluate its performance carefully for their specific use cases.

This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month
1,204
GGUF
Model size
3.21B params
Architecture
llama
Hardware compatibility
Log In to view the estimation

4-bit

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Dataset used to train reedmayhew/Llama-3.2-3B-claude-3.7-sonnet-reasoning-distilled