Uploaded Model

Overview

This model is a Llama 3.1 8B Instruct variant that has been specifically fine-tuned using reasoning data from Claude 3.7 Sonnet. The goal was to leverage Claude's renowned reasoning capabilities within a more accessible, open-source architecture like Llama.

Technical Details

Developed by: reedmayhew
Base Model: unsloth/llama-3.1-8b-instruct-bnb-4bit
Finetuning Method: Supervised Fine-Tuning (SFT) using LoRA
Training Speed Enhancement: Trained 2x faster with Unsloth and Huggingface's TRL library

Training Data

The model was fine-tuned on a dataset derived from:

reedmayhew/claude-3.7-sonnet-reasoning

This allows the model to potentially exhibit improved logical thinking, problem-solving abilities, and complex reasoning compared to the base Llama 3.1 model while remaining open-source.

Usage Notes

While this model inherits some of Claude's strengths in reasoning, it is still a derivative work built on Llama architecture. Users should evaluate its performance carefully for their specific use cases.

This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.

reedmayhew
/

Llama-3.1-8B-claude-3.7-sonnet-reasoning-distilled

Uploaded Model

Overview

Technical Details

Training Data

Usage Notes

Dataset used to train reedmayhew/Llama-3.1-8B-claude-3.7-sonnet-reasoning-distilled