GPT-2 (From Scratch in PyTorch) - Fine-Tuned Version

This model is a GPT-2 implementation written entirely from scratch in PyTorch (the architecture itself does not use Hugging Face Transformers) and fine-tuned on a custom dataset using Supervised Fine-Tuning (SFT).
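
As a rough sketch of what "from scratch" means here, the block below shows a single GPT-2-style decoder layer in plain PyTorch. All names (TransformerBlock, emb_dim, n_heads, drop_rate) are illustrative assumptions, not necessarily the identifiers used in this repository.

```python
# Minimal sketch of one GPT-2-style block; names and details are assumptions,
# not the repository's actual code.
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """Pre-LayerNorm causal self-attention + MLP, each with a residual connection."""
    def __init__(self, emb_dim: int, n_heads: int, drop_rate: float = 0.1):
        super().__init__()
        self.ln1 = nn.LayerNorm(emb_dim)
        self.attn = nn.MultiheadAttention(emb_dim, n_heads, dropout=drop_rate, batch_first=True)
        self.ln2 = nn.LayerNorm(emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(emb_dim, 4 * emb_dim),
            nn.GELU(),
            nn.Linear(4 * emb_dim, emb_dim),
            nn.Dropout(drop_rate),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: each position may only attend to itself and earlier positions.
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out
        x = x + self.mlp(self.ln2(x))
        return x

# Example: one block over a batch of 2 sequences of length 8 at gpt2-small width.
y = TransformerBlock(emb_dim=768, n_heads=12)(torch.randn(2, 8, 768))
```

The full model stacks several such blocks between token/positional embeddings and a final LayerNorm plus linear head over the vocabulary, as in the original GPT-2.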

Model Details

  • Architecture: GPT-2 (implemented from scratch)
  • Variants Supported: gpt2-small, gpt2-medium, gpt2-large, gpt2-xl (see the configuration sketch after this list)
  • Framework: PyTorch
  • Pretraining Source: Pretrained GPT-2 weights loaded from OpenAI's released checkpoint format
  • Fine-Tuning Method: Supervised Fine-Tuning (SFT)
  • Fine-Tuning Data: Custom dataset (domain-specific; see the dataset section)
  • Tokenization: GPT-2-style BPE tokenizer (example after this list)
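
For reference, the four supported variants follow the standard published GPT-2 size ladder. The dictionary below is a sketch of those configurations; the field names are assumptions and may differ from the repository's own config objects.

```python
# Standard GPT-2 sizes; field names here are illustrative assumptions.
GPT2_CONFIGS = {
    "gpt2-small":  {"emb_dim": 768,  "n_layers": 12, "n_heads": 12},  # ~124M parameters
    "gpt2-medium": {"emb_dim": 1024, "n_layers": 24, "n_heads": 16},  # ~355M parameters
    "gpt2-large":  {"emb_dim": 1280, "n_layers": 36, "n_heads": 20},  # ~774M parameters
    "gpt2-xl":     {"emb_dim": 1600, "n_layers": 48, "n_heads": 25},  # ~1.5B parameters
}

# Settings shared by all variants (standard GPT-2 values).
COMMON = {"vocab_size": 50257, "context_length": 1024}
```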

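Because the tokenization is GPT-2-style BPE, a standard GPT-2 encoder such as tiktoken should reproduce the token ids; using tiktoken here is an assumption for illustration, not a confirmed dependency of this repository.

```python
import tiktoken

tok = tiktoken.get_encoding("gpt2")        # GPT-2 BPE, vocabulary size 50257
ids = tok.encode("Hello, world!")          # text -> list of token ids
text = tok.decode(ids)                     # token ids -> original text
```
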
Repo
