# GPT-2 (From Scratch in PyTorch) – Fine-Tuned Version
This model is a custom GPT-2 implementation built entirely from scratch in PyTorch (no Hugging Face Transformers for the architecture itself) and fine-tuned on a custom dataset using Supervised Fine-Tuning (SFT).
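To make the "from scratch" claim concrete, here is a minimal sketch of what a GPT-2-style decoder in plain PyTorch typically looks like. The class and parameter names (`CausalSelfAttention`, `Block`, `GPT2`) are illustrative assumptions, not this repository's actual code; the defaults match the gpt2-medium configuration (24 layers, 16 heads, 1024-dim embeddings).

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd, n_head, block_size, dropout=0.1):
        super().__init__()
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)   # combined Q, K, V projection
        self.proj = nn.Linear(n_embd, n_embd)      # output projection
        self.dropout = nn.Dropout(dropout)
        # causal mask: each position may only attend to itself and earlier positions
        mask = torch.tril(torch.ones(block_size, block_size)).view(1, 1, block_size, block_size)
        self.register_buffer("mask", mask)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim) for multi-head attention
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))
        att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        y = F.softmax(att, dim=-1) @ v
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.dropout(self.proj(y))

class Block(nn.Module):
    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head, block_size)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))  # pre-norm residual attention
        x = x + self.mlp(self.ln2(x))   # pre-norm residual MLP
        return x

class GPT2(nn.Module):
    def __init__(self, vocab_size=50257, block_size=1024, n_layer=24, n_head=16, n_embd=1024):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)
        self.blocks = nn.ModuleList([Block(n_embd, n_head, block_size) for _ in range(n_layer)])
        self.ln_f = nn.LayerNorm(n_embd)
        self.lm_head = nn.Linear(n_embd, vocab_size, bias=False)

    def forward(self, idx):
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        for block in self.blocks:
            x = block(x)
        return self.lm_head(self.ln_f(x))  # (B, T, vocab_size) logits
```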
## Model Details
- Architecture: GPT-2 (from scratch)
- Variants Supported: gpt2-small, gpt2-medium, gpt2-large, gpt2-xl (see the configuration sketch after this list)
- Framework: PyTorch
- Pretraining Source: GPT-2 pretrained weights loaded from the original OpenAI checkpoint format
- Fine-Tuning Method: Supervised Fine-Tuning (SFT); a minimal training-step sketch follows this list
- Fine-Tuning Data: Custom dataset (domain-specific; see dataset section)
- Tokenization: GPT-2-style byte-pair encoding (BPE) tokenizer (50,257-token vocabulary)
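The supported variants map onto the standard OpenAI GPT-2 sizes. Below is a hedged configuration and tokenizer example, assuming the `GPT2` class from the sketch above and `tiktoken` for the GPT-2 BPE; the card itself does not name a specific tokenizer library.

```python
import torch
import tiktoken  # assumption: any GPT-2-compatible BPE tokenizer would work here

# Standard OpenAI GPT-2 sizes; this repo's checkpoint corresponds to gpt2-medium (~355M parameters).
GPT2_CONFIGS = {
    "gpt2-small":  dict(n_layer=12, n_head=12, n_embd=768),
    "gpt2-medium": dict(n_layer=24, n_head=16, n_embd=1024),
    "gpt2-large":  dict(n_layer=36, n_head=20, n_embd=1280),
    "gpt2-xl":     dict(n_layer=48, n_head=25, n_embd=1600),
}

enc = tiktoken.get_encoding("gpt2")   # 50,257-token GPT-2 BPE vocabulary
model = GPT2(vocab_size=enc.n_vocab, block_size=1024,
             **GPT2_CONFIGS["gpt2-medium"])        # GPT2 class from the sketch above

ids = enc.encode("Hello, world!")
logits = model(torch.tensor([ids]))                # (1, seq_len, vocab_size)
print(logits.shape, enc.decode(ids))
```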
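And a minimal sketch of one supervised fine-tuning step using standard next-token cross-entropy with prompt masking. The `sft_step` helper, the learning rate, and the `-100` masking convention are illustrative assumptions, not documented settings of this checkpoint.

```python
import torch
import torch.nn.functional as F

# `model` is the GPT2 instance built above; `input_ids` / `labels` are hypothetical
# (B, T) integer tensors from the custom SFT dataset, with prompt positions in
# `labels` set to -100 so only response tokens contribute to the loss.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5, weight_decay=0.1)

def sft_step(input_ids, labels):
    logits = model(input_ids)                      # (B, T, vocab_size)
    # Next-token prediction: position t predicts token t+1.
    loss = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        labels[:, 1:].reshape(-1),
        ignore_index=-100,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```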
## Repo

Model tree for Himanshu13x/gpt2-medium355M-sft:

- Base model: openai-community/gpt2-medium