fhswf/BPE_GPT2_TinyStoriesV2_cleaned_1024

text generation


BPE_GPT2_TinyStoriesV2_cleaned

BPE Tokenizer Model for dataset 'fhswf/TinyStoriesV2_cleaned'

Based on the GPT-Neo BPE tokenizer, but with a smaller vocabulary. Trained on TinyStoriesV2.

Vocab size: 1024, made up of:
256 base characters
1 extra token: <|endoftext|>
767 merges
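
The breakdown above follows the usual BPE accounting: every learned merge adds exactly one new token to the base alphabet and the special tokens, so 256 + 1 + 767 = 1024. The following is a minimal sketch of that accounting in pure Python, not the training script used for this tokenizer. The function names `expected_vocab_size` and `train_toy_bpe`, the whitespace pre-splitting, the id ordering, and the tiny corpus are all assumptions made for illustration.

```python
from collections import Counter


def expected_vocab_size(num_merges, num_special=1, num_base=256):
    """Vocabulary size of a BPE tokenizer: base symbols + specials + merges."""
    return num_base + num_special + num_merges


def train_toy_bpe(text, num_merges, special_tokens=("<|endoftext|>",)):
    """Toy byte-level BPE trainer (illustrative only). Returns (vocab, merges)."""
    # Base vocabulary: one entry per possible byte value (256 entries).
    vocab = [bytes([b]) for b in range(256)]
    # Special tokens are appended next; this ordering is an assumption.
    vocab.extend(tok.encode("utf-8") for tok in special_tokens)

    # Represent each whitespace-separated word as a tuple of single-byte symbols.
    words = Counter(
        tuple(bytes([b]) for b in word.encode("utf-8")) for word in text.split()
    )

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # nothing left to merge

        # The most frequent pair becomes one new token.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab.append(best[0] + best[1])

        # Rewrite every word with the chosen pair merged.
        new_words = Counter()
        for symbols, freq in words.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_words[tuple(merged)] += freq
        words = new_words

    return vocab, merges


# Hypothetical miniature corpus, standing in for the real training data.
corpus = (
    "once upon a time there was a little girl . "
    "once upon a time there was a little dog ."
)
toy_vocab, toy_merges = train_toy_bpe(corpus, num_merges=10)

# Each merge contributed exactly one token on top of 256 bytes + 1 special.
print(len(toy_vocab), expected_vocab_size(len(toy_merges)))
# The same arithmetic with the numbers from this model card.
print(expected_vocab_size(767))  # 1024
```

With only 767 merges, most words are split into several short pieces, which keeps the embedding table small at the cost of longer token sequences.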


Dataset used to train fhswf/BPE_GPT2_TinyStoriesV2_cleaned_1024: fhswf/TinyStoriesV2_cleaned