wassemgtk posted an update 4 days ago
For fun, a new project: SuperTokenizer! It's a byte-level BPE tokenizer trained on C4, with the goal of beating GPT-4's tokenizer. A100-powered and open-source. Just messing around with tokens!
https://github.com/wassemgtk/SuperTokenizer

Sounds interesting, I’ll check it out!