---
license: other
language:
- en
tags:
- facebook
- meta-pytorch
- blt
---
# Byte Latent Transformer (BLT)
This repository contains the model weights for the Meta paper "Byte Latent Transformer: Patches Scale Better Than Tokens", released under the same non-commercial, research-only license.
All credit goes to the original authors at Meta.
- [Paper Link](https://dl.fbaipublicfiles.com/blt/BLT__Patches_Scale_Better_Than_Tokens.pdf)
- [HF Paper Link](https://huggingface.co/papers/2412.09871)
## Abstract
We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that,
for the first time, matches tokenization-based LLM performance at scale, with significant improvements
in inference efficiency and robustness. BLT encodes bytes into dynamically sized patches, which serve
as the primary units of computation. Patches are segmented dynamically based on the entropy of the
next byte, allocating more compute and model capacity where there is more data complexity. The BLT
architecture includes new attention mechanisms to maximize the information flow between byte and
patch hidden representations and a new type of byte-sequence memory. We present the first scaling
study of byte-level models up to 8B parameters and 8T training bytes, showing for the first time
that we can train a model end-to-end at scale from bytes with no tokenization or other preprocessing.
Scaling trends reveal training and inference efficiency benefits from dynamically selecting very long
patches on average, along with qualitative improvements in reasoning and long-tail generalization
from modeling byte sequences.
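
To make the patching idea concrete, here is a minimal sketch of the global-threshold segmentation scheme described in the abstract, assuming per-byte entropy estimates from a small byte-level LM are already available; the entropies and threshold below are made up for illustration and are not the paper's values:

```python
def segment_patches(data: bytes, entropies: list[float], threshold: float) -> list[bytes]:
    """Split `data` into patches, starting a new patch whenever the
    entropy estimate for the next byte exceeds `threshold` (the
    global-threshold scheme). `entropies[i]` is assumed to come from
    a small byte-level LM predicting byte i given data[:i]."""
    patches, start = [], 0
    for i in range(1, len(data)):
        if entropies[i] > threshold:  # high uncertainty -> patch boundary
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

# Toy usage: boundaries land where the (fabricated) entropy spikes.
text = b"hello world"
fake_entropies = [3.0, 0.4, 0.3, 0.2, 0.2, 2.9, 0.5, 0.4, 0.3, 0.2, 0.2]
print(segment_patches(text, fake_entropies, threshold=2.0))
# [b'hello', b' world']
```

Long, predictable runs of bytes thus collapse into a single patch, which is where the inference-efficiency gains come from.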
To run the model, see the README in the official repository: https://github.com/facebookresearch/blt
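
As a small convenience before following those instructions, the checkpoints can be fetched from the Hub with `huggingface_hub` (a generic Hub API, not part of the BLT codebase):

```python
from huggingface_hub import snapshot_download

# Download the BLT-1B checkpoint into the local HF cache and print its path.
local_dir = snapshot_download(repo_id="facebook/blt-1b")
print(local_dir)
```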
## Links
- Code: https://github.com/facebookresearch/blt
- BLT 1B Weights: https://huggingface.co/facebook/blt-1b
- BLT 7B Weights: https://huggingface.co/facebook/blt-7b
- BLT Weight Collection: https://huggingface.co/collections/facebook/blt-6801263d4ac1704702a192a6