# Use 26 layers, for comparison against tall recurrent transformers.
NUM_LAYERS = 26