google/electra-small-generator (Fill-Mask)
This collection gathers the ELECTRA models released by the Google team.
Note: Smallest generator model. 12 layers, 1024 intermediate size, 256 hidden size, 4 attention heads.
Note: Smallest discriminator model. 12 layers, 1024 intermediate size, 256 hidden size, 4 attention heads.
Note: Base generator model. 12 layers, 1024 intermediate size, 256 hidden size, 4 attention heads.
Note: Base discriminator model. 12 layers, 3072 intermediate size, 768 hidden size, 12 attention heads.
Note: Largest generator model. 24 layers, 1024 intermediate size, 1024 hidden size, 4 attention heads.
Note: Largest discriminator model. 24 layers, 4096 intermediate size, 1024 hidden size, 16 attention heads.
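To compare the sizes listed in the notes, the following minimal sketch records them in a small table and derives two quantities: the per-head dimension (hidden size divided by attention heads) and a rough encoder parameter count. It assumes a standard BERT-style encoder layer (four attention projections, a two-layer feed-forward block, two layer norms) and ignores embeddings and task heads, so the totals are lower than the full model sizes. The names `ElectraSize`, `SIZES`, `head_dim`, and `encoder_params` are hypothetical and are not part of any library; the numbers are copied from the notes as written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElectraSize:
    """Sizes as listed in the collection notes (hypothetical helper type)."""
    layers: int
    intermediate: int
    hidden: int
    heads: int

    @property
    def head_dim(self) -> int:
        # Each attention head works on hidden / heads dimensions.
        if self.hidden % self.heads:
            raise ValueError("hidden size must be divisible by the head count")
        return self.hidden // self.heads

    def encoder_params(self) -> int:
        # Assumption: standard BERT-style layer; embeddings and heads excluded.
        h, i = self.hidden, self.intermediate
        attention = 4 * (h * h + h)               # Q, K, V, output projections + biases
        feed_forward = (h * i + i) + (i * h + h)  # up and down projections + biases
        layer_norms = 2 * (2 * h)                 # two layer norms, scale + shift each
        return self.layers * (attention + feed_forward + layer_norms)


# Values copied from the notes above, in (layers, intermediate, hidden, heads) order.
SIZES = {
    "small-generator": ElectraSize(12, 1024, 256, 4),
    "small-discriminator": ElectraSize(12, 1024, 256, 4),
    "base-generator": ElectraSize(12, 1024, 256, 4),
    "base-discriminator": ElectraSize(12, 3072, 768, 12),
    "large-generator": ElectraSize(24, 1024, 1024, 4),
    "large-discriminator": ElectraSize(24, 4096, 1024, 16),
}

for name, size in SIZES.items():
    print(f"{name}: head_dim={size.head_dim}, encoder_params={size.encoder_params():,}")
```

The authoritative values live in each checkpoint's configuration file on the Hub, so a listing like this is best treated as a quick cross-check of the notes rather than a replacement for the published configs.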