bert-uncased-L8-H128-A2

This is one of 24 smaller BERT models (English only, uncased, trained with WordPiece masking) released by google-research/bert.

These BERT models were released as TensorFlow checkpoints; this is the version converted to PyTorch. More information can be found in google-research/bert or lyeoni/convert-tf-to-pytorch.
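The model name encodes its size using the google-research/bert convention: L is the number of transformer layers, H is the hidden size, and A is the number of attention heads. As a minimal sketch of how to read those values from a name, the helper below (`parse_bert_size` and `BertSize` are hypothetical names introduced for illustration, not part of any library) uses only the Python standard library:

```python
import re
from typing import NamedTuple


class BertSize(NamedTuple):
    """Sizes encoded in a compact BERT model name (illustrative type)."""
    layers: int           # L: number of transformer layers
    hidden_size: int      # H: hidden (embedding) size
    attention_heads: int  # A: number of attention heads


def parse_bert_size(name: str) -> BertSize:
    """Extract the L/H/A sizes from a name such as 'bert-uncased-L8-H128-A2'.

    Also accepts the spelling with inner hyphens and underscores,
    e.g. 'uncased_L-8_H-128_A-2'.
    """
    match = re.search(r"L-?(\d+)[-_]H-?(\d+)[-_]A-?(\d+)", name)
    if match is None:
        raise ValueError(f"no L/H/A size suffix found in {name!r}")
    layers, hidden, heads = (int(group) for group in match.groups())
    return BertSize(layers, hidden, heads)


size = parse_bert_size("bert-uncased-L8-H128-A2")
print(size.layers, size.hidden_size, size.attention_heads)
# Per-head dimension is the hidden size divided by the number of heads.
print(size.hidden_size // size.attention_heads)
```

For this model the helper yields 8 layers, a hidden size of 128, and 2 attention heads, which gives 64 dimensions per head.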

Evaluation

Here are the evaluation scores (F1/Accuracy) for the MRPC task.

Model	MRPC
BERT-Tiny	81.22/68.38
BERT-Mini	81.43/69.36
BERT-Small	81.41/70.34
BERT-Medium	83.33/73.53
BERT-Base	85.62/78.19

References

@article{turc2019,
  title={Well-Read Students Learn Better: On the Importance of Pre-training Compact Models},
  author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1908.08962v2},
  year={2019}
}