Hierarchical BERT
A set of BERT models with hierarchical attention, pre-trained on conversational data to process multiple utterances at once.
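Since the checkpoints in this collection are ordinary Hub repositories, loading one should follow the usual transformers pattern. Below is a minimal usage sketch; the repository id is a placeholder, and joining utterances with the tokenizer's SEP token plus `trust_remote_code=True` are assumptions (hierarchical-attention models often ship custom modeling code in the repo):

```python
# Minimal usage sketch; "your-org/hierbert-base" is a hypothetical id,
# not the actual checkpoint name from this collection.
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "your-org/hierbert-base"  # hypothetical repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Custom hierarchical-attention code, if any, lives in the repo itself.
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)

# Join a short dialogue so each utterance is a separate segment;
# the hierarchical attention can then treat SEP-delimited spans as units.
utterances = ["Hi, how are you?", "Fine, thanks.", "See you [MASK]."]
inputs = tokenizer(tokenizer.sep_token.join(utterances), return_tensors="pt")

logits = model(**inputs).logits  # (1, seq_len, vocab_size)
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
print(tokenizer.decode([logits[0, mask_pos].argmax().item()]))
```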
This model is a fine-tuned version of HierBert on an English version of the OpenSubtitles dataset. It achieves the following results on the evaluation set:

- Loss: 2.2922
- Accuracy: 0.5612
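The card does not define the accuracy metric; presumably it is standard masked-language-modelling accuracy, i.e. the fraction of masked positions where the top-1 prediction matches the original token. A minimal sketch of that metric, assuming the usual convention that non-masked positions carry the label -100:

```python
import torch

def mlm_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Top-1 accuracy over masked positions only (labels are -100 elsewhere)."""
    mask = labels != -100                 # score only the masked tokens
    preds = logits.argmax(dim=-1)         # top-1 prediction per position
    return (preds[mask] == labels[mask]).float().mean().item()
```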
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
Training procedure

The following hyperparameters were used during training:

More information needed
Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.9488        | 1.55  | 25000 | 2.7667          | 0.4935   |
| 2.4233        | 3.1   | 50000 | 2.2922          | 0.5612   |
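Assuming the validation loss is mean cross-entropy per masked token (the usual Trainer setup), it maps to perplexity by exponentiation, which makes the improvement between the two checkpoints easier to read:

```python
import math

# Perplexity = exp(cross-entropy); loss values from the table above.
print(math.exp(2.7667))  # ~15.9 at step 25000
print(math.exp(2.2922))  # ~9.9  at step 50000
```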