# Hungarian Abstractive Summarization BART model
For further models, scripts, and details, see our repository or our demo site.
- BART base model (see the bold row in the Results table):
  - Pretrained on Webcorpus 2.0
  - Fine-tuned on the NOL corpus (nol.hu)
    - Segments: 397,343

## Limitations
- The input text must be tokenized (tokenizer: HuSpaCy)
  - max_source_length = 512
  - max_target_length = 256
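Because the encoder accepts at most 512 tokens, longer documents have to be truncated or summarized chunk by chunk. A minimal sketch of the chunking step on an already-tokenized input (`chunk_tokens` is a hypothetical helper, not part of the released model; the real setup tokenizes with HuSpaCy first):

```python
def chunk_tokens(tokens, max_source_length=512):
    """Split a token list into encoder-sized chunks.

    The model card caps max_source_length at 512, so a longer
    document must be cut into pieces no larger than that before
    each piece is passed to the summarizer.
    """
    return [tokens[i:i + max_source_length]
            for i in range(0, len(tokens), max_source_length)]

# Illustrative input: a 1300-token document yields three chunks.
doc = ["tok"] * 1300
chunks = chunk_tokens(doc)
print([len(c) for c in chunks])  # → [512, 512, 276]
```

Each chunk can then be summarized independently and the partial summaries concatenated; this is a simple workaround, not a method evaluated in the paper.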
 
## Results
| Model | HI (ROUGE-1/ROUGE-2/ROUGE-L) | NOL (ROUGE-1/ROUGE-2/ROUGE-L) |
|---|---|---|
| BART-base-512 | 30.18/13.86/22.92 | 46.48/32.40/39.45 | 
| BART-base-1024 | 31.86/14.59/23.79 | 47.01/32.91/39.97 | 
## Citation
If you use this model, please cite the following paper:
```bibtex
@inproceedings{yang-bart,
    title = {{BARTerezzünk! - Messze, messze, messze a világtól, - BART kísérleti modellek magyar nyelvre}},
    booktitle = {XVIII. Magyar Számítógépes Nyelvészeti Konferencia},
    year = {2022},
    publisher = {Szegedi Tudományegyetem, Informatikai Intézet},
    address = {Szeged, Magyarország},
    author = {Yang, Zijian Győző},
    pages = {15--29}
}
```