Training with transformers API
#75 · opened by Padajno
Hi,
In my experiments, I was able to train ModernBERT using the Transformers library (specifically, a modified run_mlm.py script from the Transformers repository: https://github.com/huggingface/transformers/blob/v4.49.0/examples/pytorch/language-modeling/run_mlm.py).
But if I understand correctly, does this approach not make use of the local/global attention that ModernBERT implements?
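For context, here is a minimal sketch of how I tried to inspect this. It assumes the `answerdotai/ModernBERT-base` hub checkpoint and the `local_attention` / `global_attn_every_n_layers` attributes of `ModernBertConfig` (attribute names I have not verified, hence the `getattr` guards):

```python
from transformers import AutoConfig, AutoModelForMaskedLM

# Assumed checkpoint ID; substitute whichever checkpoint you trained from.
checkpoint = "answerdotai/ModernBERT-base"

config = AutoConfig.from_pretrained(checkpoint)

# Assumed ModernBertConfig attributes: a sliding-window (local) attention
# of `local_attention` tokens, with full (global) attention applied every
# `global_attn_every_n_layers`-th layer.
print("local attention window:", getattr(config, "local_attention", None))
print("global attention every N layers:",
      getattr(config, "global_attn_every_n_layers", None))

# If the alternating local/global pattern lives in the model definition
# itself, then loading the model through the standard auto classes (as
# run_mlm.py does) should pick it up regardless of the training script.
model = AutoModelForMaskedLM.from_pretrained(checkpoint)
```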