Training with transformers API

#75
by Padajno - opened

Hi,
In my experiments, I was able to train ModernBERT using the Transformers library (specifically, with a modified run_mlm.py script from the transformers GitHub repository: https://github.com/huggingface/transformers/blob/v4.49.0/examples/pytorch/language-modeling/run_mlm.py).
But if I understand correctly, this approach does not make use of the alternating local/global attention that ModernBERT implements? Or is that handled automatically?
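For context, here is a minimal sketch of the check I have been doing. I am assuming the answerdotai/ModernBERT-base checkpoint and a transformers version recent enough to include ModernBERT; the config field names below are my reading of ModernBertConfig, not something run_mlm.py itself touches. Since run_mlm.py loads the model through the Auto classes, it should pick up the ModernBERT architecture along with the config fields that control the attention pattern:

```python
from transformers import AutoConfig, AutoModelForMaskedLM

# Load the config the same way run_mlm.py does (via the Auto classes).
config = AutoConfig.from_pretrained("answerdotai/ModernBERT-base")

# Fields that (as I understand ModernBertConfig) control the alternating
# local/global attention pattern:
print(config.global_attn_every_n_layers)  # every n-th layer uses global attention
print(config.local_attention)             # sliding-window size for the local layers

# The class resolved by AutoModelForMaskedLM should be ModernBertForMaskedLM,
# not a generic BERT implementation.
model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")
print(type(model).__name__)
```

This prints the attention-pattern settings, but I am not sure whether they are actually applied when training through the standard Trainer loop, which is really what my question comes down to.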
