Training with transformers API
#75 · opened by Padajno
Hi,
In my experiments, I was able to train ModernBERT using the Transformers library (specifically, a modified run_mlm.py script from the Transformers repository: https://github.com/huggingface/transformers/blob/v4.49.0/examples/pytorch/language-modeling/run_mlm.py).
But if I understand correctly, does this approach not make use of the local/global attention that ModernBERT implements?
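For context, here is a minimal sketch of how I tried to inspect this. It assumes the `answerdotai/ModernBERT-base` hub checkpoint and the `local_attention` / `global_attn_every_n_layers` attributes of `ModernBertConfig` (attribute names I have not verified, hence the `getattr` guards):

```python
from transformers import AutoConfig, AutoModelForMaskedLM

# Assumed checkpoint ID; substitute whichever checkpoint you trained from.
checkpoint = "answerdotai/ModernBERT-base"

config = AutoConfig.from_pretrained(checkpoint)

# Assumed ModernBertConfig attributes: a sliding-window (local) attention
# of `local_attention` tokens, with full (global) attention applied every
# `global_attn_every_n_layers`-th layer.
print("local attention window:", getattr(config, "local_attention", None))
print("global attention every N layers:",
      getattr(config, "global_attn_every_n_layers", None))

# If the alternating local/global pattern lives in the model definition
# itself, then loading the model through the standard auto classes (as
# run_mlm.py does) should pick it up regardless of the training script.
model = AutoModelForMaskedLM.from_pretrained(checkpoint)
```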