
Can we have some official training / fine-tuning recipes for this model?

#11
by StephennFernandes - opened

Hi, on the latest version of transformers I tried to fine-tune mmBERT on text classification tasks using the example script:
https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-classification

When I tried to use mmBERT as a drop-in replacement for the original uncased BERT, even after several epochs the accuracy stays stuck at 0.3 and the F1 score is always 0.

It seems like the mmBERT models are not directly compatible with all BERT fine-tuning setups at the moment.

I would really appreciate some training / fine-tuning guidelines and examples so we can use mmBERT in all the ways we used mBERT or BERT before.
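
For reference, here is roughly the setup I'm running, reduced to a minimal sketch (the checkpoint name, dataset, and hyperparameters here are my own choices, not an official recipe):

```python
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_id = "jhu-clsp/mmBERT-base"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Small sentiment dataset just to illustrate; any text classification set works.
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    # Pad to a fixed length so the default collator can stack batches.
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mmbert-sst2",
        learning_rate=2e-5,
        num_train_epochs=3,
        per_device_train_batch_size=32,
    ),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    compute_metrics=compute_metrics,
)
trainer.train()
```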

orionweller (Center for Language and Speech Processing @ JHU org)

The evaluations were done with (a slightly older version of) this script, and others have already fine-tuned it with the example scripts, so it does work with the right environment. Perhaps it is an issue with the attention function, since I had flash attention installed? I know some ModernBERT models had issues with the fallback attention function (SDPA) in the past, though I thought that was resolved. Try something like `pip install "flash_attn==2.6.3" --no-build-isolation` and see if it changes anything.
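
To rule out the attention path directly, you can also pin the attention implementation when loading the model and compare backends. A minimal sketch (adjust the checkpoint name to whichever mmBERT variant you are using):

```python
import torch
from transformers import AutoModelForSequenceClassification

# Explicitly select the attention backend: "flash_attention_2" requires the
# flash-attn package; "sdpa" is the PyTorch fallback that had issues before.
model = AutoModelForSequenceClassification.from_pretrained(
    "jhu-clsp/mmBERT-base",  # assumed checkpoint name
    num_labels=2,
    attn_implementation="flash_attention_2",  # or "sdpa" / "eager" to compare
    torch_dtype=torch.bfloat16,  # flash attention needs fp16/bf16
)
```

If training behaves differently with `"sdpa"` or `"eager"` than with `"flash_attention_2"`, that narrows the problem down to the attention backend.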

@orionweller Thanks for responding, it really means a lot.

I also wanted to know how I could continually pretrain the mmBERT model on more custom pretraining data. Are there any resources for this? What do you recommend as the most stable and performant way to further pretrain mmBERT?
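
For context, here is roughly what I had in mind, modeled on the run_mlm example script. This is just a sketch under my own assumptions (checkpoint name, 15% masking, hyperparameters), not something from the mmBERT authors:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "jhu-clsp/mmBERT-base"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Custom pretraining corpus: one document per line in a plain text file.
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking as in BERT-style MLM; 0.15 is the classic BERT rate,
# the schedule used for mmBERT's own pretraining may differ.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mmbert-continued",
        learning_rate=1e-4,
        num_train_epochs=1,
        per_device_train_batch_size=16,
    ),
    train_dataset=dataset["train"],
    data_collator=collator,
)
trainer.train()
```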
