Can EuroBERT be used as a generative model as well?

#18
by gabriead - opened

I have read the paper and understand that it is an encoder-only model, so it has no text generation capabilities. At the end, the authors state: "we incorporate recent architectural advances from decoder models ...", but that doesn't imply that there is a decoder layer, is that correct?

EuroBERT org

Hey @gabriead, EuroBERT cannot be used as a decoder model, or at least not without an extensive additional training phase (which is likely not practical or relevant in your case). What we did was take the recent architectural advances from decoder models but apply a non-causal attention mask, which is one of the key differences between encoders and decoders. We understand that this formulation might confuse readers, so we have revised it in the next version of the paper, which should be released soon.
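
To make the distinction concrete, here is a minimal PyTorch sketch (not taken from the EuroBERT codebase; the shapes and tensor names are purely illustrative) contrasting the causal mask a decoder applies with the bidirectional, non-causal mask an encoder like EuroBERT uses:

```python
import torch
import torch.nn.functional as F

seq_len, d = 5, 8
q = k = v = torch.randn(1, 1, seq_len, d)  # (batch, heads, seq, head_dim)

# Decoder-style causal mask: token i may only attend to tokens <= i.
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

# Encoder-style non-causal mask: every token attends to every token.
# Swapping in this mask is the key change relative to a decoder backbone.
bidirectional_mask = torch.ones(seq_len, seq_len, dtype=torch.bool)

out_causal = F.scaled_dot_product_attention(q, k, v, attn_mask=causal_mask)
out_bidir = F.scaled_dot_product_attention(q, k, v, attn_mask=bidirectional_mask)

print(out_causal.shape, out_bidir.shape)  # both (1, 1, 5, 8)
```

With the causal mask, each position only sees its left context (which is what enables autoregressive generation); with the bidirectional mask, each position sees the full sequence, which is why the model is an encoder and cannot generate text out of the box.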

Sorry for the confusion, and I hope you enjoy using EuroBERT!

Nicolas-BZRD changed discussion status to closed