Should We Still Pretrain Encoders with Masked Language Modeling? Paper • 2507.00994 • Published 28 days ago • 74
ConTEB evaluation datasets Collection Evaluation datasets of the ConTEB benchmark. Use the "test" split where available, otherwise "validation", otherwise "train". • 8 items • Updated Jun 2 • 1
ConTEB training datasets Collection Training data for the InSeNT method. • 3 items • Updated Jun 2 • 1
ConTEB models Collection Our models trained with the InSeNT approach. These are the checkpoints used for the evaluations reported in our paper. • 2 items • Updated Jun 2 • 1
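
The split rule in the ConTEB evaluation datasets description ("test", else "validation", else "train") can be sketched as a small helper. This is a minimal illustration, not code from the ConTEB authors: the function name `pick_eval_split` and the constant `SPLIT_PRIORITY` are hypothetical, and the helper only assumes you already have the list of split names for a dataset.

```python
from typing import Iterable

# Priority order taken from the collection description:
# "test" where available, otherwise "validation", otherwise "train".
SPLIT_PRIORITY = ("test", "validation", "train")


def pick_eval_split(available: Iterable[str]) -> str:
    """Return the highest-priority split name present in `available`.

    Hypothetical helper for illustration; raises ValueError when none of
    the three expected split names is present.
    """
    names = set(available)
    for split in SPLIT_PRIORITY:
        if split in names:
            return split
    raise ValueError(f"none of {SPLIT_PRIORITY} found in {sorted(names)}")


if __name__ == "__main__":
    # A dataset that ships only "train" and "validation" splits.
    print(pick_eval_split(["train", "validation"]))
```

The list of available split names could come, for example, from `datasets.get_dataset_split_names` in the Hugging Face `datasets` library; the helper itself has no dependency on it.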