custom_text_proj initialization

by edesalve - opened Feb 20

Feb 20

Hi all, I've been working with the model and noticed that the weights for custom_text_proj were not being loaded from the checkpoint and were instead initialized randomly, leading to very different results at each model instantiation.
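For anyone hitting the same symptom, here is a minimal sketch of one way to confirm which parameters a checkpoint is missing. The helper name and the key lists are hypothetical and only for illustration; with PyTorch, the first list would typically come from model.state_dict().keys() and the second from the keys of the loaded checkpoint.

```python
def missing_from_checkpoint(model_keys, checkpoint_keys, prefix="custom_text_proj"):
    """Return model parameter names under `prefix` that the checkpoint lacks."""
    available = set(checkpoint_keys)
    return sorted(
        name for name in model_keys
        if name.startswith(prefix) and name not in available
    )


# Hypothetical key lists for illustration only; real names depend on the model.
model_keys = [
    "custom_text_proj.weight",
    "custom_text_proj.bias",
    "encoder.layer0.weight",
]
checkpoint_keys = ["encoder.layer0.weight"]

# Any name printed here would be freshly initialized rather than loaded.
print(missing_from_checkpoint(model_keys, checkpoint_keys))
```

An empty list would suggest the projection weights are present in the checkpoint, so the randomness would have to come from somewhere else.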

To address this, I used a base model provided by the Vidore team (https://huggingface.co/vidore/colqwen2.5-base).

Do you recommend this solution, or is there a better alternative or a forthcoming update to ensure proper initialization for the projection layer within your model?

Thank you!

Markgazol

Metric org Feb 21

edited Feb 22

Hi @edesalve, thanks for the question.

We previously ran multiple evaluations, and the results were consistent across runs. I checked the initialization of the custom_text_proj layer: across runs, the standard deviation, max, and min values remained the same. The only difference was in the mean, which showed a very small variation close to zero (e.g., 4.3869e-05, 6.5327e-05). This minor variation in the mean does not adversely affect model performance.
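A check of this kind can be sketched roughly as follows. This is only an illustration using the Python standard library: fake_init is a hypothetical stand-in for a randomly initialized layer, not the model's actual initializer, and the standard deviation of 0.02 and the tolerance are assumed values.

```python
import random
import statistics


def summarize(values):
    """Return mean, standard deviation, min, and max of a flat list of weights."""
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
    }


def stats_match(a, b, tol=1e-3):
    """True if every summary statistic of `a` and `b` differs by at most `tol`."""
    return all(abs(a[key] - b[key]) <= tol for key in a)


def fake_init(seed, n=10_000, std=0.02):
    # Hypothetical stand-in for a random layer init (assumed Gaussian).
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(n)]


run_a = summarize(fake_init(seed=0))
run_b = summarize(fake_init(seed=0))  # same seed: identical values
run_c = summarize(fake_init(seed=1))  # different seed: different values

print(stats_match(run_a, run_b, tol=0.0))
print(run_a["mean"], run_c["mean"])
```

Summary statistics are a coarse comparison: two different random draws from the same distribution can have very similar statistics while differing element by element, so comparing the weights directly is the stricter test.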

Markgazol

Metric org Feb 22

@edesalve hi again,
I have updated the configs and added https://huggingface.co/Metric-AI/colqwen2.5-base with the corresponding weights for custom_text_proj. Now you can use the model and always get deterministic scores :)

Markgazol changed discussion status to closed Feb 25
