About random factor in the tokenization/embedding process

#27
by josecar24 - opened

I'm wondering whether other users are seeing this too: when I convert the same DNA sequence into a tensor several times with DNABERT2, the hidden states come out with different values on each run. Is there a random factor in the conversion process? If so, how can I find it and make sure the tensor produced from the same DNA sequence no longer varies?
For instance, I ran DNABERT2 twice on the same DNA sequence, "GGCTGCGCGTACATGGGTCGT", and got the two output tensors below:

Tensor_1.png

Tensor_2.png
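
One way to narrow this down, as a minimal sketch rather than a confirmed diagnosis: run the same input through the model twice in a single process and measure the largest difference, once with dropout disabled (`eval()`) and once with it active. The helper name `max_abs_diff_between_runs` and the toy model are hypothetical stand-ins used only for illustration; nothing here is taken from the DNABERT2 code, and whether dropout is the actual cause in this case is an assumption.

```python
import torch
from torch import nn


def max_abs_diff_between_runs(model: nn.Module, inputs: torch.Tensor,
                              eval_mode: bool = True) -> float:
    """Run `inputs` through `model` twice; return the largest elementwise difference."""
    # eval mode disables dropout; train mode leaves it active.
    model.train(not eval_mode)
    with torch.no_grad():  # gradients are not needed to extract embeddings
        first = model(inputs)
        second = model(inputs)
    # Some models return a tuple or output object; take its first element.
    if not torch.is_tensor(first):
        first, second = first[0], second[0]
    return (first - second).abs().max().item()


# Toy stand-in for the real model (hypothetical, only to show the effect of dropout).
torch.manual_seed(0)
toy = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5), nn.Linear(8, 4))
x = torch.randn(1, 8)

diff_eval = max_abs_diff_between_runs(toy, x, eval_mode=True)    # dropout off
diff_train = max_abs_diff_between_runs(toy, x, eval_mode=False)  # dropout on
print("eval mode:", diff_eval, "train mode:", diff_train)
```

With the real model, the same helper could be called with the loaded DNABERT2 model and the tokenizer's `input_ids` tensor for "GGCTGCGCGTACATGGGTCGT". If the difference is zero in eval mode but the tensors still differ between separate script runs, the randomness likely comes from somewhere this sketch does not check, for example weights that are re-initialized at load time.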

josecar24 changed discussion status to closed