If I use a transformer only to get the last hidden state for a downstream task, will the transformer itself be trained? For example, I use GraphormerModel from Hugging Face. Every time I run the same input, I get a different output. Is this normal?