fairseq-dense-355M bugged?
#1, opened by MaxLohMusic
In my testing, this model produces stranger, less coherent output than smaller models such as the 125M, no matter what temperature, repetition penalty, or other parameters I set.
With Goose AI's version of Fairseq-355M, on the other hand, the output is noticeably more coherent than what the 125M models produce.
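
For anyone who wants to reproduce the comparison, here is a rough sketch of how the two checkpoints could be run side by side with identical prompts, seeds, and sampling settings. The repo IDs in `MODEL_IDS`, the helper names `sampling_grid` and `compare`, and the specific temperature and repetition-penalty values are all assumptions for illustration, not the settings used in the testing described above.

```python
from itertools import product

# Assumed Hub repo IDs; the post does not name them, so adjust as needed.
MODEL_IDS = ["KoboldAI/fairseq-dense-125M", "KoboldAI/fairseq-dense-355M"]


def sampling_grid(temperatures=(0.5, 0.7, 1.0), repetition_penalties=(1.0, 1.1, 1.2)):
    """Return every temperature / repetition-penalty pair as generate() kwargs."""
    return [
        {"do_sample": True, "temperature": t, "repetition_penalty": rp}
        for t, rp in product(temperatures, repetition_penalties)
    ]


def compare(prompt, model_ids=MODEL_IDS, max_new_tokens=40, seed=0):
    """Generate from each model under each setting, reseeding before every run."""
    # Imported here so sampling_grid() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

    results = {}
    for model_id in model_ids:
        tokenizer = AutoTokenizer.from_pretrained(model_id)
        model = AutoModelForCausalLM.from_pretrained(model_id)
        inputs = tokenizer(prompt, return_tensors="pt")
        for kwargs in sampling_grid():
            set_seed(seed)  # same seed for every model and setting
            output = model.generate(**inputs, max_new_tokens=max_new_tokens, **kwargs)
            text = tokenizer.decode(output[0], skip_special_tokens=True)
            key = (model_id, kwargs["temperature"], kwargs["repetition_penalty"])
            results[key] = text
    return results
```

Calling `compare("The weather today is")` returns a dict keyed by `(model_id, temperature, repetition_penalty)`, which makes it easier to see whether the 355M output stays odd across the whole grid or only at certain settings.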