Published a new blog post! In it, I go through the transformer architecture, emphasizing how shapes propagate through each layer. https://huggingface.co/blog/not-lain/tensor-dims Some interesting takeaways:
Why do we only get to post once every 24 hours? I've been waiting *so long*. Anyway, now that the wait is finally over, I have some very important information to share.
Can we please do something about this? It makes everything I do so much harder, and because my local machine is so terrible, I am forced to test in production. This makes debugging so difficult. nroggendorff/system-exit