Typo in Sequence Parallelism TO -> TP

#106
by JulienVig - opened

Thank you so much for the great blog post.

In "Here again, like vanilla TO, TP+SP is usually done only within a node (keeping the TP degree under the number of GPU per nodes", I assume "TO" is supposed to be "TP".
