Questions about “Minibatch Optimal Transport”

#10

by zyx1213271098 - opened Mar 25

Mar 25

I don't understand the principle of Minibatch Optimal Transport. Can you explain it in more detail? Why is a smaller distance more advantageous for model training? What impact does this have on the inference performance?

lodestones

Owner Mar 26

edited Mar 26

yeah basically for every training batch you compute the optimal transport pairings between noise and image.

it converges faster because the pairings are less ambiguous, so the model has more certainty when regressing on the velocity vector to learn its expectation value
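A minimal sketch of the per-batch pairing, to make the idea concrete. This is not the training code from this repo: the function names, the toy 2-D data, and the convention `x_t = (1 - t) * image + t * noise` are all assumptions made for illustration. It brute-forces the assignment over permutations, which only works for tiny batches; a real implementation would more likely use a Hungarian-style solver such as `scipy.optimize.linear_sum_assignment`.

```python
import itertools
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two flat vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ot_pairing(noise, images):
    """Brute-force the one-to-one pairing that minimises total squared distance.

    Returns (perm, cost), where noise[i] is paired with images[perm[i]].
    Hypothetical helper: O(n!) cost, so only usable for tiny toy batches.
    """
    n = len(noise)
    best_perm, best_cost = None, float("inf")
    for perm in itertools.permutations(range(n)):
        cost = sum(sq_dist(noise[i], images[perm[i]]) for i in range(n))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost


random.seed(0)
batch, dim = 5, 2

# Toy stand-ins for a batch of images and a batch of Gaussian noise.
images = [[random.gauss(3.0, 1.0) for _ in range(dim)] for _ in range(batch)]
noise = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(batch)]

# Plain RF pairs noise[i] with images[i] (random pairing within the batch).
random_cost = sum(sq_dist(noise[i], images[i]) for i in range(batch))

# OT-RF re-orders the images so the total transport cost is minimal.
perm, ot_cost = ot_pairing(noise, images)

# Velocity targets for the OT pairs, assuming x_t = (1 - t) * image + t * noise,
# so the regression target is noise - image.
targets = [
    [z - x for z, x in zip(noise[i], images[perm[i]])] for i in range(batch)
]

print("random pairing cost:", random_cost)
print("OT pairing cost:    ", ot_cost)
```

The OT cost can never exceed the random-pairing cost, because the identity pairing is one of the candidates. Shorter, less tangled pairs mean the velocity targets seen near any given point disagree less with each other, which is the "more certainty" part.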

you can see here, mnist for just 1 epoch almost converged
https://x.com/LodestoneE621/status/1893408571448049685

cifar10 RF vs OT-RF

basically learning the expectation value from this (plain RF, random noise-image pairings) is harder

than this (OT-RF, minibatch OT pairings)

lodestones

Owner Mar 26

you can see most of the flow paths are straighter too,
so it can reduce the number of inference steps quite a bit (not as much as reflowing it again tho)
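A toy illustration of why straighter paths need fewer steps, under stated assumptions: the two velocity fields below are made up for the example (they are not the model's learned field), and time simply runs from 0 to 1. Both fields carry the point (0, 0) to (1, 1), one along a straight line and one along a curve, and both are integrated with a plain Euler sampler.

```python
def euler(velocity, x0, n_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def straight(x, t):
    # Constant velocity: the exact path is the straight line x(t) = (t, t).
    return [1.0, 1.0]


def curved(x, t):
    # Time-varying velocity: the exact path is the parabola x(t) = (t, t^2).
    return [1.0, 2.0 * t]


def endpoint_error(velocity, n_steps):
    """Max coordinate error at t=1; both exact paths end at (1, 1)."""
    x = euler(velocity, [0.0, 0.0], n_steps)
    return max(abs(x[0] - 1.0), abs(x[1] - 1.0))


for n in (1, 4, 16):
    print(n, endpoint_error(straight, n), endpoint_error(curved, n))
```

With a constant velocity a single Euler step already lands on the endpoint, while the curved path needs more steps to get the same accuracy. OT pairings push the learned flow toward the first case, but only partially, which fits the note above that reflowing helps more.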

bghira

Apr 27

yeah but flow matching is diffusion, there is no difference. everything uses the flow matching name just as a convenient unification of terms, but you can parameterise the flux loss in terms of eps if you really wanted to.
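To spell out the eps point as a worked equation: the interpolation convention below (t = 1 is pure noise) is an assumption for this sketch, and any timestep weighting the actual training uses is not covered here.

```latex
% Assumed rectified-flow interpolation and velocity target:
x_t = (1 - t)\, x_0 + t\, \epsilon, \qquad v = \epsilon - x_0 .

% Solve for \epsilon in terms of x_t and v:
\epsilon = x_t + (1 - t)\, v
\quad\Longrightarrow\quad
\epsilon_\theta := x_t + (1 - t)\, v_\theta .

% The two losses differ only by a time-dependent weight:
\lVert v_\theta - v \rVert^2
  = \frac{1}{(1 - t)^2}\, \lVert \epsilon_\theta - \epsilon \rVert^2 .
```

So under this convention a velocity loss is an eps loss with a `1 / (1 - t)^2` weight, which is the sense in which the two parameterisations are interchangeable.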
