Very slow on T4 instance
I just tried this fp8 model on a T4 instance. It loads, but training runs very, very slowly.
steps: 1%|β | 7/800 [03:17<6:11:57, 28.14s/it, avr_loss=0.305]
Is that normal?
The T4 doesn't support bf16. You need bf16 (or bf16 mixed precision) here because fp16 produces NaNs, but if you set bf16 on a T4 it has to convert to fp32 every time it does a calculation, which is why it's so slow. Use an L4, which supports bf16.
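You can check this directly from PyTorch (a quick sketch, assuming CUDA is available and a reasonably recent PyTorch):

```python
import torch

# On a T4 this prints False, so bf16 ops fall back to fp32 and every step
# gets much slower; on an L4/L40S/A100 it prints True and bf16 runs on the
# tensor cores directly.
print(torch.cuda.get_device_name(0))
print(torch.cuda.is_bf16_supported())
```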
Thanks, the fp8 model worked on an L4; the ETA is 50 minutes this time.
@rockerBOO I did another test on an L40S, and the fp8 and fp16 models have similar completion times, 17 min vs 18 min. Is that normal? Should I expect a performance boost from the fp8 version?
It depends on whether you are using mixed precision. Usually you'd be coming from fp32 and using mixed precision to do the calculations in bf16 or fp16, which is where the performance increase comes from. But with fp8 weights and the calculations done in bf16, the math is still running at the higher precision, so fp8 mainly saves memory. You'd need to do mixed precision at fp8 to see a compute speedup, which is a little more involved and requires third-party libraries.
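To illustrate what I mean (a rough sketch of my own, not the trainer's actual code): fp8 here is just a storage format, and the matmul still runs in bf16, so the step time looks about the same as a plain bf16/fp16 run.

```python
import torch

# Sketch only: fp8 storage with bf16 compute. The weight lives in fp8 to save
# VRAM, but it gets upcast to bf16 right before the matmul, so the GEMM itself
# runs on the same bf16 tensor cores as a normal bf16/fp16 run.
weight_fp8 = torch.randn(4096, 4096, device="cuda").to(torch.float8_e4m3fn)
x = torch.randn(8, 4096, dtype=torch.bfloat16, device="cuda")

y = x @ weight_fp8.to(torch.bfloat16).t()   # upcast happens here, every call
print(y.dtype)  # torch.bfloat16

# Actual fp8 compute (e.g. TransformerEngine or torchao's float8 training)
# is a separate, more involved setup and is what would give a real speedup.
```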
@rockerBOO Have you tried to run flux fp8 on comfyui?
On an L40 I tried to run ComfyUI with the Flux fp8 version, but it still tries to cast it to fp16. Any ideas?
model weight dtype torch.float8_e4m3fn, manual cast: torch.bfloat16
I already added --fp8_e4m3fn-unet to the ComfyUI CLI args, but it still tries to cast it.
Well, it's casting it to bfloat16, so that's fine. The cast only happens when it does the calculations, since doing the calculations themselves in fp8 can be problematic unless you have other libraries that support it better.
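If it helps, here's roughly what that "manual cast" message amounts to (my own simplified sketch, not ComfyUI's code; `ManualCastLinear` is just an illustrative name): the weight stays in fp8 in VRAM, and each layer upcasts a bf16 copy only at compute time.

```python
import torch
import torch.nn as nn

class ManualCastLinear(nn.Linear):
    # Simplified idea of the manual-cast pattern: keep the weight stored in
    # fp8, upcast to bf16 just-in-time for the actual calculation.
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight.to(torch.bfloat16)
        b = self.bias.to(torch.bfloat16) if self.bias is not None else None
        return nn.functional.linear(x.to(torch.bfloat16), w, b)

layer = ManualCastLinear(1024, 1024, bias=False)
layer.weight.data = layer.weight.data.to(torch.float8_e4m3fn)  # fp8 in memory

with torch.no_grad():                     # inference only
    out = layer(torch.randn(2, 1024))     # the math still happens in bf16
print(out.dtype)  # torch.bfloat16
```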