Multi-GPU usage

#3
by modjo-ai - opened

On my 24 GB GPU, I can only process audio files shorter than 10 minutes. Is it possible to run it on multiple GPUs (e.g., 4) to avoid CUDA OOM errors?

NVIDIA org

Sorry, this is not possible. However, we will soon release a streaming version that has no limit on audio length.
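(Editor's note: until the streaming release, a common workaround, not an official NVIDIA recommendation, is to split a long recording into fixed-size chunks that fit in GPU memory, run diarization on each chunk, and stitch the results. The helper below only computes chunk boundaries in samples; the chunk size, overlap, and sample rate are illustrative assumptions, and speaker labels must still be re-matched across chunk boundaries afterward.)

```python
def chunk_spans(total_samples, chunk_samples, overlap_samples=0):
    """Return (start, end) sample spans that cover the whole file.

    Consecutive spans overlap by `overlap_samples` so that speaker turns
    cut at a boundary appear in both chunks and can be re-matched later.
    """
    if overlap_samples >= chunk_samples:
        raise ValueError("overlap must be smaller than the chunk size")
    step = chunk_samples - overlap_samples
    spans = []
    start = 0
    while start < total_samples:
        end = min(start + chunk_samples, total_samples)
        spans.append((start, end))
        if end == total_samples:
            break
        start += step
    return spans


# Example: a 35-minute file at 16 kHz, split into 10-minute chunks
# with 10 seconds of overlap (all values are illustrative).
sr = 16000
spans = chunk_spans(
    total_samples=35 * 60 * sr,
    chunk_samples=10 * 60 * sr,
    overlap_samples=10 * sr,
)
```

Each span can then be sliced out of the waveform and passed to the model independently; because diarization output is per-chunk, speaker IDs from adjacent chunks have to be aligned (e.g., by matching speakers within the overlap region) before merging.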

modjo-ai changed discussion status to closed

@imedennikov , any update on releasing the streaming model?

NVIDIA org

@leminhnguyen this work is still in progress. We hope to finalize it in Q2'25.

Hi @imedennikov, one more question: is the streaming model just a code change, or do we need to retrain the Sortformer model to incorporate the new streaming mechanism? For example, cache-aware streaming ASR (https://github.com/NVIDIA/NeMo/blob/main/tutorials/asr/Online_ASR_Microphone_Demo_Cache_Aware_Streaming.ipynb) needs to be trained from scratch.

NVIDIA org

@leminhnguyen yes, model retraining will be needed.
