Multi-GPU usage

#3
by modjo-ai - opened

On my 24 GB GPU, I can only process audio files shorter than 10 minutes. Is it possible to run it on multiple GPUs (e.g., 4) to avoid CUDA OOM errors?

NVIDIA org

Sorry, this is not possible. However, we will soon release a streaming version that has no limit on audio length.
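(Editor's note: until the streaming release, a common workaround, not an official NVIDIA recommendation, is to split a long recording into fixed-size chunks that fit in GPU memory, run diarization on each chunk, and stitch the results. The helper below only computes chunk boundaries in samples; the chunk size, overlap, and sample rate are illustrative assumptions, and speaker labels must still be re-matched across chunk boundaries afterward.)

```python
def chunk_spans(total_samples, chunk_samples, overlap_samples=0):
    """Return (start, end) sample spans that cover the whole file.

    Consecutive spans overlap by `overlap_samples` so that speaker turns
    cut at a boundary appear in both chunks and can be re-matched later.
    """
    if overlap_samples >= chunk_samples:
        raise ValueError("overlap must be smaller than the chunk size")
    step = chunk_samples - overlap_samples
    spans = []
    start = 0
    while start < total_samples:
        end = min(start + chunk_samples, total_samples)
        spans.append((start, end))
        if end == total_samples:
            break
        start += step
    return spans


# Example: a 35-minute file at 16 kHz, split into 10-minute chunks
# with 10 seconds of overlap (all values are illustrative).
sr = 16000
spans = chunk_spans(
    total_samples=35 * 60 * sr,
    chunk_samples=10 * 60 * sr,
    overlap_samples=10 * sr,
)
```

Each span can then be sliced out of the waveform and passed to the model independently; because diarization output is per-chunk, speaker IDs from adjacent chunks have to be aligned (e.g., by matching speakers within the overlap region) before merging.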

modjo-ai changed discussion status to closed

@imedennikov , any update on releasing the streaming model?

NVIDIA org

@leminhnguyen this work is still in progress. We hope to finalize it in Q2'25.

Hi @imedennikov, one more question: is the streaming model just a code change, or do we need to retrain the Sortformer model to incorporate the new streaming mechanism? For example, cache-aware streaming ASR (https://github.com/NVIDIA/NeMo/blob/main/tutorials/asr/Online_ASR_Microphone_Demo_Cache_Aware_Streaming.ipynb) needs to be trained from scratch.

NVIDIA org

@leminhnguyen yes, model retraining will be needed.
