Run with CUDA
Is there a way to run with CUDA? It automatically selected the CPU when I launched with ggc s2, and I couldn't find a way to change that. Thanks.
Thanks, I realised there was an issue with my torch installation; that's why it didn't use CUDA automatically.
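For anyone hitting the same CPU fallback, a quick way to see what torch itself reports is sketched below. It is only a diagnostic sketch: the helper name `cuda_status` is made up for illustration, and it says nothing about how ggc picks its device internally.

```python
import importlib.util


def cuda_status() -> str:
    """Return a short description of what the local torch install reports."""
    # Hypothetical helper, not part of ggc or torch.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"

    import torch

    # torch.version.cuda is None on builds compiled without CUDA support.
    if torch.version.cuda is None:
        return f"torch {torch.__version__} was built without CUDA support"

    # A CUDA build can still fall back to CPU if no usable GPU/driver is found.
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} (CUDA {torch.version.cuda}) sees no usable GPU"

    return f"CUDA is available: {torch.cuda.get_device_name(0)}"


print(cuda_status())
```

If this reports a build without CUDA support, reinstalling torch from a CUDA-enabled wheel is the usual fix.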
How do I run a specific quantized model?
The inference code would need to be rebuilt, or a new engine might need to be built, but there is no manpower to work on this project/task at the moment.
So the current engine can only run inference on the fp16 model? You quantized them, but there is no way to run inference on them, right?
The two new models run faster; try them first. This one will be handled later.
I used yours with the following code:
from dia.model import Dia
model = Dia.from_pretrained("callgg/dia-f16", compute_dtype="float16")
text = "[S1] Lens is a deep-tech AI company redefining how large language models think, reason, and interact with the world. Today’s portfolio performance aligns with my core belief: exceptional businesses with enduring growth narratives outperform over time. The robust gains in Alphabet (GOOG, GOOGL) reflect its dominance in digital advertising and accelerating momentum in AI-driven cloud services. "
# path to your prompt audio
prompt_path = "mark.mp3"
# generate, supplying the prompt path
output = model.generate(
    text,
    audio_prompt=prompt_path,
    verbose=True,
)
model.save_audio("lens/test1.mp3", output)
And yes, inference was a bit faster. But I'm interested in using the quantized versions you uploaded. So basically I need to build a new engine in order to use them, right? I want to use Dia for streaming. I have tested the MLX quantized version and it is much faster, but not fast enough for streaming. I also tried the mmwillet one, but the 4-bit one is slower. Here is the output if you're interested:
$ ./build/bin/Release/tts-cli.exe \
--model-path Dia_Q4.gguf \
--prompt "Today's portfolio performance aligns with my core belief: exceptional businesses with enduring growth narratives outperform over time. The robust gains in Alphabet (GOOG, GOOGL) reflect its dominance in digital advertising and accelerating momentum in AI-driven cloud services." \
--save-path ./test.wav
Writing audio file: ./test.wav
|======================================|
Num Channels: 1
Num Samples Per Channel: 513536
Sample Rate: 44100
Bit Depth: 16
Length in Seconds: 11.6448
|======================================|
total time = 124307.80 ms
Thank you for your help.
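To put the log above in streaming terms, here is a small sketch that turns its figures into a real-time factor (processing time divided by audio length; it has to stay below 1 for streaming). It assumes the reported total time covers the whole run, which the log does not spell out.

```python
# Figures copied from the tts-cli log above.
num_samples = 513_536        # "Num Samples Per Channel"
sample_rate_hz = 44_100      # "Sample Rate"
total_time_ms = 124_307.80   # "total time"

# Audio length follows from sample count and sample rate.
audio_seconds = num_samples / sample_rate_hz

# Real-time factor: seconds of compute spent per second of generated audio.
real_time_factor = (total_time_ms / 1000.0) / audio_seconds

print(f"audio length: {audio_seconds:.4f} s")
print(f"real-time factor: {real_time_factor:.2f}x")
```

With these numbers the run comes out at roughly 10.7 seconds of compute per second of audio, so it is about ten times too slow for streaming on that setup.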