Update README.md
README.md CHANGED
@@ -35,9 +35,26 @@ No preference tuning has been applied to this revision of the model.
- **Repository:** https://ultravox.ai
- **Demo:** See repo

-##
+## Usage

-
+Think of the model as an LLM that can also hear and understand speech. As such, it can be used as a voice agent, and also to do speech-to-speech translation, analysis of spoken audio, etc.
+
+To use the model, try the following:
+```python
+# pip install transformers peft librosa
+
+import transformers
+import numpy as np
+import librosa
+
+pipe = transformers.pipeline(model='fixie-ai/ultravox-v0_2', trust_remote_code=True)
+
+path = "<path-to-input-audio>"  # TODO: pass the audio here
+audio, sr = librosa.load(path, sr=16000)
+
+
+pipe({'audio': audio, 'prompt': '<|audio|>', 'sampling_rate': sr}, max_new_tokens=30)
+```


## Training Details
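The snippet added above makes a single generic generation call. As a sketch of the speech-to-speech-translation use the new Usage text mentions, the same call might be varied as below; `pipe`, `audio`, and `sr` are the objects created in the added snippet, while the prompt wording and the placement of the `<|audio|>` placeholder are illustrative assumptions rather than part of the model card.

```python
# Minimal sketch of the speech-translation use case mentioned in the Usage text.
# Assumes `pipe`, `audio`, and `sr` were created exactly as in the snippet above;
# the prompt wording and target language are illustrative assumptions.
translation_prompt = "Translate the following speech into French: <|audio|>"

result = pipe(
    {'audio': audio, 'prompt': translation_prompt, 'sampling_rate': sr},
    max_new_tokens=60,
)
print(result)
```

Swapping a different instruction in ahead of the audio placeholder should cover the "analysis of spoken audio" case in the same way.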