patrickvonplaten committed on
Commit
a27970d
verified
1 Parent(s): c9a314d

Update README.md

Files changed (1)
  1. README.md +19 -6
README.md CHANGED
@@ -120,6 +120,12 @@ vllm serve mistralai/Voxtral-Small-24B-2507 --tokenizer_mode mistral --config_fo
 
 Leverage the audio capabilities of Voxtral-Small-24B-2507 to chat.
 
+Make sure that your client has `mistral-common` with audio installed:
+
+```sh
+pip install --upgrade mistral_common[audio]
+```
+
 <details>
   <summary>Python snippet</summary>
 
@@ -149,7 +155,7 @@ def file_to_chunk(file: str) -> AudioChunk:
     audio = Audio.from_file(file, strict=False)
     return AudioChunk.from_audio(audio)
 
-text_chunk = TextChunk(text="Which speaker do you prefer between the two? Why? How are they different from each other?")
+text_chunk = TextChunk(text="Which speaker is more inspiring? Why? How are they different from each other? Answer in French.")
 user_msg = UserMessage(content=[file_to_chunk(obama_file), file_to_chunk(bcn_file), text_chunk]).to_openai()
 
 print(30 * "=" + "USER 1" + 30 * "=")
@@ -167,11 +173,12 @@ content = response.choices[0].message.content
 print(30 * "=" + "BOT 1" + 30 * "=")
 print(content)
 print("\n\n")
-# E.g. The speaker who delivers the farewell address is more engaging and inspiring.
-# They express gratitude and optimism, emphasizing the importance of self-government and citizenship.
-# They also share personal experiences and observations, making the speech more relatable and heartfelt.
-# In contrast, the second speaker provides factual information about the weather in Barcelona,
-# which is less engaging and lacks the emotional depth of the first speaker's address.
+# The model could give the following answer:
+# ```L'orateur le plus inspirant est le président.
+# Il est plus inspirant parce qu'il parle de ses expériences personnelles
+# et de son optimisme pour l'avenir du pays.
+# Il est différent de l'autre orateur car il ne parle pas de la météo,
+# mais plutôt de ses interactions avec les gens et de son rôle en tant que président.```
 
 messages = [
     user_msg,
@@ -198,6 +205,12 @@ print(content)
 
 Voxtral-Small-24B-2507 has powerful transcription capabilities!
 
+Make sure that your client has `mistral-common` with audio installed:
+
+```sh
+pip install --upgrade mistral_common[audio]
+```
+
 <details>
   <summary>Python snippet</summary>
 
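
Reviewer note: the prerequisite this commit adds to both sections can be checked on the client before running the snippets. The sketch below is not part of the commit. It assumes that the `audio` extra of `mistral_common` installs the `soundfile` and `soxr` packages; the module list and the helper name `missing_modules` are illustrative, so adjust them if your installed version declares different dependencies.

```python
from importlib.util import find_spec

# Assumption: the `audio` extra pulls in `soundfile` and `soxr`.
# This list is illustrative and is not taken from the commit itself.
AUDIO_MODULES = ("mistral_common", "soundfile", "soxr")


def missing_modules(names=AUDIO_MODULES):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
        # Quoting the requirement keeps shells such as zsh from
        # treating the square brackets as a glob pattern.
        print('Try: pip install --upgrade "mistral_common[audio]"')
    else:
        print("mistral_common and the assumed audio dependencies are importable.")
```

The check only looks up top-level module names and does not import them, so it is cheap to run and has no side effects.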