---
base_model:
- mistralai/Mistral-Small-3.2-24B-Instruct-2506
language:
- en
- fr
- de
- es
- pt
- it
- ja
- ko
- ru
- zh
- ar
- fa
- id
- ms
- ne
- pl
- ro
- sr
- sv
- tr
- uk
- vi
- hi
- bn
library_name: vllm
license: apache-2.0
inference: false
extra_gated_description: >-
  If you want to learn more about how we process your personal data, please read
  our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
tags:
- vllm
- mistral-common
---

# Magistral Small 1.2

Magistral Small 1.2 builds upon [Mistral Small 3.2 (2506)](https://huggingface.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506) **with added reasoning capabilities**: SFT on traces from Magistral Medium followed by RL on top yields a small, efficient reasoning model with 24B parameters.

Magistral Small can be deployed locally, fitting on a single RTX 4090 or in a MacBook with 32GB of RAM once quantized.

Learn more about Magistral in our [blog post](https://mistral.ai/news/magistral/).

The model was presented in the paper [Magistral](https://huggingface.co/papers/2506.10910).

## Updates compared with [Magistral Small 1.1](https://huggingface.co/mistralai/Magistral-Small-2507)

- **Multimodality**: The model now has a vision encoder and can take multimodal inputs, extending its reasoning capabilities to vision.
- **Performance upgrade**: Magistral Small 1.2 should give you significantly better performance than Magistral Small 1.1 as seen in the [benchmark results](#benchmark-results).
- **Better tone and persona**: You should experience better LaTeX and Markdown formatting, and shorter answers on easy general prompts.
- **Finite generation**: The model is less likely to enter infinite generation loops.
- **Special think tokens**: `[THINK]` and `[/THINK]` special tokens encapsulate the reasoning content in a thinking chunk. This makes it easier to parse the reasoning trace and prevents confusion when the `<think>` token is given as a string in the prompt.
- **Reasoning prompt**: The reasoning prompt is given in the system prompt.

## Key Features

- **Reasoning:** Capable of producing long reasoning traces before providing an answer.
- **Multilingual:** Supports dozens of languages, including English, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, and Farsi.
- **Vision**: Vision capabilities enable the model to analyze images and reason based on visual content in addition to text.
- **Apache 2.0 License:** Open license allowing usage and modification for both commercial and non-commercial purposes.
- **Context Window:** A 128k context window. Performance *might* degrade past **40k**, but Magistral should still give good results. We therefore recommend leaving the maximum model length at 128k and lowering it only if you encounter performance issues (see the sketch just below).
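
For illustration, here is one way to pin the context length when loading the model through vLLM's offline `LLM` class; this is a minimal sketch mirroring the serve flags used later in this card, and `131072` simply spells out the recommended 128k default:

```python
# Minimal sketch (vLLM offline API): keep the full 128k window by default
# and lower `max_model_len` only if you observe degraded outputs.
from vllm import LLM

llm = LLM(
    model="mistralai/Magistral-Small-2509",
    tokenizer_mode="mistral",
    config_format="mistral",
    load_format="mistral",
    max_model_len=131072,  # recommended default (128k); lower only as a last resort
)
```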

## Benchmark Results

| Model                    | AIME24 pass@1 | AIME25 pass@1 | GPQA Diamond | LiveCodeBench (v5) |
|--------------------------|---------------|---------------|--------------|--------------------|
| **Magistral Medium 1.2** | **91.82%**    | **83.48%**    | **76.26%**   | **75.00%**         |
| Magistral Medium 1.1     | 72.03%        | 60.99%        | 71.46%       | 59.35%             |
| Magistral Medium 1.0     | 73.59%        | 64.95%        | 70.83%       | 59.36%             |
| **Magistral Small 1.2**  | **86.14%**    | **77.34%**    | **70.07%**   | **70.88%**         |
| Magistral Small 1.1      | 70.52%        | 62.03%        | 65.78%       | 59.17%             |
| Magistral Small 1.0      | 70.68%        | 62.76%        | 68.18%       | 55.84%             |

## Sampling Parameters

Please make sure to use: 
- `top_p`: 0.95
- `temperature`: 0.7
- `max_tokens`: 131072
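
For reference, here is a minimal request that applies these exact parameters against an OpenAI-compatible endpoint (the URL and prompt are placeholders; complete examples follow in the [Usage](#usage) section):

```python
# Minimal sketch: the recommended sampling parameters on an
# OpenAI-compatible server (e.g. a local vLLM instance).
from openai import OpenAI

client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")

response = client.chat.completions.create(
    model="mistralai/Magistral-Small-2509",
    messages=[{"role": "user", "content": "How many r's are in 'strawberry'?"}],
    temperature=0.7,
    top_p=0.95,
    max_tokens=131072,
)
print(response.choices[0].message.content)
```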

## Basic Chat Template

We highly recommend including the following system prompt for the best results; you can edit and customise it for your specific use case if needed.

```
First draft your thinking process (inner monologue) until you arrive at a response. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input.

Your thinking process must follow the template below:[THINK]Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response. Use the same language as the input.[/THINK]Here, provide a self-contained response.
```

The `[THINK]` and `[/THINK]` markers are special tokens that **must** be encoded as such, not as plain strings.

***Please make sure to use [mistral-common](https://github.com/mistralai/mistral-common) as the source of truth***. Find [below](#usage) examples from libraries supporting `mistral-common`.
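
As a quick sanity check, you can tokenize a conversation with `mistral-common` and confirm that the think markers are control tokens rather than plain text. A minimal sketch, assuming `mistral_common >= 1.8.5` (the entry points below are, to the best of our knowledge, `mistral-common`'s standard API):

```python
# Minimal sketch: verify that [THINK] is a single control token, not plain text.
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

tokenizer = MistralTokenizer.from_hf_hub("mistralai/Magistral-Small-2509")

# [THINK] maps to a single id in the vocabulary.
think_id = tokenizer.instruct_tokenizer.tokenizer.get_control_token("[THINK]")
print("[THINK] token id:", think_id)

# Encoding a request yields the token ids the model actually sees.
tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(messages=[UserMessage(content="What is 6 x 7?")])
)
print(len(tokenized.tokens), "prompt tokens")
```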

Depending on your use case and requirements, you can choose between keeping reasoning traces during multi-turn interactions and keeping only the final assistant response.
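
If you keep only the final response, one simple pattern against an OpenAI-compatible server is to append just the `content` field back into the history (vLLM returns the trace separately as `reasoning_content`). The helper below is a hypothetical sketch, not part of any library API:

```python
# Hypothetical helper: keep only the assistant's final answer in the
# running history, dropping the reasoning trace between turns.
def append_assistant_turn(messages: list[dict], completion) -> list[dict]:
    final = completion.choices[0].message
    messages.append({"role": "assistant", "content": final.content})
    return messages
```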


## Usage

The model can be used with the following frameworks.

### Inference

- [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [below](#vllm-recommended)
- [`transformers`](https://github.com/huggingface/transformers): See [below](#transformers)
- [`llama.cpp`](https://github.com/ggml-org/llama.cpp): See https://huggingface.co/mistralai/Magistral-Small-2509-GGUF
- [`Unsloth GGUFs`](https://huggingface.co/unsloth): See https://huggingface.co/unsloth/Magistral-Small-2509-GGUF
- [`Kaggle`](https://www.kaggle.com/models): See https://www.kaggle.com/models/mistral-ai/magistral-small-2509
- [`LM Studio`](https://lmstudio.ai/models): See https://lmstudio.ai/models/mistralai/magistral-small-2509


### Fine-tuning

- [`Axolotl`](https://axolotl.ai/): See https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/magistral
- [`Unsloth`](https://unsloth.ai/): See https://docs.unsloth.ai/models/tutorials-how-to-fine-tune-and-run-llms/magistral-how-to-run-and-fine-tune


### vLLM (recommended)

We recommend using this model with the [vLLM library](https://github.com/vllm-project/vllm)
to implement production-ready inference pipelines.

**_Installation_**

Make sure you install the latest [`vLLM`](https://github.com/vllm-project/vllm/) code:

```sh
pip install --upgrade vllm
```

Doing so should automatically install [`mistral_common >= 1.8.5`](https://github.com/mistralai/mistral-common/releases/tag/v1.8.5).

To check:
```sh
python -c "import mistral_common; print(mistral_common.__version__)"
```

You can also use a ready-to-go [docker image](https://github.com/vllm-project/vllm/blob/main/Dockerfile) or one from [Docker Hub](https://hub.docker.com/layers/vllm/vllm-openai/latest/images/sha256-de9032a92ffea7b5c007dad80b38fd44aac11eddc31c435f8e52f3b7404bbf39).


Serve the model as follows:

```sh
vllm serve mistralai/Magistral-Small-2509 \
  --reasoning-parser mistral \
  --tokenizer_mode mistral --config_format mistral \
  --load_format mistral --tool-call-parser mistral \
  --enable-auto-tool-choice --limit-mm-per-prompt '{"image":10}' \
  --tensor-parallel-size 2
```

Query the model as follows:

<details>
<summary>Python text snippet</summary>

```python
from typing import Any
from openai import OpenAI
from huggingface_hub import hf_hub_download

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

TEMP = 0.7
TOP_P = 0.95
MAX_TOK = 131072

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id

def load_system_prompt(repo_id: str, filename: str) -> dict[str, Any]:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()

    index_begin_think = system_prompt.find("[THINK]")
    index_end_think = system_prompt.find("[/THINK]")

    return {
        "role": "system",
        "content": [
            {"type": "text", "text": system_prompt[:index_begin_think]},
            {
                "type": "thinking",
                "thinking": system_prompt[
                    index_begin_think + len("[THINK]") : index_end_think
                ],
                "closed": True,
            },
            {
                "type": "text",
                "text": system_prompt[index_end_think + len("[/THINK]") :],
            },
        ],
    }

SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")

query = "Use each number in 2,5,6,3 exactly once, along with any combination of +, -, ×, ÷ (and parentheses for grouping), to make the number 24."

messages = [
    SYSTEM_PROMPT,
    {"role": "user", "content": query}
]
stream = client.chat.completions.create(
  model=model,
  messages=messages,
  stream=True,
  temperature=TEMP,
  top_p=TOP_P,
  max_tokens=MAX_TOK,
)

print("client: Start streaming chat completions...:\n")
printed_reasoning_content = False
answer = []

for chunk in stream:
    reasoning_content = None
    content = None
    # Determine whether this delta carries reasoning content or answer content
    if hasattr(chunk.choices[0].delta, "reasoning_content"):
        reasoning_content = chunk.choices[0].delta.reasoning_content
    elif hasattr(chunk.choices[0].delta, "content"):
        content = chunk.choices[0].delta.content

    if reasoning_content is not None:
        if not printed_reasoning_content:
            printed_reasoning_content = True
            print("Start reasoning:\n", end="", flush=True)
        print(reasoning_content, end="", flush=True)
    elif content is not None:
        # Accumulate and print the answer content
        if not reasoning_content and printed_reasoning_content:
            answer.extend(content)
        print(content, end="", flush=True)

if answer:
    print("\n\n=============\nAnswer\n=============\n")
    print("".join(answer))
else:
    print("\n\n=============\nNo Answer\n=============\n")
    print("No answer was generated by the model, probably because the maximum number of tokens was reached.")

# client: Start streaming chat completions...:
#
# Start reasoning:
# First, I need to ...
# ...
#
#
# =============
# Answer
# =============
# 
# Here's one way to use the numbers 2, 5, 6, 3 to make 24:
#
#\[
#(6 \div 2) \times (5 + 3) = 3 \times 8 = 24
#\]
#
#Alternatively, another solution is:
#
#\[
#6 \times (5 - 3 + 2) = 6 \times 4 = 24
#\]
#
#Both expressions use each of the numbers 2, 5, 6, 3 exactly once with the operations given.
```

</details>

<details>
<summary>Python text-image snippet: Pokemon</summary>

```python
from typing import Any

from openai import OpenAI
from huggingface_hub import hf_hub_download

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

TEMP = 0.7
TOP_P = 0.95
MAX_TOK = 131072

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id


def load_system_prompt(repo_id: str, filename: str) -> dict[str, Any]:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()

    index_begin_think = system_prompt.find("[THINK]")
    index_end_think = system_prompt.find("[/THINK]")

    return {
        "role": "system",
        "content": [
            {"type": "text", "text": system_prompt[:index_begin_think]},
            {
                "type": "thinking",
                "thinking": system_prompt[
                    index_begin_think + len("[THINK]") : index_end_think
                ],
                "closed": True,
            },
            {
                "type": "text",
                "text": system_prompt[index_end_think + len("[/THINK]") :],
            },
        ],
    }


model_id = "mistralai/Magistral-Small-2509"
SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")

image_url = "https://static.wikia.nocookie.net/essentialsdocs/images/7/70/Battle.png/revision/latest?cb=20220523172438"

messages = [
    SYSTEM_PROMPT,
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What action do you think I should take in this situation? List all the possible actions and explain why you think they are good or bad.",
            },
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    },
]


stream = client.chat.completions.create(
    model=model,
    messages=messages,
    stream=True,
    temperature=TEMP,
    top_p=TOP_P,
    max_tokens=MAX_TOK,
)

print("client: Start streaming chat completions...:\n")
printed_reasoning_content = False
answer = []

for chunk in stream:
    reasoning_content = None
    content = None
    # Determine whether this delta carries reasoning content or answer content
    if hasattr(chunk.choices[0].delta, "reasoning_content"):
        reasoning_content = chunk.choices[0].delta.reasoning_content
    elif hasattr(chunk.choices[0].delta, "content"):
        content = chunk.choices[0].delta.content

    if reasoning_content is not None:
        if not printed_reasoning_content:
            printed_reasoning_content = True
            print("Start reasoning:\n", end="", flush=True)
        print(reasoning_content, end="", flush=True)
    elif content is not None:
        # Accumulate and print the answer content
        if not reasoning_content and printed_reasoning_content:
            answer.extend(content)
        print(content, end="", flush=True)

if answer:
    print("\n\n=============\nAnswer\n=============\n")
    print("".join(answer))
else:
    print("\n\n=============\nNo Answer\n=============\n")
    print(
        "No answer was generated by the model, probably because the maximum number of tokens was reached."
    )

# client: Start streaming chat completions...:

# Start reasoning:
# In the image, we see a battle scene from a Pokémon game. The player's Pikachu is at full health (83/83 HP), and the opponent's Pidgey is at a lower level (level 17 compared to Pikachu's level 42). The possible actions available to the player are:

# 1. FIGHT: This allows the player to use one of Pikachu's moves to attack Pidgey. Given that Pikachu is at a higher level and has full HP, it is likely that Pikachu would be able to defeat Pidgey easily. This is a good option because it could potentially win the battle quickly and efficiently.

# 2. BAG: This allows the player to use an item from their bag. This could be useful if the player wants to heal Pikachu (though it's not necessary at full health) or use an item to weaken Pidgey. However, since Pikachu is at full health and Pidgey is at a lower level, this might not be necessary. It could be a good option if the player wants to use a special item, but generally, it might not be the best choice in this situation.

# 3. POKÉMON: This allows the player to switch the current Pokémon to another one in their team. Since Pikachu is at full health and at a higher level than Pidgey, switching might not be necessary. It could be useful if the player wants to train a different Pokémon, but it might not be the most efficient choice for winning the battle quickly.

# 4. RUN: This allows the player to flee from the battle. This could be a good option if the player wants to avoid the battle, but since Pikachu is at a clear advantage, running would not be the most efficient choice. It could be useful if the player wants to save time or if they are trying to avoid losing a Pokémon, but in this case, it seems unnecessary.

# Given the circumstances, the best action seems to be to FIGHT, as Pikachu is at a clear advantage in terms of level and health. The other options are not as efficient for winning the battle quickly.In the given scenario, the most appropriate action to take is to FIGHT. Here's why:

# 1. FIGHT: This is the best option because Pikachu is at a higher level and has full health, making it likely to defeat Pidgey quickly and efficiently. Using an attack move would be the most straightforward way to win the battle.

# 2. BAG: While this option could be useful for healing or using special items, it is not necessary since Pikachu is already at full health. This option is less efficient for winning the battle quickly.

# 3. POKÉMON: Switching to another Pokémon might be useful for training a different Pokémon, but it is not necessary since Pikachu is at a clear advantage. This option is not as efficient for winning the current battle.

# 4. RUN: Fleeing from the battle could be useful if the player wants to avoid the battle, but since Pikachu is at a clear advantage, running would not be the most efficient choice. It could be useful if the player wants to save time or avoid losing a Pokémon, but in this case, it seems unnecessary.

# Therefore, the best action to take in this situation is to FIGHT.

# FIGHT

# =============
# Answer
# =============

# In the given scenario, the most appropriate action to take is to FIGHT. Here's why:

# 1. FIGHT: This is the best option because Pikachu is at a higher level and has full health, making it likely to defeat Pidgey quickly and efficiently. Using an attack move would be the most straightforward way to win the battle.

# 2. BAG: While this option could be useful for healing or using special items, it is not necessary since Pikachu is already at full health. This option is less efficient for winning the battle quickly.

# 3. POKÉMON: Switching to another Pokémon might be useful for training a different Pokémon, but it is not necessary since Pikachu is at a clear advantage. This option is not as efficient for winning the current battle.

# 4. RUN: Fleeing from the battle could be useful if the player wants to avoid the battle, but since Pikachu is at a clear advantage, running would not be the most efficient choice. It could be useful if the player wants to save time or avoid losing a Pokémon, but in this case, it seems unnecessary.

# Therefore, the best action to take in this situation is to FIGHT.

# FIGHT
```
</details>

<details>
<summary>Python text-image snippet: Geo trivia</summary>

```python
from typing import Any

from openai import OpenAI
from huggingface_hub import hf_hub_download

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

TEMP = 0.7
TOP_P = 0.95
MAX_TOK = 131072

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id


def load_system_prompt(repo_id: str, filename: str) -> dict[str, Any]:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()

    index_begin_think = system_prompt.find("[THINK]")
    index_end_think = system_prompt.find("[/THINK]")

    return {
        "role": "system",
        "content": [
            {"type": "text", "text": system_prompt[:index_begin_think]},
            {
                "type": "thinking",
                "thinking": system_prompt[
                    index_begin_think + len("[THINK]") : index_end_think
                ],
                "closed": True,
            },
            {
                "type": "text",
                "text": system_prompt[index_end_think + len("[/THINK]") :],
            },
        ],
    }


model_id = "mistralai/Magistral-Small-2509"
SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")

image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/d7/201806_Tianducheng_Bird-eye_View.jpg/1280px-201806_Tianducheng_Bird-eye_View.jpg"

messages = [
    SYSTEM_PROMPT,
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Where has this picture been taken ?",
            },
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    },
]


stream = client.chat.completions.create(
    model=model,
    messages=messages,
    stream=True,
    temperature=TEMP,
    top_p=TOP_P,
    max_tokens=MAX_TOK,
)

print("client: Start streaming chat completions...:\n")
printed_reasoning_content = False
answer = []

for chunk in stream:
    reasoning_content = None
    content = None
    # Determine whether this delta carries reasoning content or answer content
    if hasattr(chunk.choices[0].delta, "reasoning_content"):
        reasoning_content = chunk.choices[0].delta.reasoning_content
    elif hasattr(chunk.choices[0].delta, "content"):
        content = chunk.choices[0].delta.content

    if reasoning_content is not None:
        if not printed_reasoning_content:
            printed_reasoning_content = True
            print("Start reasoning:\n", end="", flush=True)
        print(reasoning_content, end="", flush=True)
    elif content is not None:
        # Accumulate and print the answer content
        if not reasoning_content and printed_reasoning_content:
            answer.extend(content)
        print(content, end="", flush=True)

if answer:
    print("\n\n=============\nAnswer\n=============\n")
    print("".join(answer))
else:
    print("\n\n=============\nNo Answer\n=============\n")
    print(
        "No answer was generated by the model, probably because the maximum number of tokens was reached."
    )

# client: Start streaming chat completions...:

# Start reasoning:
# The image shows a replica of the Eiffel Tower, but it's not in Paris. The background includes mountains, which are not present in Paris. The surrounding architecture appears to be more modern and dense, which is also not typical of Paris. The combination of the Eiffel Tower replica and the mountainous backdrop suggests that this is likely in a city in China, as China has several replicas of the Eiffel Tower, with the most famous one being in Shanghai. However, the dense residential buildings and the specific layout suggest that this might be in another city in China, possibly Shenzhen or another major city with a similar landscape.

# Given that the question is about identifying the location based on the visual clues, and considering the presence of the Eiffel Tower replica and the mountainous backdrop, it's likely that this is a well-known location in China.

# The most probable answer is that this is in Shenzhen, as it has a well-known Eiffel Tower replica in a park, but to be precise, this is the Eiffel Tower replica in Shenzhen, which is known as the "Shenzhen Park of Eiffel Tower."

# However, to be more accurate, this is likely the Eiffel Tower replica in Shenzhen, as it matches the description and visual elements.The image shows a replica of the Eiffel Tower, which is not in Paris but rather in a city with a mountainous backdrop and modern, dense architecture. This combination of elements is typical of a Chinese city, and the presence of the Eiffel Tower replica suggests a location like Shenzhen, which is known for having such a replica. The dense residential buildings and the specific layout further support this identification. Therefore, the most probable location for this image is Shenzhen, China.

# So, the answer is:

# Shenzhen

# =============
# Answer
# =============

# The image shows a replica of the Eiffel Tower, which is not in Paris but rather in a city with a mountainous backdrop and modern, dense architecture. This combination of elements is typical of a Chinese city, and the presence of the Eiffel Tower replica suggests a location like Shenzhen, which is known for having such a replica. The dense residential buildings and the specific layout further support this identification. Therefore, the most probable location for this image is Shenzhen, China.

# So, the answer is:

# Shenzhen
```
</details>

<details>
<summary>Python text-image snippet: Maths</summary>

```python
from typing import Any

from openai import OpenAI
from huggingface_hub import hf_hub_download

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

TEMP = 0.7
TOP_P = 0.95
MAX_TOK = 131072

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id


def load_system_prompt(repo_id: str, filename: str) -> dict[str, Any]:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()

    index_begin_think = system_prompt.find("[THINK]")
    index_end_think = system_prompt.find("[/THINK]")

    return {
        "role": "system",
        "content": [
            {"type": "text", "text": system_prompt[:index_begin_think]},
            {
                "type": "thinking",
                "thinking": system_prompt[
                    index_begin_think + len("[THINK]") : index_end_think
                ],
                "closed": True,
            },
            {
                "type": "text",
                "text": system_prompt[index_end_think + len("[/THINK]") :],
            },
        ],
    }


model_id = "mistralai/Magistral-Small-2509"
SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")

image_url = "https://i.ytimg.com/vi/5Y3xLHeyKZU/hqdefault.jpg"

messages = [
    SYSTEM_PROMPT,
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Solve the equations. Answer in the language of the image.",
            },
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    },
]

stream = client.chat.completions.create(
    model=model,
    messages=messages,
    stream=True,
    temperature=TEMP,
    top_p=TOP_P,
    max_tokens=MAX_TOK,
)

print("client: Start streaming chat completions...:\n")
printed_reasoning_content = False
answer = []

for chunk in stream:
    reasoning_content = None
    content = None
    # Determine whether this delta carries reasoning content or answer content
    if hasattr(chunk.choices[0].delta, "reasoning_content"):
        reasoning_content = chunk.choices[0].delta.reasoning_content
    elif hasattr(chunk.choices[0].delta, "content"):
        content = chunk.choices[0].delta.content

    if reasoning_content is not None:
        if not printed_reasoning_content:
            printed_reasoning_content = True
            print("Start reasoning:\n", end="", flush=True)
        print(reasoning_content, end="", flush=True)
    elif content is not None:
        # Accumulate and print the answer content
        if not reasoning_content and printed_reasoning_content:
            answer.extend(content)
        print(content, end="", flush=True)

if answer:
    print("\n\n=============\nAnswer\n=============\n")
    print("".join(answer))
else:
    print("\n\n=============\nNo Answer\n=============\n")
    print(
        "No answer was generated by the model, probably because the maximum number of tokens was reached."
    )
# client: Start streaming chat completions...:

# Start reasoning:
# Je dois résoudre ce système d'équations. Voici les équations :

# 1. \(5x + 2y = -2\)
# 2. \(3x - 4y = 17\)

# D'abord, je pense que la méthode d'élimination pourrait être une bonne approche. Pour cela, je dois éliminer une des variables. Voyons comment.

# Je vais essayer d'éliminer y. Pour cela, je dois que les coefficients de y soient les mêmes (en valeur absolue) dans les deux équations.

# Le coefficient de y dans la première équation est 2, et dans la deuxième, c'est -4. Le plus petit multiple commun de 2 et 4 est 4. Donc, je vais multiplier la première équation par 2 pour que le coefficient de y devienne 4.

# Faisons cela :

# 1. \(2 \times (5x + 2y) = 2 \times (-2)\)
#    Ce qui donne : \(10x + 4y = -4\)

# Maintenant, les équations sont :

# 1. \(10x + 4y = -4\)
# 2. \(3x - 4y = 17\)

# Maintenant, si j'additionne ces deux équations, les termes avec y s'annuleront.

# Faisons l'addition :

# \( (10x + 4y) + (3x - 4y) = -4 + 17 \)

# Ce qui donne : \(13x = 13\)

# Donc, \(x = 1\).

# Maintenant que nous avons x, nous pouvons le substituer dans une des équations originales pour trouver y. Utilisons la première équation originale :

# \(5x + 2y = -2\)

# En substituant x = 1 :

# \(5(1) + 2y = -2\)

# Ce qui donne : \(5 + 2y = -2\)

# Soustraire 5 des deux côtés :

# ...

# Ce qui donne : \(5 + 2y = -2\)

# Soustraire 5 des deux côtés :

# \(2y = -2 - 5\)
# \(2y = -7\)

# Diviser par 2 :

# \(y = -\frac{7}{2}\)

# Donc, la solution est \(x = 1\) et \(y = -\frac{7}{2}\).

# $\boxed{x = 1,\ y = -\frac{7}{2}}$

# =============
# Answer
# =============

# Pour résoudre le système d'équations donné :

# 1. \(5x + 2y = -2\)
# 2. \(3x - 4y = 17\)

# Nous commençons par utiliser la méthode d'élimination pour éliminer une des variables. Nous choisissons d'éliminer \(y\) en rendant ses coefficients identiques en valeur absolue. Le coefficient de \(y\) dans la première équation est 2, et dans la deuxième, c'est -4. Le plus petit multiple commun de 2 et 4 est 4. Nous multiplions donc la première équation par 2 pour que le coefficient de \(y\) devienne 4.

# Faisons cela :

# 1. \(2 \times (5x + 2y) = 2 \times (-2)\)
#    Ce qui donne : \(10x + 4y = -4\)

# Maintenant, les équations sont :

# 1. \(10x + 4y = -4\)
# 2. \(3x - 4y = 17\)

# En additionnant ces deux équations, les termes avec \(y\) s'annuleront :

# \( (10x + 4y) + (3x - 4y) = -4 + 17 \)

# Ce qui donne : \(13x = 13\)

# Donc, \(x = 1\).

# Ensuite, nous substituons \(x = 1\) dans la première équation originale pour trouver \(y\) :

# \(5(1) + 2y = -2\)

# Ce qui donne : \(5 + 2y = -2\)

# Soustraire 5 des deux côtés :

# \(2y = -2 - 5\)
# \(2y = -7\)

# Diviser par 2 :

# \(y = -\frac{7}{2}\)

# Donc, la solution est \(x = 1\) et \(y = -\frac{7}{2}\).

# $\boxed{x = 1,\ y = -\frac{7}{2}}$
```
</details>


### Transformers

Make sure you install the latest [`Transformers`](https://github.com/huggingface/transformers/) version:

```sh
pip install --upgrade transformers[mistral-common]
```

This should also install [`mistral_common >= 1.8.5`](https://github.com/mistralai/mistral-common/releases/tag/v1.8.5).

To check:
```sh
python -c "import mistral_common; print(mistral_common.__version__)"
```

Now you can use Transformers with Magistral:

<details>
<summary>Python snippet</summary>

```python
from typing import Any
import torch

from huggingface_hub import hf_hub_download
from transformers import Mistral3ForConditionalGeneration
from transformers import AutoTokenizer


def load_system_prompt(repo_id: str, filename: str) -> dict[str, Any]:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()

    index_begin_think = system_prompt.find("[THINK]")
    index_end_think = system_prompt.find("[/THINK]")

    return {
        "role": "system",
        "content": [
            {"type": "text", "text": system_prompt[:index_begin_think]},
            {
                "type": "thinking",
                "thinking": system_prompt[
                    index_begin_think + len("[THINK]") : index_end_think
                ],
                "closed": True,
            },
            {
                "type": "text",
                "text": system_prompt[index_end_think + len("[/THINK]") :],
            },
        ],
    }


model_id = "mistralai/Magistral-Small-2509"

tokenizer = AutoTokenizer.from_pretrained(model_id, tokenizer_type="mistral")
model = Mistral3ForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
).eval()

SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")
image_url = "https://static.wikia.nocookie.net/essentialsdocs/images/7/70/Battle.png/revision/latest?cb=20220523172438"
messages = [
    SYSTEM_PROMPT,
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What action do you think I should take in this situation? List all the possible actions and explain why you think they are good or bad.",
            },
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    },
]

tokenized = tokenizer.apply_chat_template(messages, return_dict=True)

input_ids = torch.tensor(tokenized.input_ids, device="cuda").unsqueeze(0)
attention_mask = torch.tensor(tokenized.attention_mask, device="cuda").unsqueeze(0)
pixel_values = torch.tensor(
    tokenized.pixel_values[0], dtype=torch.bfloat16, device="cuda"
).unsqueeze(0)
image_sizes = torch.tensor(pixel_values.shape[-2:], device="cuda").unsqueeze(0)

with torch.inference_mode():
    output = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        pixel_values=pixel_values,
        image_sizes=image_sizes,
        max_new_tokens=8192,  # without an explicit budget, generation may stop after very few tokens; adjust as needed
    )[0]


# Strip the prompt tokens and, if present, the trailing EOS token before decoding.
decoded_output = tokenizer.decode(
    output[
        len(tokenized.input_ids) : (
            -1 if output[-1] == tokenizer.eos_token_id else len(output)
        )
    ]
)
print(decoded_output)
# [THINK]Alright, let's analyze the image carefully. It's a scene from a Pokémon game. The player is controlling Pikachu, which is at level 42 with full HP (83/83). The opponent is a Pidgey at level 17. The question is asking what action the player should take in this situation.

# First, let's list all the possible actions available. From the bottom of the screen, the options are:
# 1. FIGHT
# 2. BAG
# 3. POKÉMON
# 4. RUN

# Now, let's consider each option:

# 1. **FIGHT**: This means using Pikachu's moves to attack the Pidgey.
#    - Pros: Pikachu is at a higher level (42) compared to Pidgey (17), so it has a significant advantage. Pikachu's HP is full, so it's in good condition to fight. Fighting could potentially win the battle quickly.
#    - Cons: Even though Pikachu is stronger, there's always a risk of Pidgey landing a lucky hit or using a powerful move. However, given the level difference, this is less likely.

# 2. **BAG**: This means using items from the bag to help in the battle.
#    - Pros: Could use a potion to heal (though Pikachu is already at full HP), or use another item like a Poké Ball to try and catch Pidgey.
#    - Cons: Using items might be less efficient than just fighting, especially since Pikachu is already at full health. Also, if the goal is to catch Pidgey, using items to weaken it first might be useful, but the immediate advantage isn't clear.

# 3. **POKÉMON**: This means switching to another Pokémon from the team.
#    - Pros: If the player has another Pokémon that is stronger or has moves that are super effective against Pidgey, this could be useful.
#    - Cons: Pikachu is already at a significant level advantage and is at full health, so switching might not be necessary unless there's a strategic reason (e.g., leveling up another Pokémon).

# 4. **RUN**: This means attempting to flee from the battle.
#    - Pros: If the player wants to avoid the battle for some reason (e.g., saving time, or wanting to catch Pidgey without weakening it), running could be useful.
#    - Cons: If the goal is to catch or defeat Pidgey, running would not help in achieving that. Also, running might not always be successful, especially in a grass encounter where the chance to flee might be lower.

# Now, considering the context, if the goal is to defeat Pidgey, the best action would be to FIGHT since Pikachu is at a significant advantage in terms of level and health. Using the BAG could be useful if the goal is to catch Pidgey, as items like Poké Balls or berries could be used. Switching Pokémon doesn't seem necessary given Pikachu's current status. Running would only be useful if the player wants to avoid the battle, which seems unlikely if the goal is to progress in the game.

# However, the question asks for the best action without specifying the goal. Assuming the goal is to defeat Pidgey efficiently, the best action would be to FIGHT.

# But to list all possible actions and their pros and cons:

# 1. **FIGHT**
#    - Pros: High chance of winning quickly due to level and health advantage.
#    - Cons: Small risk of Pikachu getting damaged.

# 2. **BAG**
#    - Pros: Could use items to catch Pidgey or support Pikachu (though not necessary for healing).
#    - Cons: Might be less efficient if the goal is just to defeat Pidgey.

# 3. **POKÉMON**
#    - Pros: Could switch to another Pokémon for strategic reasons (e.g., type advantage, training a different Pokémon).
#    - Cons: Pikachu is already at a strong position, so switching might be unnecessary unless there's another goal.

# 4. **RUN**
#    - Pros: Escape the battle if needed.
#    - Cons: Does not contribute to progress if the goal is to defeat or catch Pidgey.

# Given that, if the goal is to defeat Pidgey, the best action is to FIGHT. If the goal is to catch Pidgey, using the BAG to throw a Poké Ball could be the best action.[/THINK]Here are the possible actions and their explanations:

# 1. **FIGHT**
#    - **Pros**: Pikachu is at a much higher level (42) compared to Pidgey (17) and has full HP. This makes it very likely to win quickly.
#    - **Cons**: There's always a small risk that Pidgey could land a lucky hit, but this is unlikely given the level difference.

# 2. **BAG**
#    - **Pros**: You could use items to catch Pidgey (e.g., Poké Ball) or support Pikachu (though Pikachu is already at full HP).
#    - **Cons**: Using items might be less efficient if the goal is simply to defeat Pidgey, as fighting directly could be quicker.

# 3. **POKÉMON**
#    - **Pros**: You could switch to another Pokémon that might have a type advantage or that you want to train.
#    - **Cons**: Pikachu is already in a strong position to defeat Pidgey, so switching might not be necessary unless there's another strategic reason.

# 4. **RUN**
#    - **Pros**: You can escape the battle if you need to, for example, if you want to preserve Pikachu's health for a tougher battle ahead.
#    - **Cons**: Running doesn't help you progress if your goal is to defeat or catch Pidgey. Additionally, the success rate for running might be lower in a grass encounter.

# Given these considerations, if your goal is to defeat Pidgey, the best action is likely to **FIGHT**, as Pikachu is at a significant advantage. If your goal is to catch Pidgey, using the **BAG** to throw a Poké Ball could be the best choice. If you're looking to train a different Pokémon, you might consider switching with **POKÉMON**, and if you need to preserve resources or Pikachu's health, **RUN** could be an option.
```

</details>