vllm (pretrained=/root/autodl-tmp/cogito-v1-preview-qwen-32B,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter           | n-shot | Metric        | Value | Stderr   |
|-------|---------|------------------|--------|---------------|-------|----------|
| gsm8k | 3       | flexible-extract | 5      | exact_match ↑ | 0.900 | ± 0.0190 |
|       |         | strict-match     | 5      | exact_match ↑ | 0.948 | ± 0.0141 |

vllm (pretrained=/root/autodl-tmp/cogito-v1-preview-qwen-32B,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter           | n-shot | Metric        | Value | Stderr   |
|-------|---------|------------------|--------|---------------|-------|----------|
| gsm8k | 3       | flexible-extract | 5      | exact_match ↑ | 0.894 | ± 0.0138 |
|       |         | strict-match     | 5      | exact_match ↑ | 0.930 | ± 0.0114 |
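As a sanity check on the reported error bars, the Stderr column above is consistent with the standard error of a binomial proportion, `sqrt(p * (1 - p) / n)`, evaluated at the measured accuracy and the sample limit. A minimal sketch (the function name is illustrative, not part of the harness):

```python
import math

def binomial_stderr(p: float, n: int) -> float:
    """Standard error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# gsm8k flexible-extract at limit 250, accuracy 0.900
print(round(binomial_stderr(0.900, 250), 4))  # 0.019

# gsm8k flexible-extract at limit 500, accuracy 0.894
print(round(binomial_stderr(0.894, 500), 4))  # 0.0138
```

Both values match the ± figures in the tables, so the reported uncertainty is what you would expect from the sample sizes alone.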

vllm (pretrained=/root/autodl-tmp/cogito-v1-preview-qwen-32B,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 5.0, num_fewshot: None, batch_size: 1

| Groups            | Version | Filter | n-shot | Metric | Value  | Stderr   |
|-------------------|---------|--------|--------|--------|--------|----------|
| mmlu              | 2       | none   |        | acc ↑  | 0.8947 | ± 0.0175 |
| - humanities      | 2       | none   |        | acc ↑  | 0.9231 | ± 0.0308 |
| - other           | 2       | none   |        | acc ↑  | 0.8769 | ± 0.0407 |
| - social sciences | 2       | none   |        | acc ↑  | 0.9167 | ± 0.0354 |
| - stem            | 2       | none   |        | acc ↑  | 0.8737 | ± 0.0324 |

vllm (pretrained=/root/autodl-tmp/cogito-v1-preview-qwen-32B-awq,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter           | n-shot | Metric        | Value | Stderr   |
|-------|---------|------------------|--------|---------------|-------|----------|
| gsm8k | 3       | flexible-extract | 5      | exact_match ↑ | 0.924 | ± 0.0168 |
|       |         | strict-match     | 5      | exact_match ↑ | 0.936 | ± 0.0155 |

vllm (pretrained=/root/autodl-tmp/cogito-v1-preview-qwen-32B-awq,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter           | n-shot | Metric        | Value | Stderr   |
|-------|---------|------------------|--------|---------------|-------|----------|
| gsm8k | 3       | flexible-extract | 5      | exact_match ↑ | 0.920 | ± 0.0121 |
|       |         | strict-match     | 5      | exact_match ↑ | 0.934 | ± 0.0111 |

vllm (pretrained=/root/autodl-tmp/cogito-v1-preview-qwen-32B-awq,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 5.0, num_fewshot: None, batch_size: 1

| Groups            | Version | Filter | n-shot | Metric | Value  | Stderr   |
|-------------------|---------|--------|--------|--------|--------|----------|
| mmlu              | 2       | none   |        | acc ↑  | 0.8982 | ± 0.0170 |
| - humanities      | 2       | none   |        | acc ↑  | 0.8769 | ± 0.0377 |
| - other           | 2       | none   |        | acc ↑  | 0.8769 | ± 0.0407 |
| - social sciences | 2       | none   |        | acc ↑  | 0.9500 | ± 0.0289 |
| - stem            | 2       | none   |        | acc ↑  | 0.8947 | ± 0.0288 |
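The header lines above each table record the exact evaluation configuration. A run like the gsm8k one on the AWQ model could be reproduced with an lm-evaluation-harness invocation along these lines (a sketch, assuming the `lm_eval` CLI with its vLLM backend; the local model path is the one from the headers):

```shell
lm_eval --model vllm \
  --model_args pretrained=/root/autodl-tmp/cogito-v1-preview-qwen-32B-awq,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4 \
  --tasks gsm8k \
  --num_fewshot 5 \
  --limit 500 \
  --batch_size auto
```

The MMLU runs differ only in `--tasks mmlu`, `--limit 5`, `--batch_size 1`, and omitting `--num_fewshot`.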
Safetensors: 5.73B params · tensor types: I32, BF16, FP16

Model tree for noneUsername/cogito-v1-preview-qwen-32B-awq

Base model: Qwen/Qwen2.5-32B → quantized (this model, one of 19 quantized variants)