Fanqi Wan

Wanfq

https://fanqiwan.github.io/

AI & ML interests

Large Language Models, Model Fusion, Reasoning, Alignment

New activity in stepfun-ai/Step-3.5-Flash-SFT 4 months ago

⚠️ Benchmark Leaks

#12 opened 4 months ago by

New activity in Tongyi-Zhiwen/QwenLong-L1-32B about 1 year ago

Is something wrong with the chat_template?

#5 opened about 1 year ago by

Add link to project page

#6 opened about 1 year ago by

provide int4 version pls

#2 opened about 1 year ago by

This model is not trained for function calling, right?

#1 opened about 1 year ago by

Update library_name to transformers

#4 opened about 1 year ago by

commented a paper about 1 year ago

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 89

New activity in Wanfq/Explore-LM-Ext-7B-Math about 1 year ago

Adding `safetensors` variant of this model

#1 opened about 1 year ago by

New activity in FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Preview over 1 year ago

QwQ-32B

#8 opened over 1 year ago by

New activity in FuseAI/FuseChat-Llama-3.1-8B-Instruct over 1 year ago

Add missing metadata: library_name, pipeline_tag, license

#2 opened over 1 year ago by

commented a paper over 1 year ago

FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion

Paper • 2503.04222 • Published Mar 6, 2025 • 15

New activity in FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Preview over 1 year ago

What is the context length?

#6 opened over 1 year ago by AlgorithmicKing

New activity in FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview over 1 year ago

Model Issue

#1 opened over 1 year ago by

New activity in FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview over 1 year ago

Temperature's effect on the performance of long chain reasoning models. Why was 0.7 used for the evals?

#6 opened over 1 year ago by

New activity in FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Preview over 1 year ago

DeepSeek-R1-UD-IQ1_S merge

#3 opened over 1 year ago by

Tool use

#4 opened over 1 year ago by

New activity in FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview over 1 year ago

Broken template for this version?

#2 opened over 1 year ago by

New activity in FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Preview over 1 year ago

Question about replicating the merges

#2 opened over 1 year ago by

Flash

#1 opened over 1 year ago by

New activity in FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview over 1 year ago

License of your model

#4 opened over 1 year ago by