xiaoqijian

mx1024

AI & ML interests

None yet

Recent Activity

authored 2 papers 28 days ago

Stress Testing Generalization: How Minor Modifications Undermine Large Language Model Performance

Paper • 2502.12459 • Published Feb 18

Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design

Paper • 2506.04734 • Published about 1 month ago • 19

upvoted a paper 28 days ago

Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design

Paper • 2506.04734 • Published about 1 month ago • 19

upvoted a collection 2 months ago

Qwen3

72 items • Updated 20 days ago • 824

upvoted a collection 4 months ago

TinyR1

2 items • Updated Apr 21 • 3

commented on Open R1: Update #3 4 months ago

How is packing implemented in your code? Have you tried using a 4D attention mask to avoid the overlap between samples that you mentioned?

upvoted an article 4 months ago

Open R1: Update #3

Article • Mar 11 • 294

liked a model 4 months ago

qihoo360/TinyR1-32B-Preview

Text Generation • 33B • Updated Apr 16 • 4.1k • 328

liked a model about 1 year ago

qihoo360/360Zhinao-7B-Base

Text Generation • 8B • Updated Apr 16, 2024 • 128 • 5