BinghengWu

wubingheng

https://github.com/wubingheng111

AI & ML interests

I like to fine-tune the small models of the Doge series.

Recent Activity

authored a paper about 1 month ago

Trainable Dynamic Mask Sparse Attention

upvoted an article about 1 month ago

Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models

published an article about 1 month ago

Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models

upvoted an article about 1 month ago

Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models

Article • Aug 5 • 6

upvoted a collection about 1 month ago

🧐Small-Papers

Technical support for the SmallDoges series models. • 2 items • Updated Aug 5 • 2

upvoted a paper about 1 month ago

Trainable Dynamic Mask Sparse Attention

Paper • 2508.02124 • Published Aug 4 • 16

upvoted a collection about 2 months ago

🧠 SmolLM3

Smol, multilingual, long-context reasoner • 12 items • Updated Aug 5 • 72

upvoted a collection 8 months ago

Doge

Doge family of small language models. • 12 items • Updated Mar 28 • 6

upvoted a paper 9 months ago

Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture

Paper • 2412.11834 • Published Dec 16, 2024 • 8