

min

qiyang-attn

velconia

AI & ML interests

GNN, LLM, Generative Models, MultiModal, Recommendation Models

Organizations

None yet

qiyang-attn's activity

upvoted a paper 4 days ago

Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts

Paper • 2503.16057 • Published 4 days ago • 14

authored a paper 5 days ago

Frac-Connections: Fractional Extension of Hyper-Connections

Paper • 2503.14125 • Published 7 days ago • 19

authored a paper about 2 months ago

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Paper • 2501.16975 • Published Jan 28 • 27

authored a paper 4 months ago

Ultra-Sparse Memory Network

Paper • 2411.12364 • Published Nov 19, 2024 • 23

upvoted 2 papers 4 months ago

Ultra-Sparse Memory Network

Paper • 2411.12364 • Published Nov 19, 2024 • 23

Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models

Paper • 2411.03884 • Published Nov 6, 2024 • 28

authored a paper 6 months ago

Hyper-Connections

Paper • 2409.19606 • Published Sep 29, 2024 • 23

updated a collection about 1 year ago

LLM-LongContext-Compression

Papers on LLM long-context compression. Reading details: https://www.notion.so/LLM-LongContext-Compression-323cc6da39124c3a97d3502e1bf61b7 • 15 items • Updated Mar 13, 2024 • 1