mengfanxu

fxmeng

https://fxmeng.github.io

AI & ML interests

None yet

Recent Activity

liked a dataset 13 days ago

nvidia/Llama-Nemotron-VLM-Dataset-v1

authored a paper 13 days ago

TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill & Decode Inference

commented on a paper 14 days ago

TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill & Decode Inference

Organizations

None yet

commented on a paper 14 days ago

TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill & Decode Inference

Paper • 2508.15881 • Published 18 days ago • 8

commented on 3 papers 7 months ago

TransMLA: Multi-head Latent Attention Is All You Need

Paper • 2502.07864 • Published Feb 11 • 57

TransMLA: Multi-head Latent Attention Is All You Need

Paper • 2502.07864 • Published Feb 11 • 57

TransMLA: Multi-head Latent Attention Is All You Need

Paper • 2502.07864 • Published Feb 11 • 57

New activity in MMMU/MMMU almost 2 years ago

Question about "Text as Input"

#4 opened almost 2 years ago by