Base Model for TransMLA
mengfanxu
fxmeng
AI & ML interests
None yet
Recent Activity
liked
a dataset
about 1 month ago
nvidia/Llama-Nemotron-VLM-Dataset-v1
authored
a paper
about 1 month ago
TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill & Decode Inference
commented on
a paper
about 1 month ago
TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill & Decode Inference
Organizations
None yet