Yuxian Gu

t1101675

AI & ML interests

Efficient methods for language models

Organizations

Conversational AI (CoAI) group from Tsinghua University, Efficient-Large-Model, MiniLLM, Data Selection, VILA / Molmo

t1101675's activity

New activity in MiniLLM/MiniLLM-gpt2-340M about 1 month ago
New activity in MiniLLM/SFT-gpt2-120M about 1 month ago
New activity in MiniLLM/SFT-gpt2-760M about 1 month ago
New activity in Data-Selection/PDS-470M about 1 month ago
New activity in Data-Selection/PDS-160M about 1 month ago

Add link to paper

#2 opened about 1 month ago by nielsr
New activity in Data-Selection/PDS-470M about 1 month ago
New activity in Data-Selection/PDS-1B about 1 month ago

Add link to code repository

#2 opened about 1 month ago by nielsr
New activity in Data-Selection/PDS-1.7B about 1 month ago
New activity in Data-Selection/BSL-1.7B about 1 month ago

Add link to code

#2 opened about 1 month ago by nielsr
New activity in MiniLLM/MiniPLM-Qwen-500M about 1 month ago
New activity in MiniLLM/MiniPLM-llama3.1-212M about 1 month ago
New activity in MiniLLM/MiniPLM-Mamba-130M about 1 month ago

Improve MiniPLM-Mamba-130M model card

#1 opened about 1 month ago by nielsr
New activity in MiniLLM/MiniPLM-Qwen-1.2B about 1 month ago

Add link to code

#1 opened about 1 month ago by nielsr
New activity in MiniLLM/Ref-Pretrain-Qwen-104M about 1 month ago

Add link to code

#1 opened about 1 month ago by nielsr
New activity in MiniLLM/Pretrain-Qwen-1.2B about 1 month ago

Add link to code

#1 opened about 1 month ago by nielsr
New activity in MiniLLM/Pretrain-Qwen-500M about 1 month ago

No changes needed

#1 opened about 1 month ago by nielsr
New activity in MiniLLM/Pretrain-Qwen-200M about 1 month ago

Add link to code

#1 opened about 1 month ago by nielsr
New activity in MiniLLM/VanillaKD-Pretrain-Qwen-200M about 1 month ago

Add link to code and base model tag

#1 opened about 1 month ago by nielsr
New activity in MiniLLM/VanillaKD-Pretrain-Qwen-500M about 1 month ago

Add link to code

#1 opened about 1 month ago by nielsr