- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 77
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 107
- OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
  Paper • 2404.14619 • Published • 126
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 257
HAN JUNGU
JUNGU
AI & ML interests
None yet
Recent Activity
upvoted an article 13 days ago
Gaia2 and ARE: Empowering the community to study agents
liked a Space about 1 month ago
mteb/leaderboard
upvoted a paper about 1 month ago
rStar2-Agent: Agentic Reasoning Technical Report