Mann Patel

manncodes

AI & ML interests

NLP, Mech Interp, Reasoning, MLSystems

Recent Activity

upvoted an article 4 days ago

KV Cache from scratch in nanoVLM

liked a dataset 5 days ago

tm21cy/NYT-Connections

liked a model 15 days ago

nvidia/AceReason-Nemotron-14B


Organizations

None yet

manncodes's activity

upvoted an article 4 days ago

Article

KV Cache from scratch in nanoVLM

and 4 others • 5 days ago • 58

upvoted a paper 15 days ago

AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning

Paper • 2505.16400 • Published 18 days ago • 30

upvoted an article about 2 months ago

Article

🪆 Introduction to Matryoshka Embedding Models

and 2 others • Feb 23, 2024 • 126

upvoted 2 articles 2 months ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

and 3 others • Mar 12 • 427

Article

Distributed Training with JAX and Flax NNX: A Practical Guide to Sharding

Mar 26 • 7

upvoted a paper 3 months ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 280

upvoted a paper 6 months ago

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25, 2024 • 80

upvoted a paper 9 months ago

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 147

upvoted a collection over 1 year ago

Model Merging

Collection

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 238

upvoted 2 papers over 1 year ago

TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 70

AppAgent: Multimodal Agents as Smartphone Users

Paper • 2312.13771 • Published Dec 21, 2023 • 55