Kiran Kamble

kiranr

ki6an

AI & ML interests

NLP, LLM

Recent Activity

liked a dataset about 4 hours ago

lmarena-ai/arena-human-preference-140k

liked a model 18 days ago

zai-org/GLM-4.5-Base

upvoted an article 25 days ago

How Long Prompts Block Other Requests - Optimizing LLM Performance

upvoted an article 25 days ago

How Long Prompts Block Other Requests - Optimizing LLM Performance

Article • Jun 12 • 5

upvoted a paper about 1 month ago

Data Efficacy for Language Model Training

Paper • 2506.21545 • Published Jun 26 • 11

upvoted a paper 2 months ago

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30 • 268

upvoted a paper 6 months ago

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published Feb 10 • 132

upvoted 2 papers 7 months ago

Baichuan-Omni-1.5 Technical Report

Paper • 2501.15368 • Published Jan 26 • 64

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published Jan 26 • 72

upvoted a paper 12 months ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 143

upvoted an article 12 months ago

Using Writer Framework with Hugging Face Spaces

Article • Aug 20, 2024 • 30

upvoted 3 papers 12 months ago

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 64

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 43

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Paper • 2408.08152 • Published Aug 15, 2024 • 60

upvoted a collection about 1 year ago

DCLM

Collection

DCLM Models + Datasets • 6 items • Updated Oct 4, 2024 • 26

upvoted 6 papers over 1 year ago

ReALM: Reference Resolution As Language Modeling

Paper • 2403.20329 • Published Mar 29, 2024 • 23

Linear Transformers with Learnable Kernel Functions are Better In-Context Models

Paper • 2402.10644 • Published Feb 16, 2024 • 82

Speculative Streaming: Fast LLM Inference without Auxiliary Models

Paper • 2402.11131 • Published Feb 16, 2024 • 44

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

Paper • 2402.04291 • Published Feb 6, 2024 • 51

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 127

Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Paper • 2402.04248 • Published Feb 6, 2024 • 33

upvoted 2 collections over 1 year ago

Papers about model merging

Collection

referenced in the mergekit repo: https://github.com/cg123/mergekit • 4 items • Updated Feb 13, 2024 • 14

Llamafied Yi

Collection

Yi base models converted to Llama architecture. • 4 items • Updated Nov 14, 2023 • 9
