Article: KV Caching Explained: Optimizing Transformer Inference Efficiency, by not-lain
Collection: Open LLM Leaderboard best models ❤️🔥 (a daily uploaded list of models with the best evaluations on the LLM leaderboard, 64 items)
Paper: Lost in the Middle: How Language Models Use Long Contexts (arXiv 2307.03172, published Jul 6, 2023)