Pasha

pashak

khosravipasha

AI & ML interests

None yet

Recent Activity


Organizations

upvoted a collection 13 days ago

Teacher Logits

Collection

Logits captured from large models to act as the teacher for distillation • 3 items • Updated 13 days ago • 7

liked a model 13 days ago

arcee-ai/Trinity-Mini

Text Generation • 26B • Updated 17 days ago • 4.39k • 163

upvoted an article about 1 month ago

Article

Continuous batching from first principles

Nov 25 • 288

liked 5 models about 1 month ago

liked a model about 2 months ago

nvidia/gpt-oss-120b-Eagle3-long-context

Text Generation • 0.2B • Updated 16 days ago • 11k • 53

liked 2 models 2 months ago

Qwen/Qwen-Image

Text-to-Image • Updated Aug 18 • 247k • 2.31k

nvidia/Llama-3.1-8B-Instruct-NVFP4

5B • Updated Sep 15 • 12.5k • 4

upvoted an article 2 months ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18, 2024 • 272

liked 2 models 3 months ago

deepseek-ai/DeepSeek-V3.2-Exp

Text Generation • 685B • Updated Nov 18 • 77.7k • 928

Qwen/Qwen3-8B

Text Generation • 8B • Updated Jul 26 • 4.36M • 833

liked a Space 3 months ago

The Ultra-Scale Playbook

🌌

3.6k

The ultimate guide to training LLM on large GPU Clusters

liked a model 3 months ago

marin-community/marin-8b-instruct

Text Generation • 8B • Updated May 19 • 858 • 26

liked a dataset 3 months ago

HuggingFaceFW/finepdfs

Viewer • Updated 26 days ago • 476M • 31k • 689

updated a model 4 months ago

pashak/Llama-3.1-8B-Instruct-Q2_K-GGUF

Text Generation • 8B • Updated Sep 4 • 6

published a model 4 months ago

pashak/Llama-3.1-8B-Instruct-Q2_K-GGUF

Text Generation • 8B • Updated Sep 4 • 6

upvoted an article 4 months ago

Article

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Jul 23, 2024 • 241
