Article: Fine-tuning LLMs to 1.58bit: extreme quantization made easy • By medmekk and 5 others • Sep 18, 2024 • 249 upvotes
Paper: Hogwild! Inference: Parallel LLM Generation via Concurrent Attention • arXiv:2504.06261 • Published Apr 8, 2025 • 110 upvotes
Paper: VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks • arXiv:2504.05118 • Published Apr 7, 2025 • 25 upvotes
Collection: Qwen2.5-Coder • Code-specific model series based on Qwen2.5 • 40 items • Updated Apr 28 • 321 upvotes
Paper: BASS: Batched Attention-optimized Speculative Sampling • arXiv:2404.15778 • Published Apr 24, 2024 • 11 upvotes
Paper: ChatEDA: A Large Language Model Powered Autonomous Agent for EDA • arXiv:2308.10204 • Published Aug 20, 2023 • 1 upvote
Collection: Microsoft Research Papers • #PapersToRead from Microsoft Research in the broad space of generative AI, multi-agent systems, responsible AI practices, LLM Ops, and language models • 20 items • Updated Jun 26, 2024 • 5 upvotes
Collection: Papers • Large Language Model (LLM) and NLP-related papers • 294 items • Updated 5 days ago • 12 upvotes
Paper: The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits • arXiv:2402.17764 • Published Feb 27, 2024 • 618 upvotes (see the quantization sketch after this list)
Paper: LLM Augmented LLMs: Expanding Capabilities through Composition • arXiv:2401.02412 • Published Jan 4, 2024 • 39 upvotes
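The 1.58-bit figure in the BitNet b1.58 paper listed above (arXiv:2402.17764) comes from restricting every weight to the ternary set {-1, 0, 1}, which carries log2(3) ≈ 1.58 bits of information per weight. Below is a minimal sketch of the absmean ternary quantizer that paper describes, assuming PyTorch; the function name and tensor shapes are illustrative and not taken from an official implementation.

```python
import torch

def absmean_ternary_quantize(w: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Sketch of BitNet b1.58-style quantization (arXiv:2402.17764).

    Scales the weight matrix by its mean absolute value (gamma),
    then rounds and clips each entry to the nearest value in {-1, 0, 1}.
    """
    gamma = w.abs().mean().clamp(min=1e-8)   # per-tensor absmean scale
    w_q = (w / gamma).round().clamp_(-1, 1)  # RoundClip(w / gamma, -1, 1)
    return w_q, gamma

# Usage: w_q * gamma approximates the original weights.
w = torch.randn(4, 4)
w_q, gamma = absmean_ternary_quantize(w)
print(w_q.unique())  # values drawn only from {-1., 0., 1.}
```

With weights in {-1, 0, 1}, matrix multiplication reduces to additions and subtractions, which is the source of the inference-efficiency claims in the paper and in the fine-tuning article at the top of this list.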