RILQ: Rank-Insensitive LoRA-based Quantization Error Compensation for Boosting 2-bit Large Language Model Accuracy Paper • 2412.01129 • Published Dec 2, 2024
InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding Paper • 2506.15745 • Published Jun 2025
Enhancing Computation Efficiency in Large Language Models through Weight and Activation Quantization Paper • 2311.05161 • Published Nov 9, 2023
Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment Paper • 2407.03051 • Published Jul 3, 2024
InfiniPot: Infinite Context Processing on Memory-Constrained LLMs Paper • 2410.01518 • Published Oct 2, 2024
Token-Scaled Logit Distillation for Ternary Weight Generative Language Models Paper • 2308.06744 • Published Aug 13, 2023
Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders Paper • 2211.11014 • Published Nov 20, 2022
Teacher Intervention: Improving Convergence of Quantization Aware Training for Ultra-Low Precision Transformers Paper • 2302.11812 • Published Feb 23, 2023