AI & ML interests

Knowledge Distillation, Pruning, Quantization, KV Cache Compression, Latency, Inference Speed
