AI & ML interests
Knowledge Distillation, Pruning, Quantization, KV Cache Compression, Latency, Inference Speed
DistAya's models
None public yet