ToP Collection: Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference • 16 items • Updated 18 days ago