TinySwallow
Collection
Compact Japanese models trained with "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models"
5 items • Updated