KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing Paper • 2410.18517 • Published Oct 24, 2024
Are LLMs Aware that Some Questions are not Open-ended? Paper • 2410.00423 • Published Oct 1, 2024
PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference Paper • 2405.12532 • Published May 21, 2024
Learning Better Masking for Better Language Model Pre-training Paper • 2208.10806 • Published Aug 23, 2022
BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained Transformer Paper • 2307.00360 • Published Jul 1, 2023
RefGPT: Reference -> Truthful & Customized Dialogues Generation by GPTs and for GPTs Paper • 2305.14994 • Published May 24, 2023