ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification

Jan 1, 2024 · Yefei He, Luoming Zhang, Weijia Wu, Jing Liu, Hong Zhou, Bohan Zhuang

Type: Journal article
Publication: arXiv preprint arXiv:2405.14256
Last updated on Jan 1, 2024