MiniCache: KV Cache Compression in Depth Dimension for Large Language Models

Apr 1, 2024

Akide Liu, Jing Liu, Zizheng Pan, Yefei He, Gholamreza Haffari, Bohan Zhuang


Type: Journal article

Publication: NeurIPS 2024

Last updated on Apr 1, 2024

