16-Bit to 1-Bit: Visual KV Cache Quantization for Efficient Multimodal LLMs
March 5, 2025 by kamal