16-Bit to 1-Bit: Visual KV Cache Quantization for Efficient Multimodal LLMs (arxiv.org)
73 points by PaulHoule 4 days ago
Have they published their code?