KV-Cache Compression

'KV-Cache Compression' 태그의 모든 글

총 1개의 글

시간순 정렬

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

논문 링크 DeepSeek-VL2 — “작고 빠르면서 고해상도까지 정확한” 멀티모달 LLM 한 줄 요약 (TL;DR) Dynamic Tiling × MLA-MoE × 800 B VL 데이터라는 세 축의 설계로, 4.5 B …

2025년 07월 05일

2412.10302v1 DeepSeek Multimodal Learning Vision-Language Models High-Resolution Image Processing Dynamic Tiling Mixture of Experts (MoE) KV-Cache Compression Multi-head Latent Attention (MLA) Visual Grounding OCR Parameter Efficiency LLM Inference Optimization Edge AI Open Source Models Document Understanding Infographic QA Chart and Table QA Visual Reasoning Multilingual VQA Conversational AI with Images