Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
Paper link. Janus-Pro 7B: Dual-Encoder Multimodal LLM That Outsmarts Bigger Models. One-line summary (TL;DR): after fully decoupling the SigLIP understanding encoder and the VQ generation encoder, 7B …
31 min read
DeepSeek
2501.17811v1
Janus-Pro
Dual-Encoder
Multimodal Learning
Vision-Language Models
Text-to-Image
Image Understanding
Large Language Models
Adapter Networks
Visual Tokenization
GenEval
MMBench
DPG-Bench
DeepSeek-LLM
Efficient Training
Synthetic Data