![[Paper Review] Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality](https://icml.cc/media/PosterPDFs/ICML%202024/32613.png)
[Paper Review] Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Paper link · Structured State Space Duality: unifying SSMs and Attention through Mamba-2, with 2–8× speedups · One-line summary (TL;DR): SSD (Structured State-Space Duality) unifies SSMs and masked attention …
42 min
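To make the TL;DR concrete, here is a minimal NumPy sketch (a toy scalar SSM with per-step decay `a`, input weight `b`, and readout `c`, all invented for illustration) of the duality the post summarizes: the same sequence transformation can be computed either recurrently in linear time (the SSM view) or as multiplication by a lower-triangular 1-semiseparable matrix (the masked-attention view).

```python
import numpy as np

T = 6
rng = np.random.default_rng(0)
a = rng.uniform(0.5, 1.0, T)   # per-step decay (toy scalar SSM, hypothetical values)
b = rng.normal(size=T)         # input weights
c = rng.normal(size=T)         # readout weights
x = rng.normal(size=T)         # input sequence

# (1) Recurrent (linear-time) view: h_t = a_t * h_{t-1} + b_t * x_t,  y_t = c_t * h_t
h, y_rec = 0.0, np.zeros(T)
for t in range(T):
    h = a[t] * h + b[t] * x[t]
    y_rec[t] = c[t] * h

# (2) Masked-attention (quadratic) view: y = M @ x with a lower-triangular,
#     1-semiseparable matrix M[i, j] = c_i * (a_i * ... * a_{j+1}) * b_j for j <= i
M = np.zeros((T, T))
for i in range(T):
    for j in range(i + 1):
        M[i, j] = c[i] * np.prod(a[j + 1:i + 1]) * b[j]
y_attn = M @ x

print(np.allclose(y_rec, y_attn))  # True: both views compute the same sequence map
```

Both paths produce identical outputs; SSD exploits exactly this equivalence to move between a linear-time recurrence and hardware-friendly matrix multiplications.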
Mamba
Mamba-2
Structured State Space Duality
SSD
State Space Models
SSM
Transformer
Attention Mechanism
Long Context
Efficient Training
FlashAttention
Sequence Modeling
Scaling Laws
Parallelism
GPU Acceleration
2405.21060v1