Posts

Here are all published articles, sorted by date in descending order.

16 posts total
3 pages total
[Paper Review] Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Structured State Space Duality: Unifying SSMs and Attention with Mamba-2 for 2–8× Acceleration. TL;DR: Structured State-Space …

18-minute read
Mamba, Mamba-2, Structured State Space Duality, SSD, State Space Models, SSM, Transformer, Attention Mechanism, Long Context, Efficient Training, FlashAttention, Sequence Modeling, Scaling Laws, Parallelism, GPU Acceleration, 2405.21060v1
