![[paper review] SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-bit Training](https://cdn-uploads.huggingface.co/production/uploads/66c0a08bac74db25de8427ec/Tb20E3IJSV6PjcD9Nkvfg.png)
[Paper Review] SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-bit Training
[Paper Review] SageAttention 3 & SageBwd: FP4-Powered Inference and 8-bit Training

Paper link: https://arxiv.org/abs/2505.11594v1

📝 TL;DR

The SageAttention …
23 minute read