![[Paper Review] Massive Activations in Large Language Models](https://eric-mingjie.github.io/massive-activations/assets/main_teaser_final.png)
[Paper Review] Massive Activations in Large Language Models
Paper Link: Massive Activations, Hidden Biases: A Reinterpretation of Self-Attention’s Secrets. TL;DR: Just 4–10 extreme scalar values (×10,000) out of tens of …
20 minute read
All posts under tag "BiasMechanism"