![[Paper Review] Massive Activations in Large Language Models](https://eric-mingjie.github.io/massive-activations/assets/main_teaser_final.png)
# [Paper Review] Massive Activations in Large Language Models

Paper Link: [arXiv:2402.17762v2](https://arxiv.org/abs/2402.17762v2)

Massive Activations, Hidden Biases: A Reinterpretation of Self-Attention's Secrets

TL;DR: Just 4–10 extreme scalar values …

20 minute read
Tags: 2402.17762v2, Transformer, SelfAttention, BiasMechanism, RepresentationLearning, Interpretability, NeuralMechanisms, Massive Activations, Explicit Attention Bias