![[Paper Review] Attention Sinks and Compression Valleys in LLMs are Two Sides of the Same Coin](https://cdn-uploads.huggingface.co/production/uploads/6317233cc92fd6fee317e030/yNP71PjobVvLDgJ0R0qV2.png)
[Paper Review] Attention Sinks and Compression Valleys in LLMs are Two Sides of the Same Coin
Paper link. One narrative built by Massive Activations: connecting Attention Sinks and Compression Valleys. TL;DR: massive activations in the residual stream (especially at BOS) …
34 min
2510.06477v1
Massive Activations
Attention Sink
Compression Valley
Residual Stream
Representation Geometry
Anisotropy
Information Bottleneck
Mix–Compress–Refine
MLP Ablation
Layerwise Analysis
LLaMA3
Qwen2
Pythia
LogitLens
TunedLens
LLM Internals
Activation Dynamics
Paper Review