Round and Round We Go! What makes Rotary Positional Encodings useful?

Gabriel Mongaras 1,369 6 months ago

Paper here: https://arxiv.org/abs/2410.06205
Notes: https://drive.google.com/file/d/152NPPyNjo-N6MMIaupXacS41BUJgjE5l/view?usp=drive_link

00:00 Intro
01:09 RoPE: Rotary Positional Embeddings
10:37 Notes on RoPE
12:04 Does RoPE decay with distance?
14:14 How are different frequencies used?
17:02 High frequencies: positional attention
21:29 Low frequencies: semantic attention
28:00 p-RoPE
30:36 Thoughts on this paper
