Try Voice Writer - speak your thoughts and let AI handle the grammar: https://voicewriter.io
In this video, I explain RoPE - Rotary Positional Embeddings. Proposed in 2021, this innovation has swiftly made its way into prominent language models like Google's PaLM and Meta's LLaMA. I unpack the intuition behind rotary embeddings and show how they combine the strengths of both absolute and relative positional encodings.
0:00 - Introduction
1:22 - Absolute positional embeddings
3:19 - Relative positional embeddings
5:51 - Rotary positional embeddings
7:56 - Matrix formulation
9:31 - Implementation
10:38 - Experiments and conclusion
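
For reference, here is a minimal NumPy sketch of the idea covered in the Implementation segment. The function name rope_rotate and the shapes are my own illustrative choices, not the exact code shown in the video; the frequency schedule follows the RoFormer paper's convention theta_i = 10000^(-2i/d).

import numpy as np

def rope_rotate(x, positions, base=10000.0):
    """Rotate each 2D pair of features in x by an angle that grows with position.

    x:         (seq_len, d) query or key vectors, d must be even
    positions: (seq_len,) integer token positions
    """
    seq_len, d = x.shape
    # One frequency per 2D pair: theta_i = base^(-2i/d)
    freqs = base ** (-np.arange(0, d, 2) / d)        # (d/2,)
    angles = positions[:, None] * freqs[None, :]     # (seq_len, d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                  # split features into pairs
    # Standard 2D rotation applied pair-wise
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# The key property: dot products of rotated queries and keys depend only on the
# relative offset between positions, so absolute rotations yield relative encoding.
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 64))
k = rng.standard_normal((8, 64))
pos = np.arange(8)
scores = rope_rotate(q, pos) @ rope_rotate(k, pos).T   # attention logits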
References:
RoFormer: Enhanced Transformer with Rotary Position Embedding (main paper that proposes RoPE embeddings): https://arxiv.org/abs/2104.09864
EleutherAI blog post: https://blog.eleuther.ai/rotary-embeddings/
Blog posts by first author Jianlin Su (in Chinese): https://kexue.fm/archives/8130 and https://kexue.fm/archives/8265
Survey paper on positional embeddings: https://aclanthology.org/2022.cl-3.7/