As a regular SWE, I want to share a series on several key topics for better understanding the Transformer, the architecture that changed the world and laid the foundation of LLMs. A minimal attention sketch follows the episode list.
Episode #1: Attention Mechanism https://youtu.be/3RB8WVu9t4Q
Episode #2: Position Encoding https://youtu.be/E1XMcN2lMME
Episode #3: Keys, Values, Queries https://youtu.be/7i1wlvYLrUo
Episode #4: Multi-Head Attention https://youtu.be/PwSMOwkcl1g
Episode #5: KV Cache and Masked Attention: https://youtu.be/VAtqCJoiOKI
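For anyone who wants to poke at the core idea before watching, here is a minimal sketch of scaled dot-product attention with an optional causal mask. It is my own illustration in plain NumPy, not code from the videos; the attention function, the softmax helper, and the toy shapes are just for demonstration.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, causal=False):
    # Q, K, V: (seq_len, d_k) arrays; returns (seq_len, d_k).
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # how strongly each query matches each key
    if causal:
        # Masked (causal) attention: position i cannot attend to positions after i.
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    weights = softmax(scores, axis=-1)        # attention weights, each row sums to 1
    return weights @ V                        # weighted sum of values

# Toy usage: 4 tokens with 8-dimensional queries, keys, and values.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = attention(Q, K, V, causal=True)
print(out.shape)  # (4, 8)

Multi-head attention (Episode #4) just runs several of these in parallel on smaller projections and concatenates the results, and the KV cache (Episode #5) reuses K and V across decoding steps instead of recomputing them.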
#ai
#technology
#google
#llm
#transformer
#attention
#selfattention
#multihead
#kvcache
#maskedattention
#deepseek