As a regular SWE, I want to share several key topics for better understanding the Transformer, the architecture that changed the world and laid the foundation for LLMs. (A small attention sketch follows the episode list.)
Episode #1: Attention Mechanism: https://youtu.be/3RB8WVu9t4Q
Episode #2: Position Encoding: https://youtu.be/E1XMcN2lMME
Episode #3: Keys, Values, Queries: https://youtu.be/7i1wlvYLrUo
Episode #4: Multi-Head Attention: https://youtu.be/PwSMOwkcl1g
Episode #5: KV Cache and Masked Attention: https://youtu.be/VAtqCJoiOKI
Episode #6: Attention Variants: https://youtu.be/YAFojGSW6HM
Episode #7: Feed-Forward Network: https://youtu.be/aizh6ipm03k
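To make the series concrete, here is a minimal NumPy sketch of scaled dot-product attention, the computation Episodes #1 through #6 build on. The function name, shapes, and toy data are my own illustration, not code from the videos.

import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    # Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # how much each query matches each key
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # masked attention, as in Episode #5
    # softmax over keys (row-wise), with max subtracted for numerical stability
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                          # weighted sum of values

# Toy usage: 3 tokens, d_k = d_v = 4, with a causal mask (decoder self-attention)
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
causal = np.tril(np.ones((3, 3), dtype=bool))  # each token attends only to itself and the past
print(scaled_dot_product_attention(Q, K, V, mask=causal))

Multi-head attention (Episode #4) just runs this computation in parallel on several learned projections of Q, K, and V, then concatenates the results.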
#ai
#google
#llm
#transformer
#normalization
#math
#technology