Lecture 20 - Transformers and Attention

Deep Learning Systems Course 9,712 views 2 years ago


This lecture covers the basics of generic time series prediction (contrasting latent-state and direct-prediction approaches), attention and self-attention, and the Transformer architecture.
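As a rough illustration of the self-attention operation named above, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not taken from the lecture: the helper names (`softmax`, `matmul`, `transpose`, `self_attention`), the toy input, and the identity projection matrices are all assumptions chosen for readability, and the sketch omits masking, multiple heads, and learned weights.

```python
import math

def softmax(row):
    # Numerically stable softmax over a single list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) times (k x m) gives (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def self_attention(x, w_q, w_k, w_v):
    # Project the same input sequence into queries, keys, and values.
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d = len(q[0])
    # Pairwise query-key scores, scaled by sqrt(d), then a row-wise softmax.
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    # Each output position is a weighted average of the value vectors.
    return matmul(weights, v), weights

# Toy sequence of three 2-dimensional tokens (hypothetical data).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Identity projections stand in for learned weight matrices.
identity = [[1.0, 0.0], [0.0, 1.0]]
out, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1, so every output token is a convex combination of the value vectors for all tokens in the sequence. A Transformer layer would typically add learned projections, several heads, and a causal mask when used for prediction.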
