This lecture covers the basics of generic time series prediction (highlighting the distinction between latent-state and direct-prediction approaches), attention and self-attention, and the Transformer architecture.