Mamba is an exciting LLM architecture that, when combined with Transformers, might introduce capabilities we haven't seen before. This video guides you through the intuition behind these techniques in an accessible, visual manner.
Timeline
0:00 Introduction
0:36 Overview
1:33 The Disadvantage of Transformers
2:47 States, State Spaces, and State Representations
4:27 State Space Models
7:43 Data Flow in an SSM
9:35 From a Continuous to a Discrete Signal
11:11 The Recurrent Representation
12:51 The Convolutional Representation
13:48 The Three Representations
15:04 The Importance of the A Matrix
16:29 Mamba
23:40 Outro
Subscribe to my newsletter for more visual guides:
✉️ Newsletter https://newsletter.maartengrootendorst.com/
I wrote a book!
📚 Hands-On Large Language Models https://llm-book.com/
#datascience #machinelearning #ai #llm