TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Gabriel Mongaras 1,663 views 5 months ago

Paper here: https://arxiv.org/abs/2410.23168
Code: https://github.com/haiyang-w/tokenformer

Notes: https://drive.google.com/file/d/17PsGwefQJoSQxBHykoSFeMrKZhPDFx-E/view?usp=sharing

00:00 Intro
02:48 Methodology
07:54 This is an MLP
10:18 How they change the transformer
16:00 Model scaling
20:48 Results
