Paper: https://arxiv.org/abs/2412.09871
Code: https://github.com/facebookresearch/blt
Notes:
https://drive.google.com/file/d/1B5BdO9FtmxTJiWwVJ3Wa-v3pqaRdbWMh/view?usp=drive_link
https://drive.google.com/file/d/1BBYwr5botkuvI8CkjarIFNiN6B7uliWr/view?usp=drive_link
00:00 Intro
01:15 Current tokenization strategies
02:48 Methodology
08:08 Patching strategy
15:28 N-gram informed byte encodings
22:09 Encoder, global transformer, decoder
36:59 Inference
39:40 Some notes and results