This in-depth tutorial is about fine-tuning LLMs locally with Hugging Face Transformers and PyTorch. We use Meta's new Llama-3.2-1B-Instruct model and teach it to predict paper categories using LoRA adapters. Along the way, I break down the major things you need to know about fine-tuning: prompting, creating datasets, generating input-output pairs, loss functions, PyTorch optimizers, PEFT LoRA adapters, and of course the sweet feeling when the test accuracy goes up. :)
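For reference, here's a minimal sketch of what the model + LoRA setup looks like with Transformers and PEFT. The rank, alpha, dropout, and target modules below are illustrative choices, not necessarily the exact values used in the video:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float32)

# LoRA adapter config -- these hyperparameters are illustrative guesses.
lora_config = LoraConfig(
    r=8,                                  # adapter rank
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small LoRA matrices are trainable
```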
Visit AI Agent Store Page: https://aiagentstore.ai/?ref=avishek
All the notebooks, datasets, and Python code used in this video have been uploaded to my Patreon:
https://www.patreon.com/NeuralBreakdownwithAVB
I upload all the code, slides, animations, write-ups, etc. for all my videos on my Patreon, so go check it out if you find anything interesting.
Videos you might like:
Llama3.2 Multimodal Application - https://youtu.be/QLUKXvHgOrI
Apple Intelligence LLM Breakdown - https://youtu.be/Sah0dnu8Hxo
50 concepts to know in NLP - https://youtu.be/uocYQH0cWTs
Attention to Transformers playlist: https://www.youtube.com/playlist?list=PLGXWtN1HUjPfq0MSqD5dX8V7Gx5ow4QYW
Notes on Hardware and Quantization:
I didn't go over quantization in this video because I'm on a MacBook, and bitsandbytes doesn't work outside NVIDIA GPUs. :) Hopefully I'll make a separate video about quantization one day.
The system I am using is a MacBook Pro M2 with 16 GB of RAM. If you have NVIDIA GPUs, you could leverage better quantization. On my machine, I was able to train with a batch size of 8 in float32; the sequence lengths were around 250 tokens on average for this task.
If I were working on a product, I'd rent cloud GPU servers and fine-tune there on large datasets. For a YT video with an educational intent, I decided to limit the scope to local machines.
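A rough sketch of what that local training setup looks like, reusing `model` from the sketch above and assuming a `train_loader` that yields tokenized batches (input_ids, attention_mask, labels) of roughly 250 tokens at batch size 8; the learning rate is illustrative:

```python
import torch

# Pick the best available device: MPS on Apple Silicon, CUDA on NVIDIA, else CPU.
# (bitsandbytes quantization would only be an option on CUDA.)
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

model.to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # illustrative lr

model.train()
for batch in train_loader:
    batch = {k: v.to(device) for k, v in batch.items()}
    outputs = model(**batch)   # labels in the batch -> the model returns a loss
    loss = outputs.loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```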
#ai #deeplearning #machinelearning
0:00 - Intro
2:04 - Hugging Face Transformers Basics
4:49 - Tokenizers
8:39 - Instruction Prompts and Chat Templates
12:35 - Dataset creation
15:54 - Next word prediction
20:52 - Loss functions on sequences
28:28 - Complete fine-tuning with PyTorch
31:38 - LoRA Fine-tuning with PEFT
35:38 - Results
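The "Next word prediction" and "Loss functions on sequences" chapters essentially come down to masking the prompt tokens out of the loss so the model is only penalized on the answer (the paper category). A minimal sketch of the idea, reusing `model` and `tokenizer` from the first sketch; `prompt_text` and `answer_text` are placeholders for one training pair:

```python
import torch

IGNORE_INDEX = -100  # PyTorch's cross-entropy skips targets with this value

prompt_ids = tokenizer(prompt_text, return_tensors="pt").input_ids[0]
answer_ids = tokenizer(answer_text, add_special_tokens=False, return_tensors="pt").input_ids[0]

input_ids = torch.cat([prompt_ids, answer_ids])
labels = input_ids.clone()
labels[: len(prompt_ids)] = IGNORE_INDEX  # don't compute loss on the prompt

# Passing labels makes the model compute the shifted next-token cross-entropy,
# averaged only over the unmasked (answer) positions.
out = model(input_ids=input_ids.unsqueeze(0), labels=labels.unsqueeze(0))
print(out.loss)
```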