Fine-Tuning with Axolotl

Hamel Husain 3,635 views 9 months ago
This lesson walks through an end-to-end example of fine-tuning a model with Axolotl so that it understands a domain-specific query language. Guest speakers include Wing Lian, creator of Axolotl, and Zach Mueller, lead developer on Hugging Face Accelerate.

Notes, slides, and additional resources: https://parlance-labs.com/education/fine_tuning_course/workshop_2.html

This is lesson 2 of a 4-part course on applied fine-tuning:

1. When & Why to Fine-Tune: https://youtu.be/cPn0nHFsvFg
2. Fine-Tuning w/Axolotl: https://youtu.be/mmsa4wDsiy0
3. Instrumenting & Evaluating LLMs: https://youtu.be/SnbGD677_u0
4. Deploying Fine-Tuned LLMs: https://youtu.be/GzEcyBykkdo

Chapter Summaries:

*0:00 Overview*

*0:51 Small vs. Larger LLMs*

*3:47 Model Family*

*5:45 LoRA vs. Fine-tuning*

*9:54 QLoRA*

*14:35 Improving Data vs. Hyperparameters*

*15:47 What is Axolotl*

*21:45 Axolotl Config Files Walkthrough*

*27:23 Finetuning with Axolotl via CLI*

*30:37 Alpaca Dataset Template and Debugging Tools*

*36:06 Gradio App Demo*

*37:14 Honeycomb Case Study*

*39:51 Honeycomb Prompt Notebook*

*43:10 Writing Level 1 Evaluations*

*46:14 Generating Synthetic Data*

*49:45 Data and Config Files for Fine-tuning*

*53:40 Viewing Data After Preprocessing*

*57:31 Training with Axolotl*

*1:00:24 Model Sanity Checks*

*1:02:44 Level 2 Evaluations*

*1:07:17 Curating Data*

*1:11:09 Debugging Axolotl*

*1:13:37 Predicting Fine-tuning Time*

*1:16:34 GPU Memory Usage for Fine-tuning*

*1:18:49 Distributed Training*

*1:20:13 Fully Sharded Data Parallelism (FSDP)*

*1:21:50 Sharding Strategies*

*1:23:37 How to Split the Model*

*1:24:44 Offloading Parameters*

*1:27:43 What is Accelerate*

*1:29:25 Distributing Training with Accelerate*

*1:31:18 Using Accelerate in Code*

*1:33:05 Mixed Precision*

*1:35:40 FSDP vs. DeepSpeed*

*1:38:10 FSDP and DeepSpeed on Axolotl*

*1:42:07 Training on Modal*

*1:46:21 Using Modal to Fine-tune LLM with Axolotl*

*1:51:55 Inspecting Data with Notebook*

*1:53:00 Q&A Session*

*1:53:33 Determining Adapter Rank and Alpha*

*1:56:25 Custom Evaluation Metrics*

*1:59:29 Features of Lower-Level Libraries*

*2:02:14 4-Bit vs. Higher Precision*

*2:07:54 Making Models Deterministic*
