This session opens the course "Fine Tuning for Data Scientists and Software Engineers", introducing the concept of fine-tuning and establishing a basic intuition for when it might be applied.
Notes, links and resources are here: https://hamel.quarto.pub/parlance/education/fine_tuning_course/workshop_1.html
This is lesson 1 of a 4-part course on applied fine-tuning:
1. When & Why to Fine-Tune: https://youtu.be/cPn0nHFsvFg
2. Fine-Tuning w/Axolotl: https://youtu.be/mmsa4wDsiy0
3. Instrumenting & Evaluating LLMs: https://youtu.be/SnbGD677_u0
4. Deploying Fine-Tuned LLMs: https://youtu.be/GzEcyBykkdo
*00:00 Intro*
Outlines the lesson plan, including course orientation, developing fine-tuning intuition, and understanding when to fine-tune.
*05:24 Course Philosophy*
Dan emphasizes the practical nature of the course, focusing on hands-on student interaction with language model tuning. He describes the goal of the course as taking students from a superficial knowledge of fine-tuning to a confident understanding stemming from personal experience.
*12:31 Case Study: Ship Fast*
Dan recounts an experience where, after a month of unproductive meetings, three days spent building a simple prototype unlocked an iterative feedback cycle that allowed much faster development.
*14:47 Eval Driven Workflow*
Dan hands over to Hamel, who introduces an evaluation-driven development workflow that will be expanded on as the course progresses.
*16:05 What Is Fine-Tuning*
Dan shifts the topic from philosophy to a more concrete discussion of when to fine-tune. He starts with a quick theoretical background on how an LLM functions.
*20:12 Fine-Tuning*
Dan introduces a means of fine-tuning LLMs by training on a dataset of input/output pairs. These pairs are embedded in a template, which tells the model what form of response is expected.
*23:30 Templating*
Dan and Hamel both stress the importance and difficulty of consistent templating. Hamel notes that abstractions used for fine-tuning and inference may build the template for you, which is often where errors are introduced.
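As a minimal sketch of what this templating looks like (the Alpaca-style template and the example pair below are illustrative, not the exact template used in the lesson):

```python
# Minimal illustration: rendering an input/output pair into a training template.
# The Alpaca-style format below is just an example; the real template depends on
# the base model and the fine-tuning framework being used.

TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

pair = {
    "instruction": "Summarize the customer email in one sentence.",
    "input": "Hi, my package arrived damaged and I'd like a replacement sent out.",
    "output": "Customer reports a damaged delivery and requests a replacement.",
}

print(TEMPLATE.format(**pair))

# At inference time the same template must be rebuilt WITHOUT the output section,
# so the model completes it. Mismatches between the training and inference
# templates (often hidden inside library abstractions) are a common source of bugs.
```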
*28:20 Is Fine-Tuning Dead?*
Dan brings up recent debate about whether fine-tuning is still necessary and describes excitement around fine-tuning as cyclical. Hamel proposes starting with prompt engineering until you have proven to yourself that fine-tuning is necessary.
*32:41 Case Study: Shortcomings of Fine-Tuning*
Dan recounts a previous project using a fine-tuned LLM to predict package value for a logistics company. He describes the underperformance of the model, stemming from poor data quality and the inappropriateness of the fine-tuning loss function for regression tasks.
*39:00 Case Study: Honeycomb - NL to Query*
Hamel introduces a case study where an LLM was used to generate domain-specific structured queries for a telemetry platform. He describes the initial approach, which combined RAG, a syntax manual, few-shot examples, and edge-case handling guides into one long prompt context.
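As a rough illustration of assembling that kind of long prompt (all component names and contents below are hypothetical placeholders, not Honeycomb's actual prompt):

```python
# Hypothetical sketch of building one long prompt from several components,
# in the spirit of the initial approach described above. All strings are
# placeholders, not the real syntax manual, examples, or schema.

def build_query_prompt(user_question: str, schema_columns: list[str]) -> str:
    syntax_manual = "Queries use the form: VISUALIZE ... WHERE ... GROUP BY ..."
    few_shot_examples = (
        "Q: show me slow requests\n"
        "A: VISUALIZE HEATMAP(duration_ms) WHERE duration_ms > 1000\n"
    )
    edge_case_guide = "If the question mentions errors, filter on status_code >= 500."
    retrieved_schema = ", ".join(schema_columns)  # stands in for the RAG retrieval step

    return (
        f"{syntax_manual}\n\n"
        f"Relevant columns: {retrieved_schema}\n\n"
        f"Examples:\n{few_shot_examples}\n"
        f"Notes: {edge_case_guide}\n\n"
        f"User question: {user_question}\nQuery:"
    )

print(build_query_prompt("which endpoints are slowest?", ["duration_ms", "endpoint"]))
```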
*51:06 Q&A Session 1*
Q&A re: RAG vs. fine-tuning, fine-tuning for function calling, data requirements for fine-tuning, preference-based optimization, multi-modal fine-tuning, training on synthetic data, and which models to fine-tune.
*1:09:14 Breakout Time*
The session splits into breakout rooms to discuss factors which might affect fine-tuning success for a chatbot.
*1:11:23 Case Study: Rechat*
Hamel introduces a case study where a real estate CRM company wanted to use a chatbot as an interface to its wide array of tools.
*1:18:48 Case Study: DPD chatbot*
Dan shares an example of a user convincing a commercial chatbot to swear, which garnered media attention.
*1:22:51 Recap: When to Fine-Tune*
Dan lists signs that fine-tuning may be a good option, including: desired bespoke behavior, expected value that justifies the operational complexity, and access to sufficient input/output training pairs.
*1:24:08 Preference Optimization*
Dan touches on the limitations of traditional input/output pair training data and introduces a technique called Direct Preference Optimization (DPO), which teaches a model a gradient of response quality, allowing it to produce responses better than those in its training data.
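As a minimal sketch of what DPO preference data and training can look like, assuming the Hugging Face TRL library (the API differs between TRL versions, so treat this as illustrative rather than a definitive recipe):

```python
# Illustrative sketch of DPO fine-tuning with Hugging Face TRL.
# Assumes trl, transformers, and datasets are installed; argument names
# vary across TRL versions, so this is a rough outline, not a recipe.

from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Preference data: each record pairs a prompt with a better ("chosen")
# and a worse ("rejected") response.
train_dataset = Dataset.from_list([
    {
        "prompt": "Customer asks why their order is late.",
        "chosen": "Apologize, explain the delay, and offer a concrete next step.",
        "rejected": "Tell them to check the website.",
    },
    # ... the case study that follows used roughly 200 such pairs
])

model_name = "your-base-or-sft-model"  # placeholder model id
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="dpo-out", beta=0.1),  # beta scales the preference signal
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```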
*1:27:00 Case Study: DPO for customer service*
Dan shares a project that used DPO-based fine-tuning to generate customer service responses for a publishing company. After training on 200 pairs of better/worse responses to customer queries, the DPO fine-tuned LLM produced responses that managers rated as higher quality overall than those written by human agents.
*1:29:54 Quiz: Fine-Tuning Use Cases*
The class takes a short quiz on the suitability of fine-tuning for four use cases, then Dan shares his thoughts on the scenarios.
*1:40:22 Q&A Session 2*
Q&A re: model quantization, hallucinated outputs, limitations of instruction-tuned models for domain-specific tasks, prompt engineering vs. fine-tuning, combining supervised fine-tuning and DPO, and data curation.