This session opens the course "Fine Tuning for Data Scientists and Software Engineers", introducing the concept of fine-tuning and establishing a basic intuition for when it might be applied.
Notes, links and resources are here: https://hamel.quarto.pub/parlance/education/fine_tuning_course/workshop_1.html
This is lesson 1 of a 4-part course on applied fine-tuning:
1. When & Why to Fine-Tune: https://youtu.be/cPn0nHFsvFg
2. Fine-Tuning w/Axolotl: https://youtu.be/mmsa4wDsiy0
3. Instrumenting & Evaluating LLMs: https://youtu.be/SnbGD677_u0
4. Deploying Fine-Tuned LLMs: https://youtu.be/GzEcyBykkdo
*00:00 Intro*
Outlines the lesson plan, including course orientation, developing fine-tuning intuition, and understanding when to fine-tune.
*05:24 Course Philosophy*
Dan emphasizes the practical nature of the course, focusing on hands-on student interaction with language model tuning. He describes the goal of the course as taking students from a superficial knowledge of fine-tuning to a confident understanding stemming from personal experience.
*12:31 Case Study: Ship Fast*
Dan recounts an experience where, after a month of unproductive meetings, three days spent building a simple prototype unlocked an iterative feedback cycle that allowed much faster development.
*14:47 Eval Driven Workflow*
Dan hands over to Hamel, who introduces an evaluation-driven development workflow that will be expanded on as the course progresses.
*16:05 What Is Fine-Tuning*
Dan shifts the topic from philosophy to a more concrete discussion of when to fine-tune. He starts with a quick theoretical background on how an LLM functions.
*20:12 Fine-Tuning*
Dan introduces a means of fine-tuning LLMs by training on a dataset of input/output pairs. These pairs are embedded in a template, which tells the model what form of response is expected.
*23:30 Templating*
Dan and Hamel both stress the importance and difficulty of consistent templating. Hamel notes that abstractions used for fine-tuning and inference may build the template for you, which is often where errors are introduced.
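As a minimal sketch of what this templating looks like (the Alpaca-style template and the example pair below are illustrative, not the exact template used in the lesson):

```python
# Minimal illustration: rendering an input/output pair into a training template.
# The Alpaca-style format below is just an example; the real template depends on
# the base model and the fine-tuning framework being used.

TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

pair = {
    "instruction": "Summarize the customer email in one sentence.",
    "input": "Hi, my package arrived damaged and I'd like a replacement sent out.",
    "output": "Customer reports a damaged delivery and requests a replacement.",
}

print(TEMPLATE.format(**pair))

# At inference time the same template must be rebuilt WITHOUT the output section,
# so the model completes it. Mismatches between the training and inference
# templates (often hidden inside library abstractions) are a common source of bugs.
```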
*28:20 Is Fine-Tuning Dead?*
Dan brings up recent debate about whether fine-tuning is still necessary and describes excitement around fine-tuning as cyclical. Hamel proposes starting with prompt engineering until you have proven to yourself that fine-tuning is necessary.
*32:41 Case Study: Shortcomings of Fine-Tuning*
Dan recounts a previous project using a fine-tuned LLM to predict package value for a logistics company. He describes the underperformance of the model, stemming from poor data quality and the inappropriateness of the fine-tuning loss function for regression tasks.
*39:00 Case Study: Honeycomb - NL to Query*
Hamel introduces a case study where an LLM was used to generate domain-specific structured queries for a telemetry platform. He describes the initial approach, which combined RAG, a syntax manual, few-shot examples, and edge-case handling guides into one long prompt context.
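As a rough illustration of assembling that kind of long prompt (all component names and contents below are hypothetical placeholders, not Honeycomb's actual prompt):

```python
# Hypothetical sketch of building one long prompt from several components,
# in the spirit of the initial approach described above. All strings are
# placeholders, not the real syntax manual, examples, or schema.

def build_query_prompt(user_question: str, schema_columns: list[str]) -> str:
    syntax_manual = "Queries use the form: VISUALIZE ... WHERE ... GROUP BY ..."
    few_shot_examples = (
        "Q: show me slow requests\n"
        "A: VISUALIZE HEATMAP(duration_ms) WHERE duration_ms > 1000\n"
    )
    edge_case_guide = "If the question mentions errors, filter on status_code >= 500."
    retrieved_schema = ", ".join(schema_columns)  # stands in for the RAG retrieval step

    return (
        f"{syntax_manual}\n\n"
        f"Relevant columns: {retrieved_schema}\n\n"
        f"Examples:\n{few_shot_examples}\n"
        f"Notes: {edge_case_guide}\n\n"
        f"User question: {user_question}\nQuery:"
    )

print(build_query_prompt("which endpoints are slowest?", ["duration_ms", "endpoint"]))
```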
*51:06 Q&A Session 1*
Q&A re: RAG vs. fine-tuning, fine-tuning for function calling, data requirements for fine-tuning, preference-based optimization, multi-modal fine-tuning, training on synthetic data, and which models to fine-tune.
*1:09:14 Breakout Time*
The session splits into breakout rooms to discuss factors which might affect fine-tuning success for a chatbot.
*1:11:23 Case Study: Rechat*
Hamel introduces a case study where a real estate CRM company wanted to use a chatbot as an interface to its wide array of tools.
*1:18:48 Case Study: DPD chatbot*
Dan shares an example of a user convincing a commercial chatbot to swear, which garnered media attention.
*1:22:51 Recap: When to Fine-Tune*
Dan lists signs that fine-tuning may be a good option, including: desired bespoke behavior, expected value that justifies the operational complexity, and access to sufficient input/output training pairs.
*1:24:08 Preference Optimization*
Dan touches on the limitations of traditional input/output pair training data and introduces a technique called Direct Preference Optimization (DPO), which teaches a model a gradient of response quality, allowing it to produce responses better than those in its training data.
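As a minimal sketch of what DPO preference data and training can look like, assuming the Hugging Face TRL library (the API differs between TRL versions, so treat this as illustrative rather than a definitive recipe):

```python
# Illustrative sketch of DPO fine-tuning with Hugging Face TRL.
# Assumes trl, transformers, and datasets are installed; argument names
# vary across TRL versions, so this is a rough outline, not a recipe.

from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Preference data: each record pairs a prompt with a better ("chosen")
# and a worse ("rejected") response.
train_dataset = Dataset.from_list([
    {
        "prompt": "Customer asks why their order is late.",
        "chosen": "Apologize, explain the delay, and offer a concrete next step.",
        "rejected": "Tell them to check the website.",
    },
    # ... the case study that follows used roughly 200 such pairs
])

model_name = "your-base-or-sft-model"  # placeholder model id
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="dpo-out", beta=0.1),  # beta scales the preference signal
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```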
*1:27:00 Case Study: DPO for customer service*
Dan shares a project that used DPO-based fine-tuning to generate customer service responses for a publishing company. After training on 200 pairs of better/worse responses to customer queries, the DPO fine-tuned LLM produced responses that managers rated as higher quality overall than those written by human agents.
*1:29:54 Quiz: Fine-Tuning Use Cases*
The class takes a short quiz on the suitability of fine-tuning for four use cases, then Dan shares his thoughts on the scenarios.
*1:40:22 Q&A Session 2*
Q&A re: model quantization, hallucinated outputs, limitations of instruction-tuned models for domain-specific tasks, prompt engineering vs. fine-tuning, combining supervised fine-tuning and DPO, and data curation.