This video shows a fully reproducible workflow for adapting a compact, open-weights language model so that it can decide which software tool to invoke in response to user requests. Aimed at ML engineers and applied researchers already familiar with the 🤗 Transformers ecosystem, the session delivers a concise, production-oriented example of single-task supervised fine-tuning (SFT).
Full repo here:
https://github.com/samugit83/TheGradientPath/tree/master/LLMFineTuning/SFT_HF_TOOL_CHOICE
🗺️ Tutorial Roadmap
🔄 Synthetic data generation
• Build 10,000 (query, tool) pairs with a helper function; no manual labelling required (see the sketch below).
• Mark the tool slot with the control token [my_tool_selection] (no angle brackets needed).
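A minimal sketch of that generation step, assuming hypothetical tool names and query templates (the repo's helper differs in the details):

import random

TOOLS = ["get_weather", "calculator", "set_reminder"]  # hypothetical tool names
TEMPLATES = {
    "get_weather": ["What's the weather in {x}?", "Will it rain in {x} tomorrow?"],
    "calculator": ["What is {x} times 17?", "Compute {x} plus 42."],
    "set_reminder": ["Remind me to {x}.", "Set a reminder for {x}."],
}
FILLERS = {"get_weather": ["Rome", "Oslo"], "calculator": ["12", "305"], "set_reminder": ["call mum", "pay rent"]}

def make_pair():
    # pick a tool, render a query, and label it with the control token
    tool = random.choice(TOOLS)
    query = random.choice(TEMPLATES[tool]).format(x=random.choice(FILLERS[tool]))
    return {"prompt": query, "completion": f" [my_tool_selection] {tool}"}

pairs = [make_pair() for _ in range(10_000)]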
🧹 Dataset preparation with datasets.Dataset
• Assemble prompt / completion records and create deterministic train / validation splits.
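A sketch of this step, reusing the pairs list from the snippet above; the fixed seed keeps the split deterministic:

from datasets import Dataset

ds = Dataset.from_list(pairs)                         # prompt / completion records
split = ds.train_test_split(test_size=0.1, seed=42)   # deterministic 90/10 split
train_ds, eval_ds = split["train"], split["test"]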
📦 Loading the base model
• Pull SmolLM2-135M (135 M parameters) straight from the Hugging Face Hub.
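Loading looks roughly like this (the Hub id below is the standard SmolLM2-135M checkpoint; pick a dtype/device that suits your hardware):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM2-135M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)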
🔧 Tokenizer extension
• Add the new control token and resize the model’s embedding matrix so the token’s embedding becomes learnable.
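Continuing from the loading snippet, the extension step typically amounts to two calls:

# register [my_tool_selection] as an additional special token
tokenizer.add_special_tokens({"additional_special_tokens": ["[my_tool_selection]"]})
# grow the embedding matrix so the new token gets its own trainable row
model.resize_token_embeddings(len(tokenizer))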
⚙️ Configuring & running TRL’s SFTTrainer
• Review every key SFTConfig hyper-parameter: epochs, batch size, learning rate, warm-up, logging, evaluation cadence, checkpoints (example config below).
• Monitor training and perform on-the-fly validation with greedy decoding.
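A hedged configuration sketch; argument names shift between trl / transformers releases, and the numbers below are placeholders rather than the exact settings used in the video:

from trl import SFTConfig, SFTTrainer

args = SFTConfig(
    output_dir="smollm2-tool-router",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=2e-4,
    warmup_ratio=0.03,
    logging_steps=10,
    eval_strategy="steps",
    eval_steps=200,
    save_steps=200,
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    processing_class=tokenizer,   # older trl versions take tokenizer= instead
)
trainer.train()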
📤 Model export & quick functional test
• Save the fine-tuned weights and tokenizer.
• Demonstrate the model selecting weather, calculator, and reminder tools on unseen prompts.
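Export plus a quick greedy check, sketched with a made-up query (the inference prompt must match the training format):

out_dir = "smollm2-tool-router-final"
trainer.save_model(out_dir)           # fine-tuned weights
tokenizer.save_pretrained(out_dir)    # extended tokenizer

query = "Will it rain in Oslo tomorrow?"
inputs = tokenizer(query, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=10, do_sample=False)   # greedy decoding
print(tokenizer.decode(out[0], skip_special_tokens=False))           # should end with the control token and a tool name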
🎯 Key Take-Aways
Schema-aware prompting – how a dedicated token turns a general LLM into a reliable tool router.
Parameter-efficient training – SFT on a 4-bit-quantised model delivers strong results without large GPUs (4-bit loading sketch below).
Continuous evaluation – in-pipeline testing helps you catch over- or under-fitting before deployment.
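For the 4-bit route, loading goes through bitsandbytes. This is a sketch, not the video's exact setup; note that quantised base weights stay frozen, so training normally pairs this with small adapters (e.g. LoRA via peft):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model_4bit = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM2-135M", quantization_config=bnb
)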
🛠️ Prerequisites
Python ≥ 3.10
transformers, datasets, trl, bitsandbytes, accelerate, torch (CUDA or MPS/Metal build); install line below
NVIDIA RTX 3060/3050, Apple M-series, or equivalent hardware
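A typical one-line install for the stack above (unpinned; match the torch build to your CUDA or Apple-silicon setup):

pip install transformers datasets trl bitsandbytes accelerate torch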
#llm #huggingface #finetuning #aiengineering #transformers #python #ai #machinelearning #aiagents