Creating a Custom Composer Callback to Track Data Types in LLM Training | Mixed Precision Deep Dive

vishal · 4 weeks ago

In this video, I walk through the process of creating a custom Composer callback for LLM-Foundry that logs the data types of various entities (weights, gradients, activations, optimizer states, and loss) during mixed precision training. I explain:

1. How to integrate custom callbacks into the LLM-Foundry training pipeline
2. The technical challenges of capturing activation inputs in self-attention layers
3. How to use PyTorch's register_forward_hook and monkey patching
4. A deep dive into Python descriptors and the `__get__` method
5. An analysis of data type flow in both FP32 and BF16 mixed precision training

The code and detailed analysis are available in my blog post: https://vishalbakshi.github.io/blog/posts/2025-04-02-Composer-Callback-Logging-dtypes/

This is part of my ongoing exploration of LLM training internals. If you're interested in understanding what happens during mixed precision training, subscribe for more deep dives into LLM training mechanics!

#DeepLearning #LLM #PyTorch #MixedPrecision #LLMFoundry #MachineLearning #AITraining
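To give a flavor of the `register_forward_hook` technique mentioned above, here is a minimal sketch (not the blog's exact code; the model, hook names, and logging dict are illustrative) that records input and output dtypes for each submodule of a toy network under BF16 autocast:

```python
import torch
import torch.nn as nn

# Collected dtype info, keyed by module name (illustrative structure).
dtype_log = {}

def log_dtypes(name):
    # register_forward_hook passes (module, inputs, output) after forward.
    def hook(module, inputs, output):
        dtype_log[name] = {
            "input": [t.dtype for t in inputs if torch.is_tensor(t)],
            "output": output.dtype if torch.is_tensor(output) else None,
        }
    return hook

model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 4))
for name, module in model.named_modules():
    if name:  # skip the root Sequential container
        module.register_forward_hook(log_dtypes(name))

# Mimic mixed precision: autocast runs Linear layers in bfloat16.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    model(torch.randn(2, 8))
```

Note that the hook sees the inputs as they arrive at the module: the first Linear still receives an FP32 tensor (autocast casts it internally), while downstream modules receive the BF16 activations it produced.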
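The descriptor point (item 4) can also be sketched briefly. Plain functions are descriptors: attribute access on an instance triggers `func.__get__(instance, cls)`, which is what produces a bound method, and it is why assigning a patched function to the class is enough to monkey patch every instance. The `Attention` class and `patched_forward` below are hypothetical stand-ins, not the code from the video:

```python
class Attention:
    def forward(self, x):
        return x * 2

calls = []

# Keep a reference to the original so the patch can delegate to it.
_orig_forward = Attention.forward

def patched_forward(self, x):
    calls.append(type(x).__name__)  # e.g. record something about the input
    return _orig_forward(self, x)

# Patch on the class: instance lookups now go through
# patched_forward.__get__, so `self` is bound automatically.
Attention.forward = patched_forward

attn = Attention()
result = attn.forward(3)  # calls patched_forward(attn, 3)

# The same binding, spelled out via the descriptor protocol:
bound = Attention.forward.__get__(attn, Attention)
same = bound(3)
```

Both calls go through the patch, so `calls` ends up with two entries and `result == same == 6`.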
