Florence-2: Fine-tune Microsoft’s Multimodal Model

Roboflow 23,293 10 months ago


Learn how to fine-tune Microsoft's Florence-2, a powerful open-source Vision Language Model, for custom object detection tasks. This in-depth tutorial guides you through setting up your environment in Google Colab, preparing datasets, and optimizing the model using LoRA.

Chapters:
- 00:00 Introduction: Unlock the Power of Florence-2
- 01:09 Getting Started: Prepare for VLM Fine-Tuning
- 03:55 Florence-2 in Action: Explore Pre-trained Capabilities
- 07:00 Dataset Deep Dive: PyTorch Data Loading for Florence-2
- 13:02 LoRA: Optimize Your VLM Training
- 14:21 Fine-Tuning: Unleash Florence-2's Custom Object Detection
- 17:30 Model Evaluation: Measure Your VLM's Success
- 21:37 Florence-2 vs Other Computer Vision Models
- 24:09 Conclusion and Next Steps

Resources:
- Roboflow: https://roboflow.com
- 🔴 Community Session July 3rd, 2024 at 08:00 AM PST / 11:00 AM EST / 05:00 PM CET: https://roboflow.stream
- ⭐ Notebooks GitHub: https://github.com/roboflow/notebooks
- 📓 Florence notebook: https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/how-to-finetune-florence-2-on-detection-dataset.ipynb
- Try Florence-2 for free: https://playground.roboflow.com/
- 🗞 Florence-2 arXiv paper: https://arxiv.org/abs/2311.06242
- 🗞 Florence-2 overview blog post: https://blog.roboflow.com/florence-2
- 🗞 Florence-2 fine-tuning blog post: https://blog.roboflow.com/fine-tune-florence-2-object-detection
- 🔗 Florence-2 HF Space: https://huggingface.co/spaces/gokaygokay/Florence-2
- 🗞 Mean Average Precision (mAP) blog post: https://blog.roboflow.com/mean-average-precision
- 🗞 Confusion Matrix blog post: https://blog.roboflow.com/what-is-a-confusion-matrix

Stay updated with the projects I'm working on at https://github.com/roboflow and https://github.com/SkalskiP! ⭐
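For readers who want a feel for the dataset preparation step before watching, here is a minimal sketch of how a detection annotation might be turned into a text target for Florence-2. It assumes the convention described in the Florence-2 paper, where box coordinates are quantized into 1000 bins and written as `<loc_N>` tokens after the class label. The helper names `quantize` and `detection_suffix` are hypothetical and are not taken from the notebook; treat the exact rounding rule as an assumption as well.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def quantize(value: float, size: int, bins: int = 1000) -> int:
    """Map a pixel coordinate to a bin index in [0, bins - 1].

    Assumption: floor-style quantization with 1000 bins per axis.
    """
    index = int(value * bins / size)
    return max(0, min(bins - 1, index))


def detection_suffix(boxes: List[Tuple[str, Box]], width: int, height: int) -> str:
    """Build a target string such as 'label<loc_x1><loc_y1><loc_x2><loc_y2>'."""
    parts = []
    for label, (x_min, y_min, x_max, y_max) in boxes:
        coords = (
            quantize(x_min, width),
            quantize(y_min, height),
            quantize(x_max, width),
            quantize(y_max, height),
        )
        # Each box becomes the class label followed by four location tokens.
        parts.append(label + "".join(f"<loc_{c}>" for c in coords))
    return "".join(parts)


if __name__ == "__main__":
    # One hypothetical box on a 640x480 image.
    print(detection_suffix([("card", (64, 48, 320, 240))], width=640, height=480))
```

Because the coordinates are expressed relative to the image size, the same target string stays valid if the image is resized before it reaches the model.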
