In this video, I walk through how to fine-tune BERT (110M parameters) to classify phishing URLs. I first cover key concepts and then share example Python code.
Resources:
📰 Blog: https://medium.com/towards-data-science/fine-tuning-bert-for-text-classification-a01f89b179fc?sk=1fdd9847fcfa70c772f317d3eeeaec07
🎥 Fine-tuning LLMs: https://youtu.be/eC6Hd1hFvos
🎥 Compressing LLMs: https://youtu.be/FLkUOkeMd5M
💻 GitHub Repo: https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/model-compression
🤗 Model: https://huggingface.co/shawhin/bert-phishing-classifier_teacher
💿 Dataset: https://huggingface.co/datasets/shawhin/phishing-site-classification
References:
[1] https://arxiv.org/abs/1810.04805
[2] https://arxiv.org/abs/1511.01432
[3] https://arxiv.org/abs/1801.06146
--
Homepage: https://www.shawhintalebi.com
Intro - 0:00
Fine-tuning - 0:55
BERT - 5:09
Text Classification - 8:39
Example (Motivation) - 9:41
Example Code: Fine-tuning BERT on Phishing URLs - 10:55