This video covers the extra optimizations that XGBoost uses when the training dataset is huge: the Approximate Greedy Algorithm, Parallel Learning, the Weighted Quantile Sketch, Sparsity-Aware Split Finding (i.e., how XGBoost deals with missing data and uses default paths), Cache-Aware Access, and Blocks for Out-of-Core Computation. That's a lot of stuff, but we'll go through it step by step, and it will be a whole lot of fun. :)
NOTE: This StatQuest assumes that you are already familiar with...
XGBoost Part 1: XGBoost Trees for Regression: https://youtu.be/OtD8wVaFm6E
XGBoost Part 2: XGBoost Trees for Classification: https://youtu.be/8b1JEDvenQU
Quantiles and Percentiles: https://youtu.be/IFKQLDmRK0Y
For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join
...buying one of my books, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
https://statquest.org/statquest-store/
...or just donating to StatQuest!
https://www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
https://twitter.com/joshuastarmer
#statquest #xgboost