Recording of the Oslo useR Group meetup held on 18 June 2020: https://www.meetup.com/Oslo-useR-Group/events/270883033/
Description:
The machine learning package caret (Classification And REgression Training) provides functions that streamline model training for complex regression and classification problems. R has many modeling functions, and the underlying algorithms differ in syntax, tuning parameters, and required data format. Caret offers a uniform interface to more than 230 machine learning models without loading them all at startup. It also standardizes related tasks such as data splitting, pre-processing, feature selection, variable importance estimation, resampling, model comparison, parallel processing, and visualization. It may take some time to get up to speed with caret, but hopefully this presentation will ease the transition. Caret is all you need to know to solve almost any supervised machine learning problem!
The speaker, Silje Nord, currently works as a Data Scientist at Amesto Nextbridge and has over 15 years of experience in large-scale data analysis. She has a background as a cancer researcher, where she headed several successful research projects focusing on, among other things, method development and the search for pan-cancer similarities in multidimensional data.
To join one of our live meetups, check out https://www.meetup.com/Oslo-useR-Group.
Timestamps:
00:00 Introduction
01:29 Why Caret?
04:54 Pre-processing, feature selection and feature importance
09:20 Normalization of predictors
13:42 Data splitting/partitioning
Training & testing functions
15:54 - train()
19:15 - trainControl()
22:33 - tuneGrid (argument to train())
24:29 - predict()
Model comparison
25:25 - confusionMatrix()
26:57 - varImp()
Other functionality
27:50 - Ensemble modeling
32:23 - Alternative packages
32:45 - Further reading
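
For readers who want to try the functions named in the timestamps before watching, here is a minimal sketch of how they typically fit together. It is not taken from the talk: the built-in iris data set, the k-nearest-neighbours model, the 80/20 split, the 5-fold cross-validation, and the candidate values of k are all assumptions chosen for illustration.

```r
# Minimal caret workflow sketch (illustrative choices, not the speaker's code).
library(caret)

set.seed(42)  # make the random split and resampling reproducible

# Data splitting: stratified 80/20 split on the outcome (assumed proportions).
idx      <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_df <- iris[idx, ]
test_df  <- iris[-idx, ]

# Resampling scheme: 5-fold cross-validation (assumed setting).
ctrl <- trainControl(method = "cv", number = 5)

# Train a k-nearest-neighbours classifier, centering and scaling the
# predictors, and tune k over a small hand-picked grid (assumed values).
fit <- train(
  Species ~ .,
  data       = train_df,
  method     = "knn",
  preProcess = c("center", "scale"),
  trControl  = ctrl,
  tuneGrid   = data.frame(k = c(3, 5, 7))
)

# Predict on the held-out rows and summarise performance.
pred <- predict(fit, newdata = test_df)
cm   <- confusionMatrix(pred, test_df$Species)
print(cm)
```

Switching to another model should mostly be a matter of changing the method argument and the tuneGrid columns to match that model's tuning parameters.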