Data Splitting using Cross Validation and Bootstrap in R

statsguidetree 2,025 views 3 years ago


☕If you would like to support, consider buying me a coffee ☕: https://buymeacoffee.com/statsguide8
For one-on-one tutoring/consultation services: https://guide-tree-statistics-consultation.square.site
I offer one-on-one tutoring/consultation services for many topics related to statistics/machine learning. You can also email me at statsguidetree@gmail.com
For the R code and dataset: https://gist.github.com/musa5237
This video is an R tutorial on data splitting (i.e., model validation or data partitioning) methods, using the caret package to estimate model accuracy and error. I go over the following methods: train/test holdout, leave-one-out cross-validation, k-fold cross-validation, repeated k-fold cross-validation, and the .632 bootstrap. The dataset I use is the heart disease dataset. For a review of logistic regression models, please check out the video:
https://www.youtube.com/watch?v=y4FY0KNJ6nk&t=1353s
For the formulas used to calculate the confusion matrix metrics shown in the output:
https://rdrr.io/cran/caret/man/confusionMatrix.html
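
For readers who want a quick feel for the methods listed in the description before opening the gist, here is a minimal sketch of how each one might be set up with caret. It is not the code from the video: it assumes caret is installed, substitutes the built-in `mtcars` data (with `am` as the binary outcome) for the heart disease dataset, and uses a single predictor (`mpg`), a 70/30 split, 5 folds, 3 repeats, and 100 bootstrap resamples as purely illustrative choices.

```r
# Assumes the caret package is installed: install.packages("caret")
library(caret)

set.seed(123)

# Stand-in data (assumption): mtcars ships with R; `am` is the binary outcome.
# This is NOT the heart disease dataset used in the video.
dat <- mtcars
dat$am <- factor(dat$am, levels = c(0, 1), labels = c("automatic", "manual"))

# 1. Train/test holdout: stratified 70/30 split on the outcome
train_idx <- createDataPartition(dat$am, p = 0.7, list = FALSE)
train_set <- dat[train_idx, ]
test_set  <- dat[-train_idx, ]

# Fit a logistic regression on the training set only (no resampling)
holdout_fit <- train(am ~ mpg, data = train_set, method = "glm",
                     family = "binomial",
                     trControl = trainControl(method = "none"))

# Evaluate on the held-out test set
holdout_pred <- predict(holdout_fit, newdata = test_set)
print(confusionMatrix(holdout_pred, test_set$am))

# 2-5. The resampling schemes expressed as trainControl() settings
controls <- list(
  loocv        = trainControl(method = "LOOCV"),
  kfold        = trainControl(method = "cv", number = 5),
  repeated_cv  = trainControl(method = "repeatedcv", number = 5, repeats = 3),
  bootstrap632 = trainControl(method = "boot632", number = 100)
)

# Fit the same logistic regression under each scheme, using the full data
fits <- lapply(controls, function(ctrl) {
  train(am ~ mpg, data = dat, method = "glm", family = "binomial",
        trControl = ctrl)
})

# Resampled accuracy estimate for each scheme
accuracy_by_method <- sapply(fits, function(f) f$results$Accuracy)
print(accuracy_by_method)
```

With only 32 rows in the stand-in data, the estimates will vary noticeably between schemes and seeds, so the numbers are useful only for comparing how the methods are specified, not as a benchmark.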

crossvalidation

caret

cross-validation

loocv

loo-cv

k-fold-cv

kfold

data validation

test-train

random sampling

with replacement

without replacement

boot strap

bootstrap .632

statsguidetree

model accuracy

model error

logistic regression

model validation
