TIME SERIES CLUSTERING | HDBSCAN for Clustering 811 Products Sales

Timely Time Series 549 7 months ago

In this video, we learn HDBSCAN, a density-based clustering algorithm, and then apply it to find clusters of weekly sales transactions. HDBSCAN can be used with any distance metric, but we will use only two: Euclidean and Dynamic Time Warping (DTW). We will see how the clustering results differ between the two distance measures.

Source code:
https://www.kaggle.com/code/leesstephanie/sales-clustering-with-hdbscan/notebook (real data set)
https://www.kaggle.com/leesstephanie/hdbscan-for-time-series-clustering (synthetic data set)
https://github.com/stephanielees/HDBSCAN_WeeklySales

More explanation about linkage: https://youtube.com/clip/Ugkx1r3GK144oS4SAi-2L2f2INBNc8D9z39_?si=GXSrJF_7bVQ_X4t9

00:00 Intro
01:16 The intuition of HDBSCAN
01:53 Preparing for going through HDBSCAN algorithm
05:58 Core distance
07:18 Mutual reachability distance
10:38 Minimum Spanning Tree
11:26 Single Linkage Tree, Condensed Tree
20:39 Cluster selection

Application with Python:
24:09 Load data
26:51 Visualization
28:39 Apply HDBSCAN with Euclidean distance
33:35 Apply HDBSCAN with DTW distance
37:07 Discussion

#timeseries #clustering #machinelearning #retailsales #sales #datascience #pythonprogramming #timeseriesclustering
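
For readers who want to try the ideas before opening the notebooks, here is a minimal sketch of the two distance measures and of the first two HDBSCAN steps (core distance and mutual reachability distance). It is not the code from the video or the linked notebooks. The function names, the toy series, and the choice to define the core distance as the distance to the k-th nearest neighbour excluding the point itself are all assumptions made for illustration; libraries may count neighbours differently.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Dynamic Time Warping distance with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def distance_matrix(series, dist):
    """Symmetric pairwise distance matrix for a list of series."""
    n = len(series)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = dist(series[i], series[j])
    return d

def core_distances(d, k):
    """Distance from each point to its k-th nearest neighbour
    (assumption: the point itself is not counted as a neighbour)."""
    return [sorted(row[:i] + row[i + 1:])[k - 1] for i, row in enumerate(d)]

def mutual_reachability(d, k):
    """Mutual reachability: max(core(a), core(b), d(a, b)) for a != b."""
    core = core_distances(d, k)
    n = len(d)
    return [[0.0 if i == j else max(core[i], core[j], d[i][j])
             for j in range(n)] for i in range(n)]

# Hypothetical toy "weekly sales" series: b is a shifted by one week, c is flat.
a = [0, 1, 2, 1, 0]
b = [0, 0, 1, 2, 1]
c = [5, 5, 5, 5, 5]

print(euclidean(a, b))  # 2.0
print(dtw(a, b))        # 1.0

d_euc = distance_matrix([a, b, c], euclidean)
print(mutual_reachability(d_euc, k=2))
```

On this toy data DTW rates the shifted pair as closer than Euclidean distance does, because it is allowed to align the peaks. With k=2 the mutual reachability between a and b grows to the distance from a to c, which is how HDBSCAN pushes points in sparse regions away from each other before it builds the minimum spanning tree.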
