

Introduction to Cluster Analysis with R - an Example

Dr. Bharatendra Rai 206,159 9 years ago

Provides an illustration of cluster analysis with R.

R code: https://github.com/bkrai/Top-10-Machine-Learning-Methods-With-R
Data file link and more on cluster analysis: https://youtu.be/otjWCaMcVaA

Cluster analysis is a statistical technique for grouping similar objects or data points based on their characteristics. The goal is to identify patterns or structure in the data without any prior knowledge of the groups. Using a measure of similarity or distance between objects, cluster analysis divides the data into distinct clusters whose members are more similar to one another than to members of other clusters. The method is widely used in fields such as marketing (customer segmentation), biology (classifying species), and machine learning (exploratory data analysis).

For citation as a reference in a research paper, use:
Meshram, A., and Rai, B. (2019). “User-Independent Detection for Freezing of Gait in Parkinson’s Disease Using Random Forest Classification,” International Journal of Big Data and Analytics in Healthcare, Vol. 4, Issue 1, 57-72.
Rai, B. K. (2017). “Feature Selection and Predictive Modeling of Housing Data Using Random Forest,” International Journal of Social, Behavioral, Educational, Economic, Business and Industrial Engineering, Vol. 11, No. 4, 880-884.
Xiaoling, Lu., Rai, B., Yan, Z., and Li, Y. (2018). “Cluster-based Smartphone Predictive Analytics for Application Usage and Next Location Prediction,” International Journal of Business Intelligence Research, Vol. 9, No. 2, 64-80.

Topics
00:00 Read data file
00:45 Scatter plot
02:30 Data normalization
04:27 Calculate Euclidean distance
05:54 Cluster dendrogram with complete linkage
08:20 Cluster dendrogram with average linkage
08:52 Cluster membership
10:47 Cluster means
12:35 Silhouette plot
13:31 Scree plot
14:47 Non-hierarchical k-means clustering & interpretation

Cluster analysis is an important tool for analyzing big data and for working in the data science field.
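The topics listed above can be sketched in a few lines of base R. This is a minimal sketch and not the code from the video or the linked repository: it assumes the built-in USArrests data set as a stand-in for the video's data file, and the choice of 3 clusters and the variable names (z, distance, member_c, and so on) are illustrative assumptions.

```r
# Stand-in data (assumption): built-in USArrests, 50 states x 4 numeric variables
data <- USArrests

# Normalization: subtract each column's mean and divide by its standard deviation
z <- scale(data)

# Euclidean distance between every pair of rows
distance <- dist(z, method = "euclidean")

# Hierarchical clustering with complete and average linkage
hc_complete <- hclust(distance, method = "complete")
hc_average  <- hclust(distance, method = "average")

# Cluster membership for a 3-cluster cut (3 is an arbitrary choice here)
member_c <- cutree(hc_complete, k = 3)
member_a <- cutree(hc_average, k = 3)
table(member_c, member_a)  # cross-tabulate the two solutions

# Cluster means on the normalized and on the original scale
aggregate(z, by = list(cluster = member_c), FUN = mean)
aggregate(data, by = list(cluster = member_c), FUN = mean)

# Scree values: total within-cluster sum of squares for k = 1..10
set.seed(123)
wss <- sapply(1:10, function(k) kmeans(z, centers = k, nstart = 10)$tot.withinss)

# Non-hierarchical k-means with 3 clusters
set.seed(123)
kc <- kmeans(z, centers = 3, nstart = 10)
kc$size     # number of observations per cluster
kc$centers  # cluster centres on the normalized scale
```

Calling plot(hc_complete) or plot(hc_average) draws the corresponding dendrogram, and plot(1:10, wss, type = "b") draws the scree plot. A silhouette plot additionally needs the silhouette() function from the cluster package, which is left out here to keep the sketch in base R.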
Machine Learning videos: https://goo.gl/WHHqWP
Becoming Data Scientist: https://goo.gl/JWyyQc
Introductory R Videos: https://goo.gl/NZ55SJ
Deep Learning with TensorFlow: https://goo.gl/5VtSuC
Image Analysis & Classification: https://goo.gl/Md3fMi
Text mining: https://goo.gl/7FJGmd
Data Visualization: https://goo.gl/Q7Q2A8
Playlist: https://goo.gl/iwbhnE

R is a free software environment for statistical computing and graphics, widely used in both academia and industry. It runs on both Windows and macOS. It was ranked no. 1 in a KDnuggets poll on top languages for analytics, data mining, and data science. RStudio is a popular, user-friendly environment for working with R.
