The main content is here:
https://asecuritysite.com/cyberdata/ch13
Introduction to Splunk: https://youtu.be/bOQmd6B8jGo
Splunk and ML Part 1: https://www.youtube.com/watch?v=MYg1dhp1rzo
Splunk and ML Part 2: https://www.youtube.com/watch?v=H-qkbIH0v-c
== Anomaly Detection
| inputlookup iris.csv
| fit LocalOutlierFactor petal_length petal_width n_neighbors=10 algorithm=kd_tree metric=minkowski p=1 contamination=0.14 leaf_size=10
| inputlookup iris.csv
| fit OneClassSVM * kernel="poly" nu=0.5 coef0=0.5 gamma=0.5 tol=1 degree=3 shrinking=f into TESTMODEL_OneClassSVM
| inputlookup call_center.csv
| fit DensityFunction count by "source" into mymodel
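A model saved with `into` can be reused with the MLTK `apply` command. A minimal sketch, assuming the DensityFunction model above was saved as mymodel and that it writes its outlier flag to a field named IsOutlier(count). The field name is an assumption based on the usual MLTK output naming, so check the output of your own search:

```
| inputlookup call_center.csv
| apply mymodel
| where 'IsOutlier(count)'=1
```

The single quotes are needed in `where` because the field name contains parentheses.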
== Prediction
| inputlookup iris.csv
| fit AutoPrediction petal_length from * max_features=0.1 into auto_classify_model test_split_ratio=0.3 random_state=42
| inputlookup iris.csv
| fit BernoulliNB petal_length from * into TESTMODEL_BernoulliNB alpha=0.5 binarize=0 fit_prior=f
| inputlookup iris.csv
| fit DecisionTreeClassifier petal_length from * into MOD
| inputlookup iris.csv
| fit GaussianNB petal_length from * into MOD
| inputlookup iris.csv
| fit LogisticRegression petal_length from * into MOD
| inputlookup iris.csv
| fit MLPClassifier petal_length from * into MOD
| inputlookup iris.csv
| fit RandomForestClassifier petal_length from * into MOD
| inputlookup iris.csv
| fit SGDClassifier petal_length from * into MOD
| inputlookup iris.csv
| fit SVM petal_length from * into MOD
| inputlookup iris.csv
| fit GradientBoostingClassifier petal_length from * into MOD
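Most of the commands above save into the same model name, MOD, so each fit replaces the model stored by the previous one. A sketch of inspecting and reusing whichever model was saved last, using the MLTK `summary` and `apply` commands. The output field name predicted_petal_length is a hypothetical choice, and `summary` may not be supported by every algorithm:

```
| summary MOD
```

```
| inputlookup iris.csv
| apply MOD as predicted_petal_length
```

Giving each model its own name (as the numeric prediction examples do) lets you keep several models and compare them.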
== Prediction (Numeric)
| inputlookup track_day_missing.csv
| fit AutoPrediction batteryVoltage target_type=numeric test_split_ratio=0.7 from * into PM
| inputlookup track_day_missing.csv
| fit DecisionTreeRegressor batteryVoltage from * into PM
| inputlookup track_day_missing.csv
| fit ElasticNet batteryVoltage from * into EN
| inputlookup track_day_missing.csv
| fit GradientBoostingRegressor batteryVoltage from * into GB
| inputlookup track_day_missing.csv
| fit KernelRidge batteryVoltage from * into KR
| inputlookup track_day_missing.csv
| fit Lasso batteryVoltage from * into LA
| inputlookup track_day_missing.csv
| fit LinearRegression batteryVoltage from * into LR
| inputlookup track_day_missing.csv
| fit RandomForestRegressor batteryVoltage min_samples_split=30000 from * into RF
| inputlookup track_day_missing.csv
| fit Ridge batteryVoltage from * into RD
| inputlookup track_day_missing.csv
| fit SGDRegressor batteryVoltage from * into SG
| inputlookup app_usage.csv
| fit SystemIdentification Expenses from HR1 HR2 ERP dynamics=3-2-2-3 layers=64-64-64
== Cluster
| inputlookup iris.csv
| fit Birch petal_length k=3 partial_fit=true into MOD
| inputlookup iris.csv
| fit DBSCAN petal_length min_samples=4
| inputlookup iris.csv
| fit GMeans petal_length random_state=42 into MOD3
[GMeans is based on k-means and selects the number of clusters automatically]
| inputlookup iris.csv
| fit KMeans petal_length k=3 into MOD4
| inputlookup iris.csv
| fit SpectralClustering petal_length k=3
| inputlookup iris.csv
| fit XMeans petal_length
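A sketch of reusing a saved clustering model, assuming the KMeans model saved as MOD4 above and that the cluster label is written to the default output field cluster:

```
| inputlookup iris.csv
| apply MOD4
| stats count by cluster
```

This gives a quick view of how many rows fall into each cluster.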
== Feature Extraction
| inputlookup track_day.csv
| fit FieldSelector batteryVoltage from engineCoolantTemperature, engineSpeed, lateralGForce, longitudeGForce, speed type=numeric
| inputlookup track_day.csv
| fit FieldSelector vehicleType from engineCoolantTemperature, engineSpeed, lateralGForce, longitudeGForce, speed type=categorical
| inputlookup passwords.csv
| fit HashingVectorizer Passwords ngram_range=1-2 k=10
| inputlookup track_day.csv
| fit ICA batteryVoltage, engineSpeed n_components=2 as IC
| inputlookup track_day.csv
| fit KernelPCA batteryVoltage, engineSpeed k=3 gamma=0.001 as PCA
| inputlookup track_day.csv
| fit NPR vehicleType from engineSpeed as npr01
| inputlookup track_day.csv
| fit PCA engineCoolantTemperature, engineSpeed, lateralGForce, speed k=3 as pca01
| inputlookup track_day.csv
| fit TFIDF vehicleType ngram_range=1-2 max_df=0.6 min_df=0.2 stop_words=english as tf01
== Preprocessing
| inputlookup track_day_missing.csv
| fit Imputer batteryVoltage
| inputlookup track_day_missing.csv
| fit RobustScaler *
| inputlookup track_day_missing.csv
| fit StandardScaler *