Including a neural net in the shrubland ensemble. 2022-01-15
Last iteration of these models: 2022-01-12
Probability thresholds were chosen using the validation set (below), either to balance specificity and sensitivity ("Optimize Both") or to reach a target specificity of 90%, 95%, 97.5%, or 99%.
Model | Probability Threshold | Specificity | Sensitivity |
---|---|---|---|
Optimize Both | |||
Linear Ensemble | 0.493 | 0.786 | 0.828 |
Neural Net | 0.441 | 0.748 | 0.847 |
LGB | 0.484 | 0.755 | 0.842 |
RF | 0.517 | 0.746 | 0.752 |
90% Specificity | |||
Linear Ensemble | 0.759 | 0.909 | 0.637 |
Neural Net | 0.754 | 0.903 | 0.597 |
LGB | 0.730 | 0.910 | 0.622 |
RF | 0.604 | 0.907 | 0.444 |
95% Specificity | |||
Linear Ensemble | 0.854 | 0.960 | 0.441 |
Neural Net | 0.875 | 0.949 | 0.382 |
LGB | 0.832 | 0.961 | 0.434 |
RF | 0.637 | 0.950 | 0.331 |
97.5% Specificity | |||
Linear Ensemble | 0.891 | 0.983 | 0.277 |
Neural Net | 0.935 | 0.977 | 0.247 |
LGB | 0.882 | 0.984 | 0.298 |
RF | 0.666 | 0.980 | 0.219 |
99% Specificity | |||
Linear Ensemble | 0.910 | 0.992 | 0.138 |
Neural Net | 0.973 | 0.993 | 0.097 |
LGB | 0.922 | 0.992 | 0.173 |
RF | 0.691 | 0.993 | 0.110 |
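One way to pick a threshold for a target specificity is to scan candidate cutoffs on the validation scores and keep the lowest one that meets the target. The sketch below is an illustration in plain Python, not the code used for this post (the output shown here comes from R). The function names and the toy scores are hypothetical, and it assumes a pixel is called positive when its score is at or above the threshold.

```python
def specificity(scores, labels, threshold):
    """Share of true negatives (label 0) scored below the threshold."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    return sum(s < threshold for s in negatives) / len(negatives)


def sensitivity(scores, labels, threshold):
    """Share of true positives (label 1) scored at or above the threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= threshold for s in positives) / len(positives)


def threshold_for_specificity(scores, labels, target):
    """Lowest candidate threshold whose specificity meets the target.

    Candidates are the observed scores plus +inf, which classifies
    everything as negative and therefore always reaches specificity 1.
    """
    for t in sorted(set(scores)) + [float("inf")]:
        if specificity(scores, labels, t) >= target:
            return t


# Hypothetical validation scores and labels, for illustration only.
val_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.80, 0.95]
val_labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

t80 = threshold_for_specificity(val_scores, val_labels, 0.8)
print(t80, sensitivity(val_scores, val_labels, t80))
```

Because the threshold is tuned on one data set, the specificity achieved on any other data set will usually drift a little from the target, which is consistent with the table above showing values such as 0.909 and 0.903 rather than exactly 0.900.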
Optimize Both
Logistic Ensemble
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1315 286
1 357 1376
Accuracy : 0.8071
95% CI : (0.7933, 0.8204)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.6143
Mcnemar's Test P-Value : 0.005771
Sensitivity : 0.8279
Specificity : 0.7865
Pos Pred Value : 0.7940
Neg Pred Value : 0.8214
Prevalence : 0.4985
Detection Rate : 0.4127
Detection Prevalence : 0.5198
Balanced Accuracy : 0.8072
'Positive' Class : 1
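The summary statistics follow directly from the four cell counts. As a check, this Python sketch recomputes several of them for the Logistic Ensemble matrix above. The function name is hypothetical; rows are read as predictions and columns as reference labels, with class 1 as the positive class.

```python
def confusion_stats(tn, fn, fp, tp):
    """Derive common metrics from a 2x2 confusion matrix.

    tn, fn: the 'Prediction 0' row (reference 0, reference 1)
    fp, tp: the 'Prediction 1' row (reference 0, reference 1)
    """
    n = tn + fn + fp + tp
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tn + fn) * (tn + fp) + (fp + tp) * (fn + tp)) / n**2
    return {
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
        "sensitivity": sens,
        "specificity": spec,
        "pos_pred_value": tp / (tp + fp),
        "neg_pred_value": tn / (tn + fn),
        "prevalence": (tp + fn) / n,
        "detection_rate": tp / n,
        "detection_prevalence": (tp + fp) / n,
        "balanced_accuracy": (sens + spec) / 2,
        # Accuracy of always guessing the larger reference class.
        "no_information_rate": max(tn + fp, tp + fn) / n,
    }


stats = confusion_stats(tn=1315, fn=286, fp=357, tp=1376)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```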
Neural Net
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1251 254
1 421 1408
Accuracy : 0.7975
95% CI : (0.7835, 0.8111)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.5952
Mcnemar's Test P-Value : 0.0000000001666
Sensitivity : 0.8472
Specificity : 0.7482
Pos Pred Value : 0.7698
Neg Pred Value : 0.8312
Prevalence : 0.4985
Detection Rate : 0.4223
Detection Prevalence : 0.5486
Balanced Accuracy : 0.7977
'Positive' Class : 1
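McNemar's test asks whether the two kinds of error are balanced, using only the off-diagonal cells. The reported P-values appear consistent with the continuity-corrected chi-squared form of the test; that is an inference from the numbers, not something the output states. A minimal Python sketch, with a hypothetical function name:

```python
import math


def mcnemar_p_value(fn, fp):
    """Continuity-corrected McNemar test on the two off-diagonal counts.

    The statistic is compared with a chi-squared distribution on one
    degree of freedom, whose upper tail is erfc(sqrt(x / 2)).
    """
    chi2 = (abs(fn - fp) - 1) ** 2 / (fn + fp)
    return math.erfc(math.sqrt(chi2 / 2))


# Off-diagonal counts from the Neural Net matrix above:
# 254 false negatives and 421 false positives.
print(mcnemar_p_value(fn=254, fp=421))
```

A small P-value here means the model makes noticeably more of one error type than the other. The Random Forest at its "Optimize Both" threshold, with 412 and 424 off-diagonal counts, is the one case in this group where the errors are roughly balanced.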
LightGBM
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1262 262
1 410 1400
Accuracy : 0.7984
95% CI : (0.7844, 0.8119)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.597
Mcnemar's Test P-Value : 0.00000001423
Sensitivity : 0.8424
Specificity : 0.7548
Pos Pred Value : 0.7735
Neg Pred Value : 0.8281
Prevalence : 0.4985
Detection Rate : 0.4199
Detection Prevalence : 0.5429
Balanced Accuracy : 0.7986
'Positive' Class : 1
Random Forest
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1248 412
1 424 1250
Accuracy : 0.7493
95% CI : (0.7342, 0.7639)
No Information Rate : 0.5015
P-Value [Acc > NIR] : <2e-16
Kappa : 0.4985
Mcnemar's Test P-Value : 0.7036
Sensitivity : 0.7521
Specificity : 0.7464
Pos Pred Value : 0.7467
Neg Pred Value : 0.7518
Prevalence : 0.4985
Detection Rate : 0.3749
Detection Prevalence : 0.5021
Balanced Accuracy : 0.7493
'Positive' Class : 1
90% Specificity
Logistic Ensemble
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1520 603
1 152 1059
Accuracy : 0.7735
95% CI : (0.759, 0.7877)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.5467
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.6372
Specificity : 0.9091
Pos Pred Value : 0.8745
Neg Pred Value : 0.7160
Prevalence : 0.4985
Detection Rate : 0.3176
Detection Prevalence : 0.3632
Balanced Accuracy : 0.7731
'Positive' Class : 1
Neural Net
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1509 670
1 163 992
Accuracy : 0.7501
95% CI : (0.7351, 0.7648)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.4998
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.5969
Specificity : 0.9025
Pos Pred Value : 0.8589
Neg Pred Value : 0.6925
Prevalence : 0.4985
Detection Rate : 0.2975
Detection Prevalence : 0.3464
Balanced Accuracy : 0.7497
'Positive' Class : 1
LightGBM
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1521 628
1 151 1034
Accuracy : 0.7663
95% CI : (0.7516, 0.7806)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.5323
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.6221
Specificity : 0.9097
Pos Pred Value : 0.8726
Neg Pred Value : 0.7078
Prevalence : 0.4985
Detection Rate : 0.3101
Detection Prevalence : 0.3554
Balanced Accuracy : 0.7659
'Positive' Class : 1
Random Forest
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1516 924
1 156 738
Accuracy : 0.6761
95% CI : (0.6599, 0.6919)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.3512
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.4440
Specificity : 0.9067
Pos Pred Value : 0.8255
Neg Pred Value : 0.6213
Prevalence : 0.4985
Detection Rate : 0.2214
Detection Prevalence : 0.2681
Balanced Accuracy : 0.6754
'Positive' Class : 1
95% Specificity
Logistic Ensemble
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1605 929
1 67 733
Accuracy : 0.7013
95% CI : (0.6854, 0.7168)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.4016
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.4410
Specificity : 0.9599
Pos Pred Value : 0.9162
Neg Pred Value : 0.6334
Prevalence : 0.4985
Detection Rate : 0.2199
Detection Prevalence : 0.2400
Balanced Accuracy : 0.7005
'Positive' Class : 1
Neural Net
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1587 1027
1 85 635
Accuracy : 0.6665
95% CI : (0.6502, 0.6825)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.3318
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.3821
Specificity : 0.9492
Pos Pred Value : 0.8819
Neg Pred Value : 0.6071
Prevalence : 0.4985
Detection Rate : 0.1905
Detection Prevalence : 0.2160
Balanced Accuracy : 0.6656
'Positive' Class : 1
LightGBM
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1607 940
1 65 722
Accuracy : 0.6986
95% CI : (0.6827, 0.7141)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.3962
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.4344
Specificity : 0.9611
Pos Pred Value : 0.9174
Neg Pred Value : 0.6309
Prevalence : 0.4985
Detection Rate : 0.2166
Detection Prevalence : 0.2361
Balanced Accuracy : 0.6978
'Positive' Class : 1
Random Forest
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1589 1112
1 83 550
Accuracy : 0.6416
95% CI : (0.625, 0.6579)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.2818
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.3309
Specificity : 0.9504
Pos Pred Value : 0.8689
Neg Pred Value : 0.5883
Prevalence : 0.4985
Detection Rate : 0.1650
Detection Prevalence : 0.1899
Balanced Accuracy : 0.6406
'Positive' Class : 1
97.5% Specificity
Logistic Ensemble
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1644 1202
1 28 460
Accuracy : 0.6311
95% CI : (0.6144, 0.6475)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.2606
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.2768
Specificity : 0.9833
Pos Pred Value : 0.9426
Neg Pred Value : 0.5777
Prevalence : 0.4985
Detection Rate : 0.1380
Detection Prevalence : 0.1464
Balanced Accuracy : 0.6300
'Positive' Class : 1
Neural Net
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1633 1251
1 39 411
Accuracy : 0.6131
95% CI : (0.5963, 0.6297)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.2245
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.2473
Specificity : 0.9767
Pos Pred Value : 0.9133
Neg Pred Value : 0.5662
Prevalence : 0.4985
Detection Rate : 0.1233
Detection Prevalence : 0.1350
Balanced Accuracy : 0.6120
'Positive' Class : 1
LightGBM
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1645 1166
1 27 496
Accuracy : 0.6422
95% CI : (0.6256, 0.6585)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.2829
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.2984
Specificity : 0.9839
Pos Pred Value : 0.9484
Neg Pred Value : 0.5852
Prevalence : 0.4985
Detection Rate : 0.1488
Detection Prevalence : 0.1569
Balanced Accuracy : 0.6411
'Positive' Class : 1
Random Forest
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1638 1298
1 34 364
Accuracy : 0.6005
95% CI : (0.5836, 0.6172)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.1991
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.2190
Specificity : 0.9797
Pos Pred Value : 0.9146
Neg Pred Value : 0.5579
Prevalence : 0.4985
Detection Rate : 0.1092
Detection Prevalence : 0.1194
Balanced Accuracy : 0.5993
'Positive' Class : 1
99% Specificity
Logistic Ensemble
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1659 1433
1 13 229
Accuracy : 0.5663
95% CI : (0.5493, 0.5832)
No Information Rate : 0.5015
P-Value [Acc > NIR] : 0.00000000000003841
Kappa : 0.1303
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.13779
Specificity : 0.99222
Pos Pred Value : 0.94628
Neg Pred Value : 0.53655
Prevalence : 0.49850
Detection Rate : 0.06869
Detection Prevalence : 0.07259
Balanced Accuracy : 0.56501
'Positive' Class : 1
Neural Net
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1661 1500
1 11 162
Accuracy : 0.5468
95% CI : (0.5297, 0.5638)
No Information Rate : 0.5015
P-Value [Acc > NIR] : 0.000000091
Kappa : 0.0911
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.09747
Specificity : 0.99342
Pos Pred Value : 0.93642
Neg Pred Value : 0.52547
Prevalence : 0.49850
Detection Rate : 0.04859
Detection Prevalence : 0.05189
Balanced Accuracy : 0.54545
'Positive' Class : 1
LightGBM
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1658 1374
1 14 288
Accuracy : 0.5837
95% CI : (0.5667, 0.6005)
No Information Rate : 0.5015
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.1653
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.17329
Specificity : 0.99163
Pos Pred Value : 0.95364
Neg Pred Value : 0.54683
Prevalence : 0.49850
Detection Rate : 0.08638
Detection Prevalence : 0.09058
Balanced Accuracy : 0.58246
'Positive' Class : 1
Random Forest
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1660 1479
1 12 183
Accuracy : 0.5528
95% CI : (0.5357, 0.5698)
No Information Rate : 0.5015
P-Value [Acc > NIR] : 0.000000001697
Kappa : 0.1032
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.11011
Specificity : 0.99282
Pos Pred Value : 0.93846
Neg Pred Value : 0.52883
Prevalence : 0.49850
Detection Rate : 0.05489
Detection Prevalence : 0.05849
Balanced Accuracy : 0.55147
'Positive' Class : 1
Model | Probability Threshold | Specificity | Sensitivity |
---|---|---|---|
Optimize Both | |||
Linear Ensemble | 0.493 | 0.782 | 0.824 |
Neural Net | 0.441 | 0.732 | 0.837 |
LGB | 0.484 | 0.746 | 0.836 |
RF | 0.517 | 0.723 | 0.753 |
90% Specificity | |||
Linear Ensemble | 0.759 | 0.900 | 0.612 |
Neural Net | 0.754 | 0.900 | 0.574 |
LGB | 0.730 | 0.900 | 0.604 |
RF | 0.604 | 0.900 | 0.449 |
95% Specificity | |||
Linear Ensemble | 0.854 | 0.950 | 0.417 |
Neural Net | 0.875 | 0.950 | 0.384 |
LGB | 0.832 | 0.950 | 0.417 |
RF | 0.637 | 0.950 | 0.333 |
97.5% Specificity | |||
Linear Ensemble | 0.891 | 0.975 | 0.267 |
Neural Net | 0.935 | 0.975 | 0.245 |
LGB | 0.882 | 0.975 | 0.301 |
RF | 0.666 | 0.975 | 0.216 |
99% Specificity | |||
Linear Ensemble | 0.910 | 0.990 | 0.155 |
Neural Net | 0.973 | 0.990 | 0.095 |
LGB | 0.922 | 0.990 | 0.186 |
RF | 0.691 | 0.990 | 0.112 |
Call: glm(formula = shrub ~ ., family = "binomial", data = validation)
Coefficients:
Term | Coefficient |
---|---|
(Intercept) | -3.968665633 |
tcb | 0.000235535 |
tcw | 0.000145283 |
tcg | 0.000065578 |
nbr | -0.000321203 |
mag | -0.000839042 |
yod | 0.000104765 |
nys_precip | 0.000005128 |
nys_tmax | 0.082880178 |
nys_tmin | -0.117791981 |
nys_aspect | 0.000106479 |
nys_dem | -0.000528244 |
nys_slope | -0.001397095 |
nys_twi | -0.058290832 |
lcsec_X2 | -0.093505410 |
lcsec_X3 | 0.217207780 |
lcsec_X4 | 0.140594098 |
lcsec_X5 | 0.195246572 |
lcsec_X8 | 0.531440060 |
lcsec_X6 | 11.187723895 |
lgb | 3.723719156 |
rf | -0.317407053 |
nnet | 2.203063234 |
Degrees of Freedom: 3332 Total (i.e. Null); 3310 Residual
Null Deviance: 4621
Residual Deviance: 2928 AIC: 2974
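A binomial glm turns its inputs into a probability by passing a weighted sum through the logistic function. The Python sketch below applies the fitted coefficients to one pixel. It assumes that the lgb, rf, and nnet columns hold each component model's predicted probability and that the lcsec_X* columns are 0/1 indicators; neither is stated in the output. The function name and the example pixel are hypothetical.

```python
import math

# Coefficients copied from the glm output above.
COEFFICIENTS = {
    "(Intercept)": -3.968665633,
    "tcb": 0.000235535, "tcw": 0.000145283, "tcg": 0.000065578,
    "nbr": -0.000321203, "mag": -0.000839042, "yod": 0.000104765,
    "nys_precip": 0.000005128, "nys_tmax": 0.082880178,
    "nys_tmin": -0.117791981, "nys_aspect": 0.000106479,
    "nys_dem": -0.000528244, "nys_slope": -0.001397095,
    "nys_twi": -0.058290832,
    "lcsec_X2": -0.093505410, "lcsec_X3": 0.217207780,
    "lcsec_X4": 0.140594098, "lcsec_X5": 0.195246572,
    "lcsec_X8": 0.531440060, "lcsec_X6": 11.187723895,
    "lgb": 3.723719156, "rf": -0.317407053, "nnet": 2.203063234,
}


def ensemble_probability(features):
    """Logistic-regression prediction; predictors left out count as 0."""
    eta = COEFFICIENTS["(Intercept)"] + sum(
        COEFFICIENTS[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical pixel: only the component model outputs are set, so every
# other predictor contributes nothing to the linear predictor.
pixel = {"lgb": 0.9, "rf": 0.6, "nnet": 0.8}
print(ensemble_probability(pixel))
```

In this fit the LightGBM and neural net predictions carry large positive weights while the random forest weight is small and negative, so the ensemble output mostly tracks the first two components.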
Model
Model: "sequential"
Layer (type) | Output Shape | Param # |
---|---|---|
dense_features (DenseFeatures) | multiple | 0 |
dense_5 (Dense) | multiple | 5120 |
dense_4 (Dense) | multiple | 32896 |
dense_3 (Dense) | multiple | 8256 |
dense_2 (Dense) | multiple | 2080 |
dense_1 (Dense) | multiple | 528 |
dropout (Dropout) | multiple | 0 |
dense (Dense) | multiple | 17 |
Total params: 48,897
Trainable params: 48,897
Non-trainable params: 0
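The summary reports output shapes only as "multiple", but the layer widths can be worked back from the parameter counts. A dense layer with a bias term has (inputs + 1) × outputs parameters. The Python sketch below checks that the widths 19, 256, 128, 64, 32, 16, 1 reproduce every count. These widths are inferred rather than stated anywhere in the output, and the sketch assumes the layers are chained in the order listed. An input width of 19 would match the 13 numeric predictors plus six land cover indicator columns, which is also an assumption.

```python
def dense_params(n_in, n_out):
    """Weights plus one bias per output unit for a dense layer."""
    return (n_in + 1) * n_out


# Inferred widths: input features, five hidden layers, one output unit.
widths = [19, 256, 128, 64, 32, 16, 1]
layer_names = ["dense_5", "dense_4", "dense_3", "dense_2", "dense_1", "dense"]

params = [dense_params(a, b) for a, b in zip(widths, widths[1:])]
for name, count in zip(layer_names, params):
    print(f"{name}: {count}")
print("total:", sum(params))
```

The dropout and feature layers contribute no parameters, so the six dense layers account for the full total.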
$num.trees
[1] 3000
$mtry
[1] 1
$min.node.size
[1] 6
$replace
[1] TRUE
$sample.fraction
[1] 0.2
$formula
shrub ~ .
$params
$params$learning_rate
[1] 0.01
$params$nrounds
[1] 2500
$params$num_leaves
[1] 14
$params$max_depth
[1] -1
$params$extra_trees
[1] FALSE
$params$min_data_in_leaf
[1] 10
$params$bagging_fraction
[1] 0.5
$params$bagging_freq
[1] 1
$params$feature_fraction
[1] 0.9
$params$min_data_in_bin
[1] 3
$params$lambda_l1
[1] 0
$params$lambda_l2
[1] 0.5
$params$force_col_wise
[1] TRUE
For attribution, please cite this work as
Mahoney (2022, Jan. 15). CAFRI Labs: Shrubland 1.0: The Gang's All Here. Retrieved from https://cafri-labs.github.io/acceptable-growing-stock/posts/shrubland-10-the-gangs-all-here/
BibTeX citation
@misc{mahoney2022shrubland,
  author = {Mahoney, Mike},
  title = {CAFRI Labs: Shrubland 1.0: The Gang's All Here},
  url = {https://cafri-labs.github.io/acceptable-growing-stock/posts/shrubland-10-the-gangs-all-here/},
  year = {2022}
}