The first iteration of shrubland model reporting. 2022-01-16
Last iteration of these models: 2022-01-15
Probability thresholds were chosen on the validation set (below) to reach a target specificity (90%, 95%, 97.5%, or 99%) or, for the "Optimize Both" rows, to optimize specificity and sensitivity together.
Model | Probability Threshold | Specificity | Sensitivity
---|---|---|---
Optimize Both | | |
Logistic Ensemble | 0.141 | 0.779 | 0.846
Neural Net | 0.167 | 0.757 | 0.854
LGB | 0.206 | 0.777 | 0.832
RF | 0.226 | 0.731 | 0.781
90% Specificity | | |
Logistic Ensemble | 0.332 | 0.896 | 0.683
Neural Net | 0.366 | 0.895 | 0.669
LGB | 0.370 | 0.898 | 0.655
RF | 0.295 | 0.897 | 0.552
95% Specificity | | |
Logistic Ensemble | 0.523 | 0.948 | 0.529
Neural Net | 0.481 | 0.948 | 0.507
LGB | 0.490 | 0.947 | 0.500
RF | 0.338 | 0.948 | 0.404
97.5% Specificity | | |
Logistic Ensemble | 0.677 | 0.974 | 0.384
Neural Net | 0.579 | 0.973 | 0.372
LGB | 0.602 | 0.975 | 0.348
RF | 0.375 | 0.975 | 0.272
99% Specificity | | |
Logistic Ensemble | 0.818 | 0.990 | 0.241
Neural Net | 0.709 | 0.990 | 0.226
LGB | 0.707 | 0.990 | 0.211
RF | 0.409 | 0.990 | 0.146
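The thresholds in the table above were picked to reach a target specificity on the validation set. The source does not show how that search was done, so the following is only a minimal Python sketch of one way to do it: scan the observed scores in increasing order and keep the first one whose specificity meets the target. The function names and the toy scores are hypothetical and are not taken from the report.

```python
def specificity_sensitivity(scores, labels, threshold):
    """Return (specificity, sensitivity) when score >= threshold predicts class 1."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tn / (tn + fp), tp / (tp + fn)


def threshold_for_specificity(scores, labels, target):
    """Lowest observed score that, used as a threshold, reaches the target specificity.

    Specificity never decreases as the threshold rises, so the first candidate
    that meets the target is also the one that keeps sensitivity highest.
    """
    for candidate in sorted(set(scores)):
        specificity, _ = specificity_sensitivity(scores, labels, candidate)
        if specificity >= target:
            return candidate
    return float("inf")  # no observed score suffices: predict 0 everywhere


# Hypothetical validation scores and labels (illustration only, not report data).
val_scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
val_labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]

chosen = threshold_for_specificity(val_scores, val_labels, target=0.80)
spec, sens = specificity_sensitivity(val_scores, val_labels, chosen)
print(chosen, round(spec, 3), round(sens, 3))
```

The scan recomputes the confusion counts for every candidate, which is quadratic in the number of rows. That is fine for a sketch, but a sorted cumulative count would be the better choice at the scale of the data sets reported here.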
Optimize Both
Logistic Ensemble
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 20688 1044
1 5880 5722
Accuracy : 0.7923
95% CI : (0.7879, 0.7966)
No Information Rate : 0.797
P-Value [Acc > NIR] : 0.9844
Kappa : 0.493
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.8457
Specificity : 0.7787
Pos Pred Value : 0.4932
Neg Pred Value : 0.9520
Prevalence : 0.2030
Detection Rate : 0.1717
Detection Prevalence : 0.3481
Balanced Accuracy : 0.8122
'Positive' Class : 1
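Every statistic in the block above except the confidence interval and the two P-values follows directly from the four cell counts. As a worked check, here is a small Python sketch that recomputes them for this Logistic Ensemble matrix. It assumes rows are predictions, columns are reference labels, and class 1 is the positive class, as the printed output indicates. The function name is hypothetical.

```python
def confusion_statistics(tn, fn, fp, tp):
    """Summary statistics for a 2x2 table, with class 1 as the positive class."""
    n = tn + fn + fp + tp
    accuracy = (tn + tp) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the row (prediction) and
    # column (reference) totals; used for Cohen's kappa.
    expected = ((tn + fn) * (tn + fp) + (fp + tp) * (fn + tp)) / n**2
    return {
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "pos_pred_value": tp / (tp + fp),
        "neg_pred_value": tn / (tn + fn),
        "prevalence": (tp + fn) / n,
        "detection_rate": tp / n,
        "detection_prevalence": (tp + fp) / n,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Counts from the matrix above: row "0" is (TN, FN), row "1" is (FP, TP).
stats = confusion_statistics(tn=20688, fn=1044, fp=5880, tp=5722)
for name, value in stats.items():
    print(f"{name:22s}{value:.4f}")
```

With a prevalence of about 20%, the positive predictive value (0.4932) sits far below the sensitivity (0.8457): at this threshold roughly half of the predicted positives are false alarms.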
Neural Net
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 20108 988
1 6460 5778
Accuracy : 0.7766
95% CI : (0.7721, 0.781)
No Information Rate : 0.797
P-Value [Acc > NIR] : 1
Kappa : 0.4694
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.8540
Specificity : 0.7569
Pos Pred Value : 0.4721
Neg Pred Value : 0.9532
Prevalence : 0.2030
Detection Rate : 0.1733
Detection Prevalence : 0.3671
Balanced Accuracy : 0.8054
'Positive' Class : 1
LightGBM
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 20644 1134
1 5924 5632
Accuracy : 0.7883
95% CI : (0.7838, 0.7926)
No Information Rate : 0.797
P-Value [Acc > NIR] : 1
Kappa : 0.4822
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.8324
Specificity : 0.7770
Pos Pred Value : 0.4874
Neg Pred Value : 0.9479
Prevalence : 0.2030
Detection Rate : 0.1690
Detection Prevalence : 0.3467
Balanced Accuracy : 0.8047
'Positive' Class : 1
Random Forest
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 19427 1484
1 7141 5282
Accuracy : 0.7413
95% CI : (0.7365, 0.746)
No Information Rate : 0.797
P-Value [Acc > NIR] : 1
Kappa : 0.3903
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.7807
Specificity : 0.7312
Pos Pred Value : 0.4252
Neg Pred Value : 0.9290
Prevalence : 0.2030
Detection Rate : 0.1585
Detection Prevalence : 0.3727
Balanced Accuracy : 0.7559
'Positive' Class : 1
90% Specificity
Logistic Ensemble
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 23796 2148
1 2772 4618
Accuracy : 0.8524
95% CI : (0.8485, 0.8562)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.559
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.6825
Specificity : 0.8957
Pos Pred Value : 0.6249
Neg Pred Value : 0.9172
Prevalence : 0.2030
Detection Rate : 0.1385
Detection Prevalence : 0.2217
Balanced Accuracy : 0.7891
'Positive' Class : 1
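The "No Information Rate" is the accuracy of always predicting the majority class, here 26,568 / 33,334 ≈ 0.797. "P-Value [Acc > NIR]" is a one-sided binomial test of whether the observed accuracy exceeds that rate. The Python sketch below recomputes the test for the Logistic Ensemble matrix directly above (specificity 0.8957). It assumes the P-value is an exact upper-tail binomial probability, which is how R's `binom.test` with `alternative = "greater"` behaves. The function name is hypothetical.

```python
import math


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p), with terms built in log space."""
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for k in range(successes, trials + 1):
        log_pmf = (
            math.lgamma(trials + 1)
            - math.lgamma(k + 1)
            - math.lgamma(trials - k + 1)
            + k * log_p
            + (trials - k) * log_q
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


# Counts from the matrix above: row "0" is (TN, FN), row "1" is (FP, TP).
tn, fn, fp, tp = 23796, 2148, 2772, 4618
n = tn + fn + fp + tp

# Share of the larger reference class: the accuracy of a constant prediction.
no_information_rate = max(tn + fp, fn + tp) / n
p_value = binomial_upper_tail(tn + tp, n, no_information_rate)

print(round(no_information_rate, 3), p_value < 2.2e-16)
```

Applied to the 26,410 correct predictions of the first Logistic Ensemble matrix in this report, the same function returns a value close to 1, in line with the 0.9844 printed there: at the lower thresholds the models trade overall accuracy for sensitivity and do not beat the majority-class baseline.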
Neural Net
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 23783 2239
1 2785 4527
Accuracy : 0.8493
95% CI : (0.8454, 0.8531)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.5478
Mcnemar's Test P-Value : 0.00000000000001483
Sensitivity : 0.6691
Specificity : 0.8952
Pos Pred Value : 0.6191
Neg Pred Value : 0.9140
Prevalence : 0.2030
Detection Rate : 0.1358
Detection Prevalence : 0.2194
Balanced Accuracy : 0.7821
'Positive' Class : 1
LightGBM
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 23851 2331
1 2717 4435
Accuracy : 0.8486
95% CI : (0.8447, 0.8524)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.5417
Mcnemar's Test P-Value : 0.00000006001
Sensitivity : 0.6555
Specificity : 0.8977
Pos Pred Value : 0.6201
Neg Pred Value : 0.9110
Prevalence : 0.2030
Detection Rate : 0.1330
Detection Prevalence : 0.2146
Balanced Accuracy : 0.7766
'Positive' Class : 1
Random Forest
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 23823 3033
1 2745 3733
Accuracy : 0.8267
95% CI : (0.8226, 0.8307)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.4556
Mcnemar's Test P-Value : 0.0001596
Sensitivity : 0.5517
Specificity : 0.8967
Pos Pred Value : 0.5763
Neg Pred Value : 0.8871
Prevalence : 0.2030
Detection Rate : 0.1120
Detection Prevalence : 0.1943
Balanced Accuracy : 0.7242
'Positive' Class : 1
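McNemar's test asks whether the two kinds of error (false negatives and false positives) are equally frequent. For the Random Forest matrix above they are close, 3,033 against 2,745, which is why its P-value is much less extreme than in most other blocks. The Python sketch below assumes the chi-squared form with continuity correction, the default of R's `mcnemar.test`. That form reproduces the reported 0.0001596. The function name is hypothetical.

```python
import math


def mcnemar_p_value(b, c):
    """McNemar's chi-squared test with continuity correction (1 degree of freedom).

    b and c are the two off-diagonal counts of the 2x2 table.
    Returns (test statistic, p-value).
    """
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of a chi-squared variable with 1 df, written via erfc.
    return statistic, math.erfc(math.sqrt(statistic / 2))


# Off-diagonal counts from the matrix above:
# predicted 0 / reference 1 = 3033, predicted 1 / reference 0 = 2745.
statistic, p_value = mcnemar_p_value(3033, 2745)
print(round(statistic, 3), f"{p_value:.4g}")
```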
95% Specificity
Logistic Ensemble
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 25194 3185
1 1374 3581
Accuracy : 0.8632
95% CI : (0.8595, 0.8669)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.5305
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.5293
Specificity : 0.9483
Pos Pred Value : 0.7227
Neg Pred Value : 0.8878
Prevalence : 0.2030
Detection Rate : 0.1074
Detection Prevalence : 0.1486
Balanced Accuracy : 0.7388
'Positive' Class : 1
Neural Net
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 25177 3339
1 1391 3427
Accuracy : 0.8581
95% CI : (0.8543, 0.8618)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.5087
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.5065
Specificity : 0.9476
Pos Pred Value : 0.7113
Neg Pred Value : 0.8829
Prevalence : 0.2030
Detection Rate : 0.1028
Detection Prevalence : 0.1445
Balanced Accuracy : 0.7271
'Positive' Class : 1
LightGBM
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 25173 3381
1 1395 3385
Accuracy : 0.8567
95% CI : (0.8529, 0.8605)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.5028
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.5003
Specificity : 0.9475
Pos Pred Value : 0.7082
Neg Pred Value : 0.8816
Prevalence : 0.2030
Detection Rate : 0.1015
Detection Prevalence : 0.1434
Balanced Accuracy : 0.7239
'Positive' Class : 1
Random Forest
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 25176 4034
1 1392 2732
Accuracy : 0.8372
95% CI : (0.8332, 0.8412)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.4112
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.40378
Specificity : 0.94761
Pos Pred Value : 0.66246
Neg Pred Value : 0.86190
Prevalence : 0.20298
Detection Rate : 0.08196
Detection Prevalence : 0.12372
Balanced Accuracy : 0.67569
'Positive' Class : 1
97.5% Specificity
Logistic Ensemble
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 25884 4169
1 684 2597
Accuracy : 0.8544
95% CI : (0.8506, 0.8582)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.4431
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.38383
Specificity : 0.97425
Pos Pred Value : 0.79153
Neg Pred Value : 0.86128
Prevalence : 0.20298
Detection Rate : 0.07791
Detection Prevalence : 0.09843
Balanced Accuracy : 0.67904
'Positive' Class : 1
Neural Net
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 25848 4251
1 720 2515
Accuracy : 0.8509
95% CI : (0.847, 0.8547)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.4278
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.37171
Specificity : 0.97290
Pos Pred Value : 0.77743
Neg Pred Value : 0.85877
Prevalence : 0.20298
Detection Rate : 0.07545
Detection Prevalence : 0.09705
Balanced Accuracy : 0.67231
'Positive' Class : 1
LightGBM
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 25907 4413
1 661 2353
Accuracy : 0.8478
95% CI : (0.8439, 0.8516)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.407
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.34777
Specificity : 0.97512
Pos Pred Value : 0.78069
Neg Pred Value : 0.85445
Prevalence : 0.20298
Detection Rate : 0.07059
Detection Prevalence : 0.09042
Balanced Accuracy : 0.66144
'Positive' Class : 1
Random Forest
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 25917 4927
1 651 1839
Accuracy : 0.8327
95% CI : (0.8286, 0.8367)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.3235
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.27180
Specificity : 0.97550
Pos Pred Value : 0.73855
Neg Pred Value : 0.84026
Prevalence : 0.20298
Detection Rate : 0.05517
Detection Prevalence : 0.07470
Balanced Accuracy : 0.62365
'Positive' Class : 1
99% Specificity
Logistic Ensemble
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 26303 5134
1 265 1632
Accuracy : 0.838
95% CI : (0.834, 0.842)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.316
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.24121
Specificity : 0.99003
Pos Pred Value : 0.86031
Neg Pred Value : 0.83669
Prevalence : 0.20298
Detection Rate : 0.04896
Detection Prevalence : 0.05691
Balanced Accuracy : 0.61562
'Positive' Class : 1
Neural Net
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 26300 5234
1 268 1532
Accuracy : 0.8349
95% CI : (0.8309, 0.8389)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.2978
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.22643
Specificity : 0.98991
Pos Pred Value : 0.85111
Neg Pred Value : 0.83402
Prevalence : 0.20298
Detection Rate : 0.04596
Detection Prevalence : 0.05400
Balanced Accuracy : 0.60817
'Positive' Class : 1
LightGBM
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 26311 5338
1 257 1428
Accuracy : 0.8322
95% CI : (0.8281, 0.8362)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.2796
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.21106
Specificity : 0.99033
Pos Pred Value : 0.84748
Neg Pred Value : 0.83134
Prevalence : 0.20298
Detection Rate : 0.04284
Detection Prevalence : 0.05055
Balanced Accuracy : 0.60069
'Positive' Class : 1
Random Forest
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 26311 5780
1 257 986
Accuracy : 0.8189
95% CI : (0.8147, 0.823)
No Information Rate : 0.797
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.1955
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.14573
Specificity : 0.99033
Pos Pred Value : 0.79324
Neg Pred Value : 0.81989
Prevalence : 0.20298
Detection Rate : 0.02958
Detection Prevalence : 0.03729
Balanced Accuracy : 0.56803
'Positive' Class : 1
Model | Probability Threshold | Specificity | Sensitivity
---|---|---|---
Optimize Both | | |
Logistic Ensemble | 0.141 | 0.785 | 0.843
Neural Net | 0.167 | 0.760 | 0.856
LGB | 0.206 | 0.779 | 0.829
RF | 0.226 | 0.730 | 0.776
90% Specificity | | |
Logistic Ensemble | 0.332 | 0.900 | 0.673
Neural Net | 0.366 | 0.900 | 0.665
LGB | 0.370 | 0.900 | 0.638
RF | 0.295 | 0.900 | 0.536
95% Specificity | | |
Logistic Ensemble | 0.523 | 0.950 | 0.521
Neural Net | 0.481 | 0.950 | 0.507
LGB | 0.490 | 0.950 | 0.485
RF | 0.338 | 0.950 | 0.398
97.5% Specificity | | |
Logistic Ensemble | 0.677 | 0.975 | 0.381
Neural Net | 0.579 | 0.975 | 0.365
LGB | 0.602 | 0.975 | 0.339
RF | 0.375 | 0.975 | 0.270
99% Specificity | | |
Logistic Ensemble | 0.818 | 0.990 | 0.237
Neural Net | 0.709 | 0.990 | 0.225
LGB | 0.707 | 0.990 | 0.205
RF | 0.409 | 0.990 | 0.142
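The source does not say which criterion produced the "Optimize Both" thresholds. One common choice for optimizing sensitivity and specificity together is Youden's J statistic, J = sensitivity + specificity - 1, maximized over candidate thresholds. The Python sketch below shows that criterion only as an assumption about what such a search could look like. The function name and the toy data are hypothetical.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A score >= threshold is predicted as class 1. Ties keep the lowest threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_threshold, best_j = None, -1.0
    for candidate in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= candidate)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < candidate)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_threshold, best_j = candidate, j
    return best_threshold, best_j


# Hypothetical validation scores and labels (illustration only, not report data).
val_scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
val_labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]

best_threshold, best_j = youden_threshold(val_scores, val_labels)
print(best_threshold, round(best_j, 3))
```

Other criteria, such as the point on the ROC curve closest to the top-left corner, can select a different threshold from the same scores.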
Call: glm(formula = shrub ~ ., family = "binomial", data = validation)
Coefficients:
Term | Estimate
---|---
(Intercept) | -3.11032321
tcb | 0.00027987
tcw | 0.00073321
tcg | -0.00006364
nbr | -0.00157420
mag | 0.00036542
yod | -0.00002006
nys_precip | -0.00020836
nys_tmax | 0.12872106
nys_tmin | -0.08899034
nys_aspect | 0.00014450
nys_dem | 0.00032340
nys_slope | -0.06009474
nys_twi | -0.13512527
lcsec_X2 | 0.09216093
lcsec_X3 | 0.21841891
lcsec_X4 | 0.23121761
lcsec_X5 | 0.40354893
lcsec_X6 | 2.97905223
lcsec_X8 | 0.29797653
lgb | 2.65936070
rf | 1.14082510
nnet | 4.23373004
Degrees of Freedom: 33333 Total (i.e. Null); 33311 Residual
Null Deviance: 33560
Residual Deviance: 21150 AIC: 21200
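Because the ensemble is a binomial GLM, its prediction is the inverse logit of a linear combination of the terms listed above. The Python sketch below shows that arithmetic for a few of the fitted coefficients. It assumes that `lgb`, `rf`, and `nnet` are the component models' predicted probabilities, which the source does not state explicitly. The input values are hypothetical, and every other predictor is set to zero purely to keep the example short, which is not a realistic pixel.

```python
import math

# A subset of the fitted coefficients printed above; the remaining terms
# would be added to the dictionary in the same way.
COEFFICIENTS = {
    "(Intercept)": -3.11032321,
    "nys_tmax": 0.12872106,
    "nys_tmin": -0.08899034,
    "nys_slope": -0.06009474,
    "nys_twi": -0.13512527,
    "lgb": 2.65936070,
    "rf": 1.14082510,
    "nnet": 4.23373004,
}


def ensemble_probability(features):
    """Inverse logit of the linear predictor; terms not supplied count as 0."""
    eta = COEFFICIENTS["(Intercept)"]
    for name, value in features.items():
        eta += COEFFICIENTS[name] * value
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical input: each component model predicts 0.5, all else zero.
probability = ensemble_probability({"lgb": 0.5, "rf": 0.5, "nnet": 0.5})

# Thresholds taken from the tables in this report.
print(round(probability, 3), probability >= 0.141, probability >= 0.818)
```

For this made-up input the ensemble probability clears the "Optimize Both" threshold (0.141) but not the 99% specificity threshold (0.818), so the same pixel would be labeled shrub under one operating point and not under the other.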
Model
Model: "sequential"
Layer (type) | Output Shape | Param #
---|---|---
dense_features (DenseFeatures) | multiple | 0
dense_5 (Dense) | multiple | 5120
dense_4 (Dense) | multiple | 32896
dense_3 (Dense) | multiple | 8256
dense_2 (Dense) | multiple | 2080
dense_1 (Dense) | multiple | 528
dropout (Dropout) | multiple | 0
dense (Dense) | multiple | 17
Total params: 48,897
Trainable params: 48,897
Non-trainable params: 0
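The summary reports output shapes only as "multiple", but the parameter counts constrain the layer widths. A fully connected layer with bias has (inputs + 1) × units parameters, and the counts above are consistent with 19 input features feeding layers of 256, 128, 64, 32, 16, and 1 units. Those widths are an inference from the arithmetic, not something the source states. The Python check below assumes the layers are applied in the order listed.

```python
def dense_param_count(n_inputs, n_units):
    """Weights plus one bias per unit in a fully connected layer."""
    return (n_inputs + 1) * n_units


# Assumed widths, inferred from the parameter counts (not stated in the source):
# 19 input features, then the six dense layers in the order they are listed.
widths = [19, 256, 128, 64, 32, 16, 1]
names = ["dense_5", "dense_4", "dense_3", "dense_2", "dense_1", "dense"]

counts = {
    name: dense_param_count(n_in, n_out)
    for name, n_in, n_out in zip(names, widths, widths[1:])
}
total = sum(counts.values())
print(counts, total)
```

The dropout and feature layers add no trainable parameters, so the six dense layers account for the full 48,897.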
$num.trees
[1] 3000
$mtry
[1] 1
$min.node.size
[1] 6
$replace
[1] TRUE
$sample.fraction
[1] 0.2
$formula
shrub ~ .
$params
$params$learning_rate
[1] 0.01
$params$nrounds
[1] 2500
$params$num_leaves
[1] 14
$params$max_depth
[1] -1
$params$extra_trees
[1] FALSE
$params$min_data_in_leaf
[1] 10
$params$bagging_fraction
[1] 0.5
$params$bagging_freq
[1] 1
$params$feature_fraction
[1] 0.9
$params$min_data_in_bin
[1] 3
$params$lambda_l1
[1] 0
$params$lambda_l2
[1] 0.5
$params$force_col_wise
[1] TRUE
For attribution, please cite this work as
Mahoney (2022, Jan. 18). CAFRI Labs: Shrubland 1.0.1: Supersize Me. Retrieved from https://cafri-labs.github.io/acceptable-growing-stock/posts/shrubland-101-supersize-me/
BibTeX citation
@misc{mahoney2022shrubland,
  author = {Mahoney, Mike},
  title = {CAFRI Labs: Shrubland 1.0.1: Supersize Me},
  url = {https://cafri-labs.github.io/acceptable-growing-stock/posts/shrubland-101-supersize-me/},
  year = {2022}
}