Sans Crown Predictors

Evaluation Results

Last iteration of these models: 2021-01-12

Change Summary

Dropped crown segmentation predictors (again)
- Earlier iterations accidentally kept the crown segmentation predictors in the model

Metric	RF (ranger)	GBM (LightGBM)	SVM (kernlab)	Ensemble (model weighted)	Ensemble (RMSE weighted)
RMSE	37.486	37.754	39.656	36.980	37.749
MBE	-2.545	-3.071	-7.043	-2.299	-4.260
R2	0.790	0.787	0.777	0.794	0.791
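
For reference, the three metrics in the table above can be computed from observed and predicted AGB roughly as follows. This is a minimal sketch, not the code used for this report: the helper names are hypothetical, MBE is assumed to be predicted minus observed, and R2 is shown as the coefficient of determination (the report may instead use the squared correlation).

```r
# Hypothetical helpers; names and sign conventions are assumptions.
rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))
mbe  <- function(obs, pred) mean(pred - obs)  # negative means underprediction on average
r2   <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

# Toy values for illustration only (not from the report).
obs  <- c(10, 50, 120, 200)
pred <- c(12, 45, 110, 190)

c(RMSE = rmse(obs, pred), MBE = mbe(obs, pred), R2 = r2(obs, pred))
```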

AGB Distribution

summary(bind_rows(training, testing)$agb_mgha)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   7.853  84.374  90.496 148.091 425.363

Bootstrapping Results

Across 1000 bootstrap iterations, our ensemble model had a mean RMSE of 38.001 \(\pm\) 0.387.
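
One way such a bootstrap could be run is sketched below. It assumes that rows of the test set are resampled with replacement and that RMSE is recomputed on each resample; the report does not state the resampling unit or whether the \(\pm\) value is a standard deviation, so both are assumptions here. The data are synthetic stand-ins.

```r
set.seed(42)

# Synthetic stand-ins for test-set observations and ensemble predictions.
obs  <- runif(352, 0, 425)
pred <- obs + rnorm(352, sd = 38)

n_boot <- 1000

# Resample row indices with replacement and recompute RMSE each time.
boot_rmse <- vapply(seq_len(n_boot), function(i) {
  idx <- sample.int(length(obs), replace = TRUE)
  sqrt(mean((pred[idx] - obs[idx])^2))
}, numeric(1))

# Mean and spread of the bootstrap RMSE distribution.
c(mean = mean(boot_rmse), sd = sd(boot_rmse))
```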

RMSE Distribution

Plot Errors

Validation Results

RMSE	Min	Median	Max
rf	34.652	41.197	46.107
lgb	35.094	41.210	47.370
svm	34.466	41.907	47.654
ensemble	34.345	40.879	45.824

R2	Min	Median	Max
rf	0.647	0.718	0.779
lgb	0.645	0.719	0.781
svm	0.653	0.716	0.782
ensemble	0.656	0.726	0.785

Metadata

Ensembles

RMSE-weighted model weights:

      lgb        rf       svm 
0.3251130 0.3317681 0.3431190
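
A sketch of how weights like these might be applied: each ensemble prediction is taken as the weighted average of the three component predictions. How the weights were derived from RMSE is not stated in this report, so only the application step is shown, and the prediction values below are hypothetical.

```r
# Weights as reported above (they sum to 1 up to rounding).
w <- c(lgb = 0.3251130, rf = 0.3317681, svm = 0.3431190)

# Hypothetical component predictions for three plots.
preds <- cbind(
  lgb = c(80, 150, 5),
  rf  = c(85, 140, 8),
  svm = c(78, 145, 2)
)

# Weighted average per plot; columns are matched to the weights by name.
ensemble_pred <- as.vector(preds[, names(w)] %*% w)
ensemble_pred
```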

Linear model weights:


Call:
lm(formula = agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

Residuals:
     Min       1Q   Median       3Q      Max 
-179.815  -20.295   -0.374   14.288  185.205 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -1.467e+00  5.775e-01  -2.541 0.011059 *  
rf_pred                    6.255e-01  9.170e-02   6.822 9.23e-12 ***
lgb_pred                   4.825e-01  9.227e-02   5.229 1.72e-07 ***
svm_pred                  -1.321e-02  6.046e-02  -0.219 0.826981    
rf_pred:lgb_pred          -3.625e-03  4.577e-04  -7.920 2.48e-15 ***
rf_pred:svm_pred           1.117e-03  6.763e-04   1.651 0.098763 .  
lgb_pred:svm_pred          1.355e-03  7.588e-04   1.786 0.074101 .  
rf_pred:lgb_pred:svm_pred  5.164e-06  1.435e-06   3.600 0.000319 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 40.61 on 23692 degrees of freedom
Multiple R-squared:  0.7222,    Adjusted R-squared:  0.7221 
F-statistic:  8799 on 7 and 23692 DF,  p-value: < 2.2e-16
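
The model-weighted ensemble above is a linear regression of observed AGB on the three component predictions and all of their interactions. The sketch below refits the same formula on synthetic data to show the mechanics; the `pred_values` contents and the `new_preds` example are made up for illustration.

```r
set.seed(1)
n <- 500
truth <- runif(n, 0, 400)

# Synthetic stand-in for the pred_values data frame named in the Call above.
pred_values <- data.frame(
  agb_mgha = truth,
  rf_pred  = truth + rnorm(n, sd = 40),
  lgb_pred = truth + rnorm(n, sd = 40),
  svm_pred = truth + rnorm(n, sd = 45)
)

# Same formula as in the Call: main effects plus all two- and three-way interactions.
ensemble_lm <- lm(agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

# Intercept + 3 main effects + 3 two-way + 1 three-way interaction.
length(coef(ensemble_lm))

# Ensemble prediction for one hypothetical set of component predictions.
new_preds <- data.frame(rf_pred = 100, lgb_pred = 95, svm_pred = 90)
predict(ensemble_lm, newdata = new_preds)
```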

Coverages

18 coverages:
- FEMA_FranklinStLawrence2016, FEMA_FultonSaratogaHerkimerFran, FEMA_GreatLakes2014, FEMA_HudsonHoosic2012, FEMA_OniedaSubbasin2016, NYSGPO_AlleganySteuben2016, NYSGPO_CayugaOswego_2018, NYSGPO_ColumbiaRensselaer2016, NYSGPO_ErieGeneseeLivingston201, NYSGPO_MadisonOtsego_2015, NYSGPO_Southwest_spring_2017, NYSGPO_SouthwestB_fall_2017, NYSGPO_WarrenWashingtonEssex_20, USGS_3County2014, USGS_ClintonEssexFranklin2014, USGS_LongIsland2014, USGS_NorthEast2011, USGS_Schoharie2014

\(n\) and \(p\)

1142 observations
- 790 training
- 352 testing
73 predictors
- n, zmean, zmean_c, max, quad_mean, quad_mean_c, cv, cv_c, z_kurt, z_skew, L2, L3, L4, L_cv, L_skew, L_kurt, h10, h20, h30, h40, h50, h60, h70, h80, h90, h95, h99, hvol, cancov, rpc1, d10, d20, d30, d40, d50, d60, d70, d80, d90, precip, tmin, tmax, twi, slope, aspect, elev, tax_code_105, tax_code_210, tax_code_240, tax_code_260, tax_code_280, tax_code_311, tax_code_312, tax_code_314, tax_code_322, tax_code_323, tax_code_910, tax_code_911, tax_code_912, tax_code_931, tax_code_932, tax_code_1000, tax_category_100, tax_category_200, tax_category_300, tax_category_900, tax_code_112, tax_code_120, tax_code_241, tax_code_321, tax_code_930, tax_code_941, tax_code_2000

Component Models

Hyperparameters were tuned with 5-fold cross-validation.
Final hyperparameters:

Random forest:

$num.trees
[1] 750

$mtry
[1] 37

$min.node.size
[1] 7

$sample.fraction
[1] 0.2

$splitrule
[1] "variance"

$replace
[1] TRUE

$formula
agb_mgha ~ .
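
As a rough illustration, the random forest settings above map onto a `ranger()` call like the one below. The training data here are synthetic (790 rows and 73 predictors, matching the reported \(n\) and \(p\)); the actual fitting code for this report may differ.

```r
library(ranger)

set.seed(123)
# Synthetic stand-in for the training set: 790 rows, 73 predictors.
training <- as.data.frame(matrix(rnorm(790 * 73), nrow = 790))
training$agb_mgha <- 90 + 30 * training$V1 + rnorm(790, sd = 10)

rf_fit <- ranger(
  formula         = agb_mgha ~ .,
  data            = training,
  num.trees       = 750,
  mtry            = 37,
  min.node.size   = 7,
  sample.fraction = 0.2,
  splitrule       = "variance",
  replace         = TRUE
)

# predict() on a ranger fit returns an object; predictions are in $predictions.
rf_pred <- predict(rf_fit, data = training)$predictions
```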

LightGBM:

$learning_rate
[1] 0.05

$nrounds
[1] 100

$num_leaves
[1] 22

$max_depth
[1] -1

$extra_trees
[1] TRUE

$min_data_in_leaf
[1] 10

$bagging_fraction
[1] 0.8

$bagging_freq
[1] 1

$feature_fraction
[1] 1

$min_data_in_bin
[1] 8

$lambda_l1
[1] 0.1

$lambda_l2
[1] 4

$force_col_wise
[1] TRUE
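
A sketch of how these LightGBM settings could be passed to `lgb.train()`. The `objective` is not listed in this report and is assumed to be plain regression; the data are synthetic. `min_data_in_bin` is set on the dataset here because it controls how features are binned.

```r
library(lightgbm)

set.seed(123)
# Synthetic stand-in for the training predictors and response.
X <- matrix(rnorm(790 * 73), nrow = 790)
y <- 90 + 30 * X[, 1] + rnorm(790, sd = 10)

dtrain <- lgb.Dataset(data = X, label = y, params = list(min_data_in_bin = 8))

params <- list(
  objective        = "regression",  # assumption: not stated in the report
  learning_rate    = 0.05,
  num_leaves       = 22,
  max_depth        = -1,            # -1 means no depth limit
  extra_trees      = TRUE,
  min_data_in_leaf = 10,
  bagging_fraction = 0.8,
  bagging_freq     = 1,
  feature_fraction = 1,
  lambda_l1        = 0.1,
  lambda_l2        = 4,
  force_col_wise   = TRUE,
  verbose          = -1             # silence training logs
)

lgb_fit  <- lgb.train(params = params, data = dtrain, nrounds = 100)
lgb_pred <- predict(lgb_fit, X)
```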

SVM:

$x
agb_mgha ~ .

$kernel
[1] "laplacedot"

$type
[1] "eps-bsvr"

$kpar
$kpar$sigma
[1] 0.015625


$C
[1] 1

$epsilon
[1] 0.0625
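
For illustration, the SVM settings above correspond to a `ksvm()` call along these lines. The data are a small synthetic stand-in, and any preprocessing used for this report (beyond `ksvm()`'s default scaling) is not shown.

```r
library(kernlab)

set.seed(123)
# Small synthetic stand-in for the training set.
training <- as.data.frame(matrix(rnorm(200 * 5), nrow = 200))
training$agb_mgha <- 90 + 30 * training$V1 + rnorm(200, sd = 10)

svm_fit <- ksvm(
  agb_mgha ~ .,
  data    = training,
  kernel  = "laplacedot",            # Laplacian kernel
  type    = "eps-bsvr",              # bound-constraint epsilon regression
  kpar    = list(sigma = 0.015625),
  C       = 1,
  epsilon = 0.0625
)

svm_pred <- as.vector(predict(svm_fit, newdata = training))
```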