Sans Crown Predictors

Mike Mahoney
2021-01-19

Evaluation Results

Last iteration of these models: 2021-01-12

Change Summary

      RF (ranger)  GBM (LightGBM)  SVM (kernlab)  Ensemble (model weighted)  Ensemble (RMSE weighted)
RMSE       37.486          37.754         39.656                     36.980                    37.749
MBE        -2.545          -3.071         -7.043                     -2.299                    -4.260
R2          0.790           0.787          0.777                      0.794                     0.791
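
For reference, the three metrics in the table above could be computed along the lines of the sketch below. The report does not show its metric code, so the helper names, the MBE sign convention (predicted minus observed) and the R2 definition (1 - SSE/SST) are all assumptions, and the toy vectors are illustrative only.

```r
# Hypothetical helpers; the report does not show how its metrics were computed.
rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))

# Assumed sign convention: a negative MBE means under-prediction on average.
mbe <- function(obs, pred) mean(pred - obs)

# Assumed definition: coefficient of determination, 1 - SSE / SST.
r2 <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

# Toy values for illustration only (not data from this report).
obs  <- c(10, 50, 90, 150)
pred <- c(12, 45, 95, 140)

metrics <- c(RMSE = rmse(obs, pred), MBE = mbe(obs, pred), R2 = r2(obs, pred))
print(round(metrics, 3))
```

Under this convention, the negative MBE values in the table would indicate that every model under-predicts on average, with the SVM showing the largest bias.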

AGB Distribution

summary(bind_rows(training, testing)$agb_mgha)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   7.853  84.374  90.496 148.091 425.363 

Bootstrapping Results

Across 1000 bootstrap iterations, our ensemble model had a mean RMSE of 38.001 \(\pm\) 0.387.
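
A minimal sketch of how such a bootstrap estimate might be produced is shown below. The report does not state what was resampled or what the \(\pm\) value represents; the sketch assumes that evaluation rows are resampled with replacement and that the spread is the standard deviation across iterations. The function name `bootstrap_rmse` and the simulated data are hypothetical.

```r
# Hypothetical helper: bootstrap the RMSE of a fixed set of predictions.
bootstrap_rmse <- function(obs, pred, n_iter = 1000) {
  vapply(seq_len(n_iter), function(i) {
    # Resample row indices with replacement (assumed resampling scheme).
    idx <- sample.int(length(obs), replace = TRUE)
    sqrt(mean((pred[idx] - obs[idx])^2))
  }, numeric(1))
}

# Simulated data for illustration only (not data from this report).
set.seed(42)
obs  <- runif(500, min = 0, max = 400)
pred <- obs + rnorm(500, sd = 38)

boot <- bootstrap_rmse(obs, pred)
cat(sprintf("mean RMSE %.3f, SD %.3f\n", mean(boot), sd(boot)))
```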

RMSE Distribution

Plot Errors

Validation Results

RMSE         Min  Median     Max
rf        34.652  41.197  46.107
lgb       35.094  41.210  47.370
svm       34.466  41.907  47.654
ensemble  34.345  40.879  45.824

R2           Min  Median     Max
rf         0.647   0.718   0.779
lgb        0.645   0.719   0.781
svm        0.653   0.716   0.782
ensemble   0.656   0.726   0.785

Metadata

Ensembles

      lgb        rf       svm 
0.3251130 0.3317681 0.3431190 
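
The three values above sum to 1 (up to rounding), so they can be read as per-model weights. Assuming they are the weights of the RMSE-weighted ensemble and are applied as a simple weighted average of the component predictions, the combination step might look like the sketch below. The report does not show how the weights were derived or applied, and the prediction values here are hypothetical.

```r
# Weights as printed in the report.
w <- c(lgb = 0.3251130, rf = 0.3317681, svm = 0.3431190)

# Hypothetical component predictions for three plots (not data from this report).
preds <- cbind(
  lgb = c(80, 120, 10),
  rf  = c(85, 110, 12),
  svm = c(90, 115, 5)
)

# Weighted average per row; dividing by sum(w) guards against rounding
# in the printed weights.
ensemble_pred <- as.vector(preds[, names(w)] %*% w) / sum(w)
print(ensemble_pred)
```

Because the weights are nearly equal, this ensemble behaves much like a plain average of the three models.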

Call:
lm(formula = agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

Residuals:
     Min       1Q   Median       3Q      Max 
-179.815  -20.295   -0.374   14.288  185.205 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -1.467e+00  5.775e-01  -2.541 0.011059 *  
rf_pred                    6.255e-01  9.170e-02   6.822 9.23e-12 ***
lgb_pred                   4.825e-01  9.227e-02   5.229 1.72e-07 ***
svm_pred                  -1.321e-02  6.046e-02  -0.219 0.826981    
rf_pred:lgb_pred          -3.625e-03  4.577e-04  -7.920 2.48e-15 ***
rf_pred:svm_pred           1.117e-03  6.763e-04   1.651 0.098763 .  
lgb_pred:svm_pred          1.355e-03  7.588e-04   1.786 0.074101 .  
rf_pred:lgb_pred:svm_pred  5.164e-06  1.435e-06   3.600 0.000319 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 40.61 on 23692 degrees of freedom
Multiple R-squared:  0.7222,    Adjusted R-squared:  0.7221 
F-statistic:  8799 on 7 and 23692 DF,  p-value: < 2.2e-16
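
The summary above describes a linear model that stacks the three component predictions, including all two-way and three-way interactions. A minimal sketch of fitting and applying such a model follows. It reuses the formula from the Call, but `pred_values` here is a simulated stand-in and `new_preds` is hypothetical, since the report does not include the underlying data.

```r
# Simulated stand-in for pred_values (not data from this report).
set.seed(1)
n <- 200
agb <- runif(n, min = 0, max = 400)
pred_values <- data.frame(
  agb_mgha = agb,
  rf_pred  = agb + rnorm(n, sd = 35),
  lgb_pred = agb + rnorm(n, sd = 35),
  svm_pred = agb + rnorm(n, sd = 40)
)

# Same formula as in the Call above: main effects plus all interactions.
ensemble_lm <- lm(agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

# Hypothetical component predictions for one new observation.
new_preds <- data.frame(rf_pred = 100, lgb_pred = 95, svm_pred = 110)
ensemble_pred <- predict(ensemble_lm, newdata = new_preds)

# Intercept + 3 main effects + 3 two-way + 1 three-way interaction.
print(length(coef(ensemble_lm)))
```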

Coverages

\(n\) and \(p\)

Component Models

Random forest:

$num.trees
[1] 750

$mtry
[1] 37

$min.node.size
[1] 7

$sample.fraction
[1] 0.2

$splitrule
[1] "variance"

$replace
[1] TRUE

$formula
agb_mgha ~ .

LGB:

$learning_rate
[1] 0.05

$nrounds
[1] 100

$num_leaves
[1] 22

$max_depth
[1] -1

$extra_trees
[1] TRUE

$min_data_in_leaf
[1] 10

$bagging_fraction
[1] 0.8

$bagging_freq
[1] 1

$feature_fraction
[1] 1

$min_data_in_bin
[1] 8

$lambda_l1
[1] 0.1

$lambda_l2
[1] 4

$force_col_wise
[1] TRUE

SVM:

$x
agb_mgha ~ .

$kernel
[1] "laplacedot"

$type
[1] "eps-bsvr"

$kpar
$kpar$sigma
[1] 0.015625


$C
[1] 1

$epsilon
[1] 0.0625