Model Updates

Mike Mahoney
2021-01-06

Evaluation Results

Last iteration of these models: 2020-12-31

Change Summary

Metric  RF (ranger)  GBM (LightGBM)  SVM (kernlab)  Ensemble (model weighted)  Ensemble (RMSE weighted)
RMSE         38.617          38.496         38.574                     37.546                    37.798
MBE          -1.313          -0.878         -5.357                     -2.230                    -2.547
R2            0.761           0.761          0.768                      0.774                     0.773
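For reference, the three metrics in this table could be computed along the following lines. This is a minimal sketch, not the code behind the report: the helper names `rmse`, `mbe`, and `r2` are hypothetical, the MBE sign convention (predicted minus observed) is an assumption, and R2 is assumed to be 1 - SSE/SST rather than a squared correlation.

```r
# Hypothetical helpers; the report does not show its own metric definitions.
rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))

# Assumed sign convention: a negative MBE means under-prediction on average.
mbe <- function(obs, pred) mean(pred - obs)

# Assumed definition: 1 - SSE/SST (the report may use squared correlation instead).
r2 <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

# Toy values for illustration only, not data from the report.
obs  <- c(10, 50, 90, 130)
pred <- c(12, 45, 95, 120)

metrics <- c(RMSE = rmse(obs, pred), MBE = mbe(obs, pred), R2 = r2(obs, pred))
metrics
```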

AGB Distribution

summary(bind_rows(training, testing)$agb_mgha)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   9.645  86.795  91.792 148.679 425.363 

Bootstrapping Results

Across 1000 bootstrap iterations, our ensemble model had a mean RMSE of 37.97 \(\pm\) 0.355.
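A bootstrap estimate of this kind could be sketched as follows. The data here are simulated stand-ins, and the procedure (resampling observed/predicted pairs, without refitting any model) is an assumption; the report does not state which bootstrap scheme was used or whether the reported spread is a standard deviation, a standard error, or an interval half-width.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Simulated stand-ins for observed AGB and ensemble predictions (not the report's data).
n    <- 500
obs  <- runif(n, 0, 400)
pred <- obs + rnorm(n, sd = 38)

rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))

# Resample rows with replacement and recompute RMSE on each resample.
n_boot <- 1000
boot_rmse <- vapply(seq_len(n_boot), function(i) {
  idx <- sample.int(n, n, replace = TRUE)
  rmse(obs[idx], pred[idx])
}, numeric(1))

# Summarize the resampled RMSE values; the spread is taken here as a
# standard deviation, which is an assumption.
boot_summary <- c(mean = mean(boot_rmse), sd = sd(boot_rmse))
boot_summary
```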

RMSE Distribution

Plot Errors

Validation Results

RMSE         Min  Median     Max
rf        34.981  39.895  45.285
lgb       36.127  40.427  46.676
svm       35.101  39.619  45.833
ensemble  35.316  39.255  44.952

R2           Min  Median     Max
rf         0.684   0.740   0.792
lgb        0.661   0.732   0.791
svm        0.674   0.750   0.808
ensemble   0.684   0.749   0.801

Metadata

Ensembles

      lgb        rf       svm 
0.3288537 0.3307084 0.3404378 
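The weights above sum to 1 and could be applied to the component predictions as a simple weighted average. The sketch below also shows one common way such weights are derived, normalized inverse RMSE; that scheme is an assumption, since the report does not state how its weights were computed. The function name `inverse_rmse_weights`, the example RMSE values, and the prediction matrix are all hypothetical.

```r
# Assumed scheme: weight each model by 1 / RMSE, normalized to sum to 1.
inverse_rmse_weights <- function(rmse) (1 / rmse) / sum(1 / rmse)

# Hypothetical RMSE values chosen so the arithmetic is easy to follow.
w_example <- inverse_rmse_weights(c(lgb = 40, rf = 40, svm = 20))

# Weights as printed in this report.
reported <- c(lgb = 0.3288537, rf = 0.3307084, svm = 0.3404378)

# Hypothetical per-model predictions for three plots.
preds <- cbind(
  lgb = c(80, 120, 15),
  rf  = c(85, 110, 10),
  svm = c(78, 125, 12)
)

# Weighted average of the component predictions, matching columns by name.
ensemble_pred <- as.vector(preds[, names(reported)] %*% reported)
ensemble_pred
```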

Call:
lm(formula = agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

Residuals:
     Min       1Q   Median       3Q      Max 
-140.574  -20.252   -0.105   12.575  211.587 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -8.703e-01  5.548e-01  -1.569   0.1167    
rf_pred                    3.020e-01  6.630e-02   4.556 5.25e-06 ***
lgb_pred                  -4.926e-03  6.565e-02  -0.075   0.9402    
svm_pred                   7.300e-01  5.500e-02  13.272  < 2e-16 ***
rf_pred:lgb_pred           7.808e-04  3.958e-04   1.973   0.0485 *  
rf_pred:svm_pred          -8.177e-04  4.952e-04  -1.651   0.0987 .  
lgb_pred:svm_pred         -1.133e-04  5.587e-04  -0.203   0.8393    
rf_pred:lgb_pred:svm_pred  9.406e-07  1.291e-06   0.729   0.4662    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 39.31 on 23792 degrees of freedom
Multiple R-squared:  0.7489,    Adjusted R-squared:  0.7489 
F-statistic: 1.014e+04 on 7 and 23792 DF,  p-value: < 2.2e-16
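The model-weighted ensemble is a linear regression on the three component predictions with all interactions, as the `Call:` line shows. A minimal sketch of fitting and using such a model follows; the `pred_values` data frame here is simulated, so the coefficients will not match the summary above.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Simulated stand-in for pred_values: observed AGB plus three noisy predictions.
n <- 300
truth <- runif(n, 0, 400)
pred_values <- data.frame(
  agb_mgha = truth,
  rf_pred  = truth + rnorm(n, sd = 40),
  lgb_pred = truth + rnorm(n, sd = 40),
  svm_pred = truth + rnorm(n, sd = 40)
)

# Same formula as in the report: main effects plus all two- and three-way interactions.
stack_fit <- lm(agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

# Intercept + 3 main effects + 3 two-way + 1 three-way interaction = 8 coefficients.
length(coef(stack_fit))

# Combine new component predictions (hypothetical values) into one ensemble prediction.
new_preds <- data.frame(rf_pred = 100, lgb_pred = 95, svm_pred = 105)
new_fit <- predict(stack_fit, newdata = new_preds)
new_fit
```

With 8 coefficients, the 23792 residual degrees of freedom in the summary above correspond to 23800 rows in `pred_values`.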

\(n\) and \(p\)

Component Models

RF (ranger)

$num.trees
[1] 750

$mtry
[1] 20

$min.node.size
[1] 1

$sample.fraction
[1] 0.5

$splitrule
[1] "maxstat"

$replace
[1] TRUE

$formula
agb_mgha ~ .

GBM (LightGBM)

$learning_rate
[1] 0.1

$nrounds
[1] 50

$num_leaves
[1] 5

$max_depth
[1] -1

$extra_trees
[1] TRUE

$min_data_in_leaf
[1] 10

$bagging_fraction
[1] 0.3

$bagging_freq
[1] 1

$feature_fraction
[1] 0.5

$min_data_in_bin
[1] 24

$lambda_l1
[1] 0

$lambda_l2
[1] 0.1

$force_col_wise
[1] TRUE

SVM (kernlab)

$x
agb_mgha ~ .

$kernel
[1] "laplacedot"

$type
[1] "nu-svr"

$kpar
$kpar$sigma
[1] 0.001953125


$C
[1] 64

$epsilon
[1] 0.001953125

$nu
[1] 1