Save SVM

Mike Mahoney
2021-01-29

Evaluation Results

Last iteration of these models: 2021-01-12

Change Summary

Metric  RF (ranger)  GBM (LightGBM)  SVM (kernlab)  Ensemble (model weighted)  Ensemble (RMSE weighted)
RMSE         38.952          38.370         39.579                     38.285                    38.425
MBE          -3.773          -4.054         -2.524                     -3.355                    -3.457
R2            0.769           0.774          0.760                      0.773                     0.775
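
For readers who want to recompute these metrics, the following is a minimal sketch in base R. The helper names and the toy vectors are hypothetical, and two definitions are assumptions rather than confirmed details of this report: MBE is taken as the mean of predicted minus observed, and R2 as one minus the ratio of residual to total sum of squares.

```r
# Hypothetical helpers; the definitions are assumptions, not taken from the report.
rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))
mbe  <- function(obs, pred) mean(pred - obs)  # assumed sign: predicted minus observed
r2   <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

# Toy values for illustration only (not from the AGB data).
obs  <- c(10, 50, 90, 130)
pred <- c(12, 45, 95, 120)

metrics <- c(RMSE = rmse(obs, pred), MBE = mbe(obs, pred), R2 = r2(obs, pred))
print(round(metrics, 3))
```

Under the assumed sign convention, a negative MBE such as those in the table would indicate that a model under-predicts on average.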

AGB Distribution

summary(bind_rows(training, testing)$agb_mgha)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   6.163  83.870  90.180 147.950 425.363 

Bootstrapping Results

Across 1000 bootstrap iterations, our ensemble model had a mean RMSE of 38.559 \(\pm\) 0.331.
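
A bootstrap estimate of this kind could be produced roughly as follows. This is a sketch under stated assumptions: it resamples observation/prediction pairs with replacement and recomputes RMSE each time, which may differ from the procedure actually used here, and the data are simulated stand-ins. Whether the reported \(\pm\) value is a standard deviation or another interval is not stated in the report; the sketch reports the standard deviation.

```r
set.seed(42)

# Simulated stand-in data (hypothetical, not the AGB test set).
n    <- 500
obs  <- runif(n, min = 0, max = 400)
pred <- obs + rnorm(n, sd = 40)

# Resample pairs with replacement and recompute RMSE on each resample.
n_boot <- 1000
boot_rmse <- replicate(n_boot, {
  idx <- sample.int(n, size = n, replace = TRUE)
  sqrt(mean((pred[idx] - obs[idx])^2))
})

# Summarise the bootstrap distribution of RMSE.
boot_summary <- c(mean = mean(boot_rmse), sd = sd(boot_rmse))
print(round(boot_summary, 3))
```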

RMSE Distribution

Plot Errors

Validation Results

RMSE      Min     Median  Max
rf        33.793  41.432  47.405
lgb       34.060  40.667  45.793
svm       34.286  41.297  46.833
ensemble  33.287  40.617  46.139

R2        Min     Median  Max
rf        0.659   0.719   0.791
lgb       0.667   0.731   0.788
svm       0.651   0.716   0.786
ensemble  0.665   0.728   0.797

Metadata

Ensembles

      lgb        rf       svm 
0.3390358 0.3319147 0.3290495 
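
The weights above sum to 1 and are ordered inversely to the component RMSEs, which is consistent with an inverse-RMSE weighting scheme. The sketch below illustrates that scheme as an assumption only: the report does not state the formula, and applying it to the test-set RMSEs from the Change Summary is not expected to reproduce the printed weights exactly, since the original weights may come from a different data split. The `preds` values are hypothetical.

```r
# Test-set RMSEs taken from the Change Summary table.
component_rmse <- c(lgb = 38.370, rf = 38.952, svm = 39.579)

# Assumed scheme: weights proportional to 1 / RMSE, normalised to sum to 1.
w <- (1 / component_rmse) / sum(1 / component_rmse)
print(round(w, 4))

# Hypothetical component predictions for a single plot (Mg/ha).
preds <- c(lgb = 100, rf = 110, svm = 95)

# Weighted ensemble prediction; index by name so the order cannot drift.
ensemble_pred <- sum(w[names(preds)] * preds)
print(ensemble_pred)
```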

Call:
lm(formula = agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

Residuals:
    Min      1Q  Median      3Q     Max 
-184.79  -20.49   -0.15   12.67  201.44 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -1.325e+00  6.689e-01  -1.980 0.047660 *  
rf_pred                    8.609e-01  8.655e-02   9.947  < 2e-16 ***
lgb_pred                   1.396e-02  9.910e-02   0.141 0.887964    
svm_pred                   5.553e-02  5.223e-02   1.063 0.287645    
rf_pred:lgb_pred          -1.132e-03  4.017e-04  -2.819 0.004825 ** 
rf_pred:svm_pred          -4.392e-03  6.936e-04  -6.333 2.45e-10 ***
lgb_pred:svm_pred          7.045e-03  7.750e-04   9.090  < 2e-16 ***
rf_pred:lgb_pred:svm_pred -6.155e-06  1.657e-06  -3.716 0.000203 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 40.36 on 23792 degrees of freedom
Multiple R-squared:  0.7306,    Adjusted R-squared:  0.7305 
F-statistic:  9218 on 7 and 23792 DF,  p-value: < 2.2e-16
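
The model summarised above is a linear regression of observed AGB on the three component predictions with all two-way and three-way interactions, as given by the `Call:` line. A minimal sketch of fitting and using such a stacking model follows; the variable names match the call, but the data are simulated stand-ins, so the coefficients will not match those reported.

```r
set.seed(1)

# Simulated stand-in for pred_values (hypothetical data, same column names).
n   <- 300
agb <- runif(n, min = 0, max = 400)
pred_values <- data.frame(
  agb_mgha = agb,
  rf_pred  = agb + rnorm(n, sd = 40),
  lgb_pred = agb + rnorm(n, sd = 40),
  svm_pred = agb + rnorm(n, sd = 40)
)

# `a * b * c` expands to main effects plus all two- and three-way interactions.
ensemble_fit <- lm(agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

# Intercept + 3 main effects + 3 two-way + 1 three-way interaction terms.
print(names(coef(ensemble_fit)))

# Ensemble predictions are the fitted values of the stacking model.
ensemble_pred <- predict(ensemble_fit, newdata = pred_values)
```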

Coverages

\(n\) and \(p\)

Component Models

Random forest:

$num.trees
[1] 2000

$mtry
[1] 23

$min.node.size
[1] 11

$sample.fraction
[1] 0.2

$splitrule
[1] "maxstat"

$replace
[1] TRUE

$formula
agb_mgha ~ .
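
The settings listed above correspond to arguments of `ranger::ranger()`. As a rough sketch of how they might be passed, assuming a data frame with `agb_mgha` as the response: the `toy` data below is simulated only so that the call runs (25 predictors, so that `mtry = 23` is valid) and is not the original training set.

```r
set.seed(123)

# Simulated stand-in data (hypothetical): 25 predictors plus the response.
n   <- 300
toy <- as.data.frame(matrix(rnorm(n * 25), nrow = n))
toy$agb_mgha <- pmax(0, 90 + 40 * toy$V1 + rnorm(n, sd = 20))

rf_fit <- NULL
# Guarded so the script still runs where ranger is not installed.
if (requireNamespace("ranger", quietly = TRUE)) {
  rf_fit <- ranger::ranger(
    formula         = agb_mgha ~ .,
    data            = toy,
    num.trees       = 2000,
    mtry            = 23,
    min.node.size   = 11,
    sample.fraction = 0.2,
    splitrule       = "maxstat",
    replace         = TRUE
  )
  # Predictions are returned in the $predictions element.
  rf_pred <- predict(rf_fit, data = toy)$predictions
}
```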

LGB:

$learning_rate
[1] 0.05

$nrounds
[1] 100

$num_leaves
[1] 13

$max_depth
[1] -1

$extra_trees
[1] TRUE

$min_data_in_leaf
[1] 10

$bagging_fraction
[1] 0.5

$bagging_freq
[1] 5

$feature_fraction
[1] 0.3

$min_data_in_bin
[1] 13

$lambda_l1
[1] 10

$lambda_l2
[1] 2

$force_col_wise
[1] TRUE
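
These values could be supplied to LightGBM's R interface roughly as follows. This is a sketch, not the original training code: `nrounds` is passed to `lgb.train()` separately from the parameter list, the `objective` is an assumption (it does not appear in the listing), and the data are simulated stand-ins.

```r
set.seed(123)

# Simulated stand-in data (hypothetical): numeric matrix of predictors and a response.
n <- 300
X <- matrix(rnorm(n * 25), nrow = n)
y <- pmax(0, 90 + 40 * X[, 1] + rnorm(n, sd = 20))

lgb_params <- list(
  objective        = "regression",  # assumption: not shown in the listing
  learning_rate    = 0.05,
  num_leaves       = 13,
  max_depth        = -1,            # -1 means no depth limit
  extra_trees      = TRUE,
  min_data_in_leaf = 10,
  bagging_fraction = 0.5,
  bagging_freq     = 5,
  feature_fraction = 0.3,
  min_data_in_bin  = 13,
  lambda_l1        = 10,
  lambda_l2        = 2,
  force_col_wise   = TRUE
)

lgb_fit <- NULL
# Guarded so the script still runs where lightgbm is not installed.
if (requireNamespace("lightgbm", quietly = TRUE)) {
  dtrain  <- lightgbm::lgb.Dataset(data = X, label = y)
  lgb_fit <- lightgbm::lgb.train(
    params  = lgb_params,
    data    = dtrain,
    nrounds = 100,
    verbose = -1
  )
  lgb_pred <- predict(lgb_fit, X)
}
```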

SVM:

$x
agb_mgha ~ .

$kernel
[1] "laplacedot"

$type
[1] "eps-svr"

$kpar
$kpar$sigma
[1] 0.00390625


$C
[1] 9

$epsilon
[1] 0.25
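
The SVM settings above map onto `kernlab::ksvm()`, whose first argument `x` accepts a formula. A minimal sketch with simulated stand-in data follows; the `toy` data frame is hypothetical, and note that the listed `sigma` is exactly 2^-8.

```r
set.seed(123)

# Simulated stand-in data (hypothetical, not the AGB training set).
n   <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$agb_mgha <- pmax(0, 90 + 40 * toy$x1 + rnorm(n, sd = 20))

sigma <- 2^-8  # equals 0.00390625, the value listed above

svm_fit <- NULL
# Guarded so the script still runs where kernlab is not installed.
if (requireNamespace("kernlab", quietly = TRUE)) {
  svm_fit <- kernlab::ksvm(
    agb_mgha ~ .,
    data    = toy,
    kernel  = "laplacedot",        # Laplacian kernel
    type    = "eps-svr",           # epsilon-insensitive regression
    kpar    = list(sigma = sigma),
    C       = 9,
    epsilon = 0.25
  )
  # predict() on a ksvm regression returns a one-column matrix.
  svm_pred <- as.numeric(kernlab::predict(svm_fit, toy))
}
```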