Last iteration of these models: 2021-01-12

| | RF (ranger) | GBM (LightGBM) | SVM (kernlab) | Ensemble (model weighted) | Ensemble (RMSE weighted) |
|---|---|---|---|---|---|
| RMSE | 37.486 | 37.754 | 39.656 | 36.980 | 37.749 | 
| MBE | -2.545 | -3.071 | -7.043 | -2.299 | -4.260 | 
| R2 | 0.790 | 0.787 | 0.777 | 0.794 | 0.791 | 
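
For reference, the three metrics in the table above could be computed as in the minimal sketch below. The report does not show its metric code, so the definitions are assumptions: MBE is taken as the mean of predicted minus observed (negative values then mean under-prediction), and R2 as the squared Pearson correlation. `eval_metrics` is a hypothetical helper name and the vectors are toy values.

```r
# Hypothetical helper; the report's own metric code is not shown.
eval_metrics <- function(obs, pred) {
  err <- pred - obs              # assumed sign convention: predicted minus observed
  c(
    RMSE = sqrt(mean(err^2)),    # root mean squared error
    MBE  = mean(err),            # mean bias error
    R2   = cor(obs, pred)^2      # assumed definition: squared Pearson correlation
  )
}

# Toy values only, not the report's data
obs  <- c(0, 10, 80, 150, 400)
pred <- c(5, 8, 90, 140, 380)
eval_metrics(obs, pred)
```

If R2 was instead computed as 1 - SSE/SST, the third line of the helper would need to change accordingly.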

    summary(bind_rows(training, testing)$agb_mgha)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      0.000   7.853  84.374  90.496 148.091 425.363 

Across 1000 bootstrap iterations, our ensemble model had a mean RMSE of 38.001 ± 0.387.
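
One way a bootstrap summary of this kind could be produced is sketched below. It assumes that rows of a held-out set were resampled with replacement and that the value after ± is the standard deviation across iterations; neither detail is stated in this report. `boot_rmse` and the toy vectors are hypothetical.

```r
# Hypothetical sketch: bootstrap the RMSE of a fixed set of predictions.
boot_rmse <- function(obs, pred, n_iter = 1000) {
  vapply(seq_len(n_iter), function(i) {
    idx <- sample.int(length(obs), replace = TRUE)  # resample rows with replacement
    sqrt(mean((pred[idx] - obs[idx])^2))            # RMSE on the resampled rows
  }, numeric(1))
}

set.seed(1)
obs  <- runif(500, 0, 400)           # toy observed values, not the report's data
pred <- obs + rnorm(500, sd = 40)    # toy predictions
rmse <- boot_rmse(obs, pred)
c(mean = mean(rmse), sd = sd(rmse))  # assumed form of the "mean ± spread" summary
```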

| RMSE | Min | Median | Max |
|---|---|---|---|
| RF | 34.652 | 41.197 | 46.107 |
| LGB | 35.094 | 41.210 | 47.370 |
| SVM | 34.466 | 41.907 | 47.654 |
| Ensemble | 34.345 | 40.879 | 45.824 |

| R2 | Min | Median | Max |
|---|---|---|---|
| RF | 0.647 | 0.718 | 0.779 |
| LGB | 0.645 | 0.719 | 0.781 |
| SVM | 0.653 | 0.716 | 0.782 |
| Ensemble | 0.656 | 0.726 | 0.785 |


          lgb        rf       svm 
    0.3251130 0.3317681 0.3431190 

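
The three values printed above sum to 1 (up to rounding), which is consistent with per-model weights. Assuming they are the weights of the RMSE-weighted ensemble, which this report does not state explicitly, the combination could look like the sketch below. `weighted_ensemble` and the toy prediction matrix are hypothetical.

```r
# Weights as printed above; their role in the RMSE-weighted ensemble is an assumption.
w <- c(lgb = 0.3251130, rf = 0.3317681, svm = 0.3431190)

# Hypothetical helper: row-wise weighted mean of the component predictions.
weighted_ensemble <- function(preds, w) {
  stopifnot(all(names(w) %in% colnames(preds)))
  # Reorder columns to match the weights and renormalise defensively
  as.vector(preds[, names(w), drop = FALSE] %*% (w / sum(w)))
}

# Toy predictions for two plots, not the report's data
preds <- cbind(rf = c(10, 100), lgb = c(12, 95), svm = c(8, 110))
weighted_ensemble(preds, w)
```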

    Call:
    lm(formula = agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

    Residuals:
         Min       1Q   Median       3Q      Max 
    -179.815  -20.295   -0.374   14.288  185.205 

    Coefficients:
                                Estimate Std. Error t value Pr(>|t|)    
    (Intercept)               -1.467e+00  5.775e-01  -2.541 0.011059 *  
    rf_pred                    6.255e-01  9.170e-02   6.822 9.23e-12 ***
    lgb_pred                   4.825e-01  9.227e-02   5.229 1.72e-07 ***
    svm_pred                  -1.321e-02  6.046e-02  -0.219 0.826981    
    rf_pred:lgb_pred          -3.625e-03  4.577e-04  -7.920 2.48e-15 ***
    rf_pred:svm_pred           1.117e-03  6.763e-04   1.651 0.098763 .  
    lgb_pred:svm_pred          1.355e-03  7.588e-04   1.786 0.074101 .  
    rf_pred:lgb_pred:svm_pred  5.164e-06  1.435e-06   3.600 0.000319 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 40.61 on 23692 degrees of freedom
    Multiple R-squared:  0.7222,    Adjusted R-squared:  0.7221 
    F-statistic:  8799 on 7 and 23692 DF,  p-value: < 2.2e-16

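
The call above regresses observed biomass on the three component predictions and all of their interactions. A minimal sketch of fitting and applying a model of this form follows. The column names match the formula in the summary, but `pred_values` here is simulated toy data, and applying the fit through `predict()` is an assumption about how the model-weighted ensemble is used.

```r
set.seed(42)
n <- 300
truth <- runif(n, 0, 400)

# Toy stand-in for pred_values: three noisy "model predictions" of the same target
pred_values <- data.frame(
  agb_mgha = truth,
  rf_pred  = truth + rnorm(n, sd = 35),
  lgb_pred = truth + rnorm(n, sd = 35),
  svm_pred = truth + rnorm(n, sd = 40)
)

# Same formula as in the summary: main effects plus all two- and three-way interactions
ensemble_lm <- lm(agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

# Assumed usage: combine new component predictions into one ensemble prediction
new_preds <- data.frame(rf_pred = 100, lgb_pred = 110, svm_pred = 95)
predict(ensemble_lm, newdata = new_preds)
```

Because the formula expands to an intercept plus seven terms, the fitted model has eight coefficients, matching the coefficient table in the summary.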
Random forest:

    $num.trees
    [1] 750
    $mtry
    [1] 37
    $min.node.size
    [1] 7
    $sample.fraction
    [1] 0.2
    $splitrule
    [1] "variance"
    $replace
    [1] TRUE
    $formula
    agb_mgha ~ .

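
As a rough illustration, the random forest settings listed above could be passed to `ranger::ranger()` as sketched below. The toy data frame is simulated (40 predictors, so that `mtry = 37` is valid) and stands in for the report's training data, which is not shown; fitting through `do.call()` is an assumption about how the argument list is used.

```r
# Toy data with 40 predictors so that mtry = 37 is valid; not the report's data
set.seed(1)
toy <- as.data.frame(matrix(rnorm(500 * 40), ncol = 40))
toy$agb_mgha <- 50 + 20 * toy$V1 + rnorm(500)

# Argument list mirroring the values printed above
rf_args <- list(
  formula = agb_mgha ~ ., num.trees = 750, mtry = 37, min.node.size = 7,
  sample.fraction = 0.2, splitrule = "variance", replace = TRUE
)

# Only fit if the ranger package is installed
if (requireNamespace("ranger", quietly = TRUE)) {
  rf_fit <- do.call(ranger::ranger, c(rf_args, list(data = toy)))
  head(predict(rf_fit, data = toy)$predictions)
}
```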
LGB:

    $learning_rate
    [1] 0.05
    $nrounds
    [1] 100
    $num_leaves
    [1] 22
    $max_depth
    [1] -1
    $extra_trees
    [1] TRUE
    $min_data_in_leaf
    [1] 10
    $bagging_fraction
    [1] 0.8
    $bagging_freq
    [1] 1
    $feature_fraction
    [1] 1
    $min_data_in_bin
    [1] 8
    $lambda_l1
    [1] 0.1
    $lambda_l2
    [1] 4
    $force_col_wise
    [1] TRUE

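
A hedged sketch of how the LightGBM settings listed above might be used with the `lightgbm` R package follows. The `objective = "regression"` entry is an assumption (no objective appears in the listing), `nrounds` is passed separately from the other parameters, and the matrix `x` and response `y` are simulated toy data.

```r
# Parameters mirroring the values printed above
lgb_params <- list(
  objective = "regression",  # assumption: not shown in the listing
  learning_rate = 0.05, num_leaves = 22, max_depth = -1, extra_trees = TRUE,
  min_data_in_leaf = 10, bagging_fraction = 0.8, bagging_freq = 1,
  feature_fraction = 1, min_data_in_bin = 8, lambda_l1 = 0.1, lambda_l2 = 4,
  force_col_wise = TRUE
)
lgb_nrounds <- 100

# Only fit if the lightgbm package is installed
if (requireNamespace("lightgbm", quietly = TRUE)) {
  set.seed(1)
  x <- matrix(rnorm(500 * 10), ncol = 10)   # toy predictors, not the report's data
  y <- 50 + 20 * x[, 1] + rnorm(500)        # toy response
  dtrain  <- lightgbm::lgb.Dataset(data = x, label = y)
  lgb_fit <- lightgbm::lgb.train(
    params = lgb_params, data = dtrain, nrounds = lgb_nrounds, verbose = -1
  )
  head(predict(lgb_fit, x))
}
```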
SVM:

    $x
    agb_mgha ~ .
    $kernel
    [1] "laplacedot"
    $type
    [1] "eps-bsvr"
    $kpar
    $kpar$sigma
    [1] 0.015625
    $C
    [1] 1
    $epsilon
    [1] 0.0625

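
The SVM settings listed above map onto arguments of `kernlab::ksvm()`, where `x` is the model formula. The sketch below shows one plausible call under that reading; the toy data frame is simulated and is not the report's training data.

```r
# Kernel parameter as printed above (0.015625 equals 2^-6)
svm_kpar <- list(sigma = 0.015625)

# Only fit if the kernlab package is installed
if (requireNamespace("kernlab", quietly = TRUE)) {
  set.seed(1)
  toy_svm <- data.frame(x1 = rnorm(200), x2 = rnorm(200))  # toy predictors
  toy_svm$agb_mgha <- 50 + 20 * toy_svm$x1 + rnorm(200)    # toy response
  svm_fit <- kernlab::ksvm(
    x = agb_mgha ~ ., data = toy_svm,
    kernel = "laplacedot",  # Laplacian kernel
    type = "eps-bsvr",      # bound-constraint epsilon regression
    kpar = svm_kpar, C = 1, epsilon = 0.0625
  )
  head(kernlab::predict(svm_fit, toy_svm))
}
```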