Last iteration of these models: 2021-01-12
Metric | RF (ranger) | GBM (LightGBM) | SVM (kernlab) | Ensemble (model weighted) | Ensemble (RMSE weighted) |
---|---|---|---|---|---|
RMSE (Mg/ha) | 38.952 | 38.370 | 39.579 | 38.285 | 38.425 |
MBE (Mg/ha) | -3.773 | -4.054 | -2.524 | -3.355 | -3.457 |
R2 | 0.769 | 0.774 | 0.760 | 0.773 | 0.775 |
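
The three metrics in the table can be computed in a few lines of base R. This is a minimal sketch under assumed conventions, since the report does not show its definitions: MBE is taken as mean(predicted - observed), so a negative value means under-prediction, and R2 is taken as 1 - SSE/SST (some packages report the squared correlation instead). The function names and the toy vectors are hypothetical.

```r
# Hypothetical helpers; the conventions are assumptions, not taken from the report.
rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))
mbe  <- function(obs, pred) mean(pred - obs)  # assumed sign: negative = under-prediction
r2   <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

# Toy observed and predicted AGB values (Mg/ha), for illustration only.
obs  <- c(10, 50, 120, 200)
pred <- c(12, 45, 110, 190)

c(RMSE = rmse(obs, pred), MBE = mbe(obs, pred), R2 = r2(obs, pred))
```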
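Summary of the response, agb_mgha, over the combined training and testing data:
summary(bind_rows(training, testing)$agb_mgha)
Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
---|---|---|---|---|---|
0.000 | 6.163 | 83.870 | 90.180 | 147.950 | 425.363 |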
Across 1000 bootstrap iterations, our ensemble model had a mean RMSE of 38.559 \(\pm\) 0.331.
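
A bootstrap estimate like the one above can be sketched as follows. The report does not say what was resampled or whether the value after the plus-minus sign is a standard deviation, so this sketch assumes rows of an evaluation set are resampled with replacement and summarises the spread with the standard deviation. The data are synthetic stand-ins, not the report's data.

```r
set.seed(42)

# Synthetic stand-ins for observed AGB and ensemble predictions.
n    <- 500
obs  <- runif(n, 0, 400)
pred <- obs + rnorm(n, sd = 40)

rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))

# Assumed procedure: resample rows with replacement, recompute RMSE each time.
n_boot <- 1000
boot_rmse <- vapply(seq_len(n_boot), function(i) {
  idx <- sample.int(n, n, replace = TRUE)
  rmse(obs[idx], pred[idx])
}, numeric(1))

c(mean = mean(boot_rmse), sd = sd(boot_rmse))
```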
RMSE | Min | Median | Max |
---|---|---|---|
RF (ranger) | 33.793 | 41.432 | 47.405 |
GBM (LightGBM) | 34.060 | 40.667 | 45.793 |
SVM (kernlab) | 34.286 | 41.297 | 46.833 |
Ensemble | 33.287 | 40.617 | 46.139 |
R2 | Min | Median | Max |
---|---|---|---|
RF (ranger) | 0.659 | 0.719 | 0.791 |
GBM (LightGBM) | 0.667 | 0.731 | 0.788 |
SVM (kernlab) | 0.651 | 0.716 | 0.786 |
Ensemble | 0.665 | 0.728 | 0.797 |
lgb rf svm
0.3390358 0.3319147 0.3290495
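
The three values above sum to 1, which is consistent with normalised per-model weights. The report does not state how they were derived. One common scheme, shown here purely as an assumption, weights each model by its inverse RMSE and normalises. Applied to the RMSE values from the first table, it gives the same ordering (lgb, rf, svm) but not the printed numbers, so the report's weights presumably come from different RMSE values or a different formula. The prediction matrix is hypothetical.

```r
# RMSE values from the first table (RF, LightGBM, SVM).
rmse_by_model <- c(rf = 38.952, lgb = 38.370, svm = 39.579)

# Assumed scheme: inverse-RMSE weights, normalised to sum to 1.
w <- (1 / rmse_by_model) / sum(1 / rmse_by_model)

# Hypothetical predictions (Mg/ha) from each model for three plots.
preds <- cbind(rf = c(80, 150, 20), lgb = c(85, 140, 25), svm = c(90, 145, 15))

# Weighted average of the component predictions, matching columns to weights by name.
ensemble_pred <- as.vector(preds %*% w[colnames(preds)])
ensemble_pred
```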
Call:
lm(formula = agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)
Residuals:
Min | 1Q | Median | 3Q | Max |
---|---|---|---|---|
-184.79 | -20.49 | -0.15 | 12.67 | 201.44 |
Coefficients:
Term | Estimate | Std. Error | t value | Pr(>\|t\|) | Signif. |
---|---|---|---|---|---|
(Intercept) | -1.325e+00 | 6.689e-01 | -1.980 | 0.047660 | * |
rf_pred | 8.609e-01 | 8.655e-02 | 9.947 | < 2e-16 | *** |
lgb_pred | 1.396e-02 | 9.910e-02 | 0.141 | 0.887964 | |
svm_pred | 5.553e-02 | 5.223e-02 | 1.063 | 0.287645 | |
rf_pred:lgb_pred | -1.132e-03 | 4.017e-04 | -2.819 | 0.004825 | ** |
rf_pred:svm_pred | -4.392e-03 | 6.936e-04 | -6.333 | 2.45e-10 | *** |
lgb_pred:svm_pred | 7.045e-03 | 7.750e-04 | 9.090 | < 2e-16 | *** |
rf_pred:lgb_pred:svm_pred | -6.155e-06 | 1.657e-06 | -3.716 | 0.000203 | *** |

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 40.36 on 23792 degrees of freedom
Multiple R-squared: 0.7306, Adjusted R-squared: 0.7305
F-statistic: 9218 on 7 and 23792 DF, p-value: < 2.2e-16
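
The linear model summarised above combines the three component predictions, with all two-way and three-way interactions, into a single estimate. The sketch below refits the same formula on synthetic data to show how such a stacked model is fitted and applied. The data frame contents and new_preds are hypothetical stand-ins, and the coefficients it produces are unrelated to those in the report.

```r
set.seed(1)
n     <- 300
truth <- runif(n, 0, 400)

# Synthetic stand-in for pred_values: one column per component model's prediction.
pred_values <- data.frame(
  agb_mgha = truth,
  rf_pred  = truth + rnorm(n, sd = 40),
  lgb_pred = truth + rnorm(n, sd = 38),
  svm_pred = truth + rnorm(n, sd = 41)
)

# Same formula as in the Call above: main effects plus all interactions.
stack_fit <- lm(agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

# Combine a new set of component predictions into one ensemble estimate.
new_preds     <- data.frame(rf_pred = 100, lgb_pred = 95, svm_pred = 105)
stacked_value <- predict(stack_fit, newdata = new_preds)
stacked_value
```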
Random forest (ranger):
$num.trees
[1] 2000
$mtry
[1] 23
$min.node.size
[1] 11
$sample.fraction
[1] 0.2
$splitrule
[1] "maxstat"
$replace
[1] TRUE
$formula
agb_mgha ~ .
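
As a rough illustration of how the settings above map onto a ranger call, the sketch below stores them in a list and fits a model on synthetic data. The toy data frame is hypothetical and has 25 predictors only because mtry = 23 needs at least 23. The fit is skipped if ranger is not installed.

```r
# Arguments copied from the list above.
rf_args <- list(
  formula = agb_mgha ~ .,
  num.trees = 2000, mtry = 23, min.node.size = 11,
  sample.fraction = 0.2, splitrule = "maxstat", replace = TRUE
)

if (requireNamespace("ranger", quietly = TRUE)) {
  set.seed(1)
  # Synthetic data: 25 predictors plus the response column.
  toy <- as.data.frame(matrix(rnorm(500 * 25), ncol = 25))
  toy$agb_mgha <- 50 + 10 * toy$V1 + rnorm(500)

  rf_fit <- do.call(ranger::ranger, c(rf_args, list(data = toy)))
  print(head(predict(rf_fit, data = toy)$predictions))
}
```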
LightGBM:
$learning_rate
[1] 0.05
$nrounds
[1] 100
$num_leaves
[1] 13
$max_depth
[1] -1
$extra_trees
[1] TRUE
$min_data_in_leaf
[1] 10
$bagging_fraction
[1] 0.5
$bagging_freq
[1] 5
$feature_fraction
[1] 0.3
$min_data_in_bin
[1] 13
$lambda_l1
[1] 10
$lambda_l2
[1] 2
$force_col_wise
[1] TRUE
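
The LightGBM settings above could be passed to lgb.train roughly as follows. This is a sketch on synthetic data, not the report's training code. The objective is not shown in the printed list, so "regression" is an assumption, and nrounds is passed as a separate argument rather than inside params. The fit is skipped if lightgbm is not installed.

```r
# Parameters copied from the list above; objective is an assumption.
lgb_params <- list(
  objective = "regression",
  learning_rate = 0.05, num_leaves = 13, max_depth = -1,
  extra_trees = TRUE, min_data_in_leaf = 10,
  bagging_fraction = 0.5, bagging_freq = 5, feature_fraction = 0.3,
  min_data_in_bin = 13, lambda_l1 = 10, lambda_l2 = 2,
  force_col_wise = TRUE
)
nrounds <- 100

if (requireNamespace("lightgbm", quietly = TRUE)) {
  set.seed(1)
  # Synthetic predictors and response, for illustration only.
  X <- matrix(rnorm(500 * 10), ncol = 10)
  y <- 50 + 10 * X[, 1] + rnorm(500)

  dtrain  <- lightgbm::lgb.Dataset(data = X, label = y)
  lgb_fit <- lightgbm::lgb.train(params = lgb_params, data = dtrain,
                                 nrounds = nrounds, verbose = -1)
  print(head(predict(lgb_fit, X)))
}
```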
SVM (kernlab):
$x
agb_mgha ~ .
$kernel
[1] "laplacedot"
$type
[1] "eps-svr"
$kpar
$kpar$sigma
[1] 0.00390625
$C
[1] 9
$epsilon
[1] 0.25
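
The SVM settings above correspond to arguments of kernlab's ksvm, where x is the model formula. The sketch below fits an epsilon-regression SVM with the Laplacian kernel on synthetic data. The toy data frame is hypothetical, and the fit is skipped if kernlab is not installed. Note that sigma = 0.00390625 equals 2^-8.

```r
# Arguments copied from the list above.
svm_args <- list(
  x = agb_mgha ~ .,
  kernel = "laplacedot", type = "eps-svr",
  kpar = list(sigma = 0.00390625),
  C = 9, epsilon = 0.25
)

if (requireNamespace("kernlab", quietly = TRUE)) {
  set.seed(1)
  # Synthetic data: five predictors plus the response column.
  toy <- as.data.frame(matrix(rnorm(200 * 5), ncol = 5))
  toy$agb_mgha <- 50 + 10 * toy$V1 + rnorm(200)

  svm_fit <- do.call(kernlab::ksvm, c(svm_args, list(data = toy)))
  print(head(kernlab::predict(svm_fit, toy)))
}
```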