Last iteration of these models: 2021-01-12
Metric | RF (ranger) | GBM (LightGBM) | SVM (kernlab) | Ensemble (model weighted) | Ensemble (RMSE weighted) |
---|---|---|---|---|---|
RMSE | 37.486 | 37.754 | 39.656 | 36.980 | 37.749 |
MBE | -2.545 | -3.071 | -7.043 | -2.299 | -4.260 |
R2 | 0.790 | 0.787 | 0.777 | 0.794 | 0.791 |
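The report does not show how these metrics were computed. As a minimal sketch, the three statistics could be defined as below; the helper names, the MBE sign convention (predicted minus observed) and the use of squared Pearson correlation for R2 are assumptions, not taken from the original analysis.

```r
# Hypothetical helpers; the definitions used in the report may differ.
rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))  # root mean squared error
mbe  <- function(obs, pred) mean(pred - obs)            # mean bias error; negative = under-prediction (assumed sign)
r2   <- function(obs, pred) cor(obs, pred)^2            # squared Pearson correlation (assumed definition)

# Toy observed and predicted values, for illustration only.
obs  <- c(10, 50, 120, 200)
pred <- c(12, 45, 110, 190)

c(RMSE = rmse(obs, pred), MBE = mbe(obs, pred), R2 = r2(obs, pred))
```

Under this sign convention, the negative MBE values in the table would indicate that every model under-predicts on average.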
summary(bind_rows(training, testing)$agb_mgha)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000 7.853 84.374 90.496 148.091 425.363
Across 1000 bootstrap iterations, our ensemble model had a mean RMSE of 38.001 \(\pm\) 0.387.
RMSE | Min | Median | Max |
---|---|---|---|
rf | 34.652 | 41.197 | 46.107 |
lgb | 35.094 | 41.210 | 47.370 |
svm | 34.466 | 41.907 | 47.654 |
ensemble | 34.345 | 40.879 | 45.824 |
R2 | Min | Median | Max |
---|---|---|---|
rf | 0.647 | 0.718 | 0.779 |
lgb | 0.645 | 0.719 | 0.781 |
svm | 0.653 | 0.716 | 0.782 |
ensemble | 0.656 | 0.726 | 0.785 |
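The bootstrap procedure behind these tables is not shown. One simple variant resamples observed and predicted values with replacement and recomputes RMSE each time. The sketch below uses synthetic data and hypothetical object names; the original analysis may instead have refit the models in every iteration.

```r
set.seed(123)

# Synthetic stand-in for observed and predicted biomass values.
n    <- 500
obs  <- runif(n, 0, 400)
pred <- obs + rnorm(n, sd = 40)

rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))

# Resample row indices with replacement and recompute RMSE each time.
n_boot <- 1000
boot_rmse <- vapply(seq_len(n_boot), function(i) {
  idx <- sample.int(n, n, replace = TRUE)
  rmse(obs[idx], pred[idx])
}, numeric(1))

# Summaries in the same shape as the tables above.
c(Min = min(boot_rmse), Median = median(boot_rmse), Max = max(boot_rmse))
c(mean = mean(boot_rmse), sd = sd(boot_rmse))
```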
lgb rf svm
0.3251130 0.3317681 0.3431190
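The named vector above sums to roughly 1, so it reads like a set of per-model ensemble weights, although the report does not say how it was derived or where it is used. Assuming it is applied as a weighted average of the three models' predictions, a sketch could look like this; the prediction matrix is invented for illustration.

```r
# Weights copied from the output above; their derivation is not shown in the report.
weights <- c(lgb = 0.3251130, rf = 0.3317681, svm = 0.3431190)

# Hypothetical per-model predictions (one row per plot), columns named like the weights.
preds <- cbind(rf = c(100, 20), lgb = c(110, 25), svm = c(90, 15))

# Align columns to the weight names, renormalise, and average across models per row.
ensemble_pred <- drop(preds[, names(weights)] %*% (weights / sum(weights)))
ensemble_pred
```

Indexing the columns by `names(weights)` guards against the models being stored in a different order than the weights.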
Call:
lm(formula = agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)
Residuals:
Min 1Q Median 3Q Max
-179.815 -20.295 -0.374 14.288 185.205
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.467e+00 5.775e-01 -2.541 0.011059 *
rf_pred 6.255e-01 9.170e-02 6.822 9.23e-12 ***
lgb_pred 4.825e-01 9.227e-02 5.229 1.72e-07 ***
svm_pred -1.321e-02 6.046e-02 -0.219 0.826981
rf_pred:lgb_pred -3.625e-03 4.577e-04 -7.920 2.48e-15 ***
rf_pred:svm_pred 1.117e-03 6.763e-04 1.651 0.098763 .
lgb_pred:svm_pred 1.355e-03 7.588e-04 1.786 0.074101 .
rf_pred:lgb_pred:svm_pred 5.164e-06 1.435e-06 3.600 0.000319 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 40.61 on 23692 degrees of freedom
Multiple R-squared: 0.7222, Adjusted R-squared: 0.7221
F-statistic: 8799 on 7 and 23692 DF, p-value: < 2.2e-16
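The summary above comes from a linear model that stacks the three component predictions with all interaction terms. A minimal sketch of fitting and applying the same formula follows; the `pred_values` data frame here is synthetic, so the coefficients will not match the output above.

```r
set.seed(42)

# Synthetic stand-in for pred_values: observed AGB plus three noisy model predictions.
n     <- 300
truth <- runif(n, 0, 400)
pred_values <- data.frame(
  agb_mgha = truth,
  rf_pred  = truth + rnorm(n, sd = 35),
  lgb_pred = truth + rnorm(n, sd = 35),
  svm_pred = truth + rnorm(n, sd = 40)
)

# Same formula as in the Call above: `*` expands to main effects plus all interactions.
stack_fit <- lm(agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)
names(coef(stack_fit))  # intercept, 3 main effects, 3 two-way terms, 1 three-way term

# Combine new component predictions into one ensemble prediction.
new_preds <- data.frame(rf_pred = 100, lgb_pred = 110, svm_pred = 90)
stack_pred <- predict(stack_fit, newdata = new_preds)
```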
Random forest:
$num.trees
[1] 750
$mtry
[1] 37
$min.node.size
[1] 7
$sample.fraction
[1] 0.2
$splitrule
[1] "variance"
$replace
[1] TRUE
$formula
agb_mgha ~ .
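For reference, the hyperparameters listed above map directly onto arguments of `ranger::ranger()`. The sketch below passes them to a fit on synthetic data; the `training` data frame is a made-up stand-in (with 40 predictors so that `mtry = 37` is valid), not the data used in the report.

```r
# Hyperparameters copied from the list above.
rf_params <- list(
  num.trees = 750, mtry = 37, min.node.size = 7,
  sample.fraction = 0.2, splitrule = "variance", replace = TRUE
)

# Synthetic stand-in for the training data; mtry = 37 needs at least 37 predictors.
set.seed(1)
training <- as.data.frame(matrix(rnorm(200 * 40), ncol = 40))
training$agb_mgha <- rnorm(200, mean = 90, sd = 40)

# Fit only if ranger is installed, so the sketch degrades gracefully.
if (requireNamespace("ranger", quietly = TRUE)) {
  rf_fit <- do.call(
    ranger::ranger,
    c(list(formula = agb_mgha ~ ., data = training), rf_params)
  )
  rf_pred <- predict(rf_fit, data = training)$predictions
}
```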
LightGBM:
$learning_rate
[1] 0.05
$nrounds
[1] 100
$num_leaves
[1] 22
$max_depth
[1] -1
$extra_trees
[1] TRUE
$min_data_in_leaf
[1] 10
$bagging_fraction
[1] 0.8
$bagging_freq
[1] 1
$feature_fraction
[1] 1
$min_data_in_bin
[1] 8
$lambda_l1
[1] 0.1
$lambda_l2
[1] 4
$force_col_wise
[1] TRUE
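These settings could be passed to the LightGBM R package roughly as follows. This is a sketch on synthetic data: the `objective` value is an assumption (the report does not list one), and `x` and `y` are invented stand-ins for the real predictors and response.

```r
# Parameters copied from the list above, plus an assumed objective.
lgb_params <- list(
  objective = "regression",  # assumption: not listed in the report
  learning_rate = 0.05, num_leaves = 22, max_depth = -1, extra_trees = TRUE,
  min_data_in_leaf = 10, bagging_fraction = 0.8, bagging_freq = 1,
  feature_fraction = 1, min_data_in_bin = 8, lambda_l1 = 0.1, lambda_l2 = 4,
  force_col_wise = TRUE
)
nrounds <- 100

# Synthetic predictors and response.
set.seed(1)
x <- matrix(rnorm(500 * 10), ncol = 10)
y <- 90 + 20 * x[, 1] + rnorm(500, sd = 10)

# Fit only if lightgbm is installed.
if (requireNamespace("lightgbm", quietly = TRUE)) {
  dtrain   <- lightgbm::lgb.Dataset(data = x, label = y)
  lgb_fit  <- lightgbm::lgb.train(params = lgb_params, data = dtrain,
                                  nrounds = nrounds, verbose = -1)
  lgb_pred <- predict(lgb_fit, x)
}
```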
SVM:
$x
agb_mgha ~ .
$kernel
[1] "laplacedot"
$type
[1] "eps-bsvr"
$kpar
$kpar$sigma
[1] 0.015625
$C
[1] 1
$epsilon
[1] 0.0625
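The SVM settings above correspond to arguments of `kernlab::ksvm()`. A minimal sketch of the equivalent call, again on a synthetic `training` data frame that stands in for the real one:

```r
# Synthetic stand-in for the training data.
set.seed(1)
training <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
training$agb_mgha <- 90 + 30 * training$x1 + rnorm(200, sd = 10)

# Fit only if kernlab is installed; attach it so its S4 predict method is found.
if (requireNamespace("kernlab", quietly = TRUE)) {
  library(kernlab)
  svm_fit <- ksvm(
    x = agb_mgha ~ ., data = training,
    kernel = "laplacedot",          # Laplacian kernel
    type = "eps-bsvr",              # bound-constraint epsilon regression
    kpar = list(sigma = 0.015625),  # kernel width from the list above
    C = 1, epsilon = 0.0625
  )
  svm_pred <- predict(svm_fit, training)
}
```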