The Big Tune

Holdout set accuracy for the Big Tune, 2021-02-03

Mike Mahoney
2021-02-03

Evaluation Results

Last iteration of these models: 2021-01-28

Change Summary

Metric        RF (ranger)  GBM (LightGBM)  SVM (kernlab)  Ensemble (model weighted)  Ensemble (RMSE weighted)
RMSE (Mg/ha)       36.292          35.721         36.041                     35.479                     35.218
MBE (Mg/ha)         4.190           3.189          1.136                      3.206                      2.854
R2                  0.783           0.789          0.783                      0.791                      0.794
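
For reference, the three metrics above can be computed from holdout observations and predictions along these lines (a minimal sketch; obs and pred are placeholder vectors, and the prediction-minus-observation sign convention for MBE is an assumption, not confirmed above):

# Sketch of the reported accuracy metrics; obs and pred are placeholder
# vectors of observed and predicted AGB (Mg/ha) on the holdout set.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
mbe <- function(obs, pred) mean(pred - obs)  # sign convention assumed
r2 <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)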

AGB Distribution

summary(bind_rows(training, testing)$agb_mgha)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   7.096  86.038  91.674 149.404 425.363 

Bootstrapping Results

Across 1000 bootstrap iterations, our ensemble model had a mean RMSE of 35.325 ± 0.343.
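
A rough sketch of one way to produce that estimate, resampling the holdout set with replacement and recomputing RMSE on each resample (testing, ensemble_pred, and the seed are placeholders; whether this matches the exact procedure used here is an assumption):

# Bootstrap the holdout RMSE: resample rows with replacement, recompute
# RMSE on each resample, then summarise across iterations.
set.seed(123)  # placeholder seed
boot_rmse <- vapply(
  seq_len(1000),
  function(i) {
    idx <- sample(nrow(testing), replace = TRUE)
    sqrt(mean((testing$agb_mgha[idx] - ensemble_pred[idx])^2))
  },
  numeric(1)
)
c(mean = mean(boot_rmse), sd = sd(boot_rmse))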

RMSE Distribution

Plot Errors

Validation Results

RMSE (Mg/ha)     Min   Median      Max
rf            34.151   40.051   45.617
lgb           33.807   40.151   46.160
svm           35.718   41.305   47.601
ensemble      33.932   39.867   45.522

R2               Min   Median      Max
rf             0.692    0.749    0.792
lgb            0.685    0.745    0.791
svm            0.644    0.730    0.780
ensemble       0.695    0.750    0.792
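
These min/median/max values summarise metrics computed per validation fold; a dplyr sketch of that tabulation, assuming a hypothetical fold_metrics data frame with one row per fold and model and an rmse column (the validation scheme itself is not restated here):

library(dplyr)
# Summarise fold-level RMSE by model; fold_metrics is hypothetical.
fold_metrics %>%
  group_by(model) %>%
  summarise(Min = min(rmse), Median = median(rmse), Max = max(rmse))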

Metadata

Ensembles

      lgb        rf       svm 
0.3362089 0.3365316 0.3272595 
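
One plausible reading of these weights (an assumption, not confirmed above) is that each model is weighted proportionally to the inverse of its validation RMSE; sketched below with illustrative RMSE values:

# Inverse-RMSE weighting: lower-error models get larger weights.
# The RMSE values here are illustrative, not the ones actually used.
rmse_vals <- c(lgb = 40.151, rf = 40.051, svm = 41.305)
weights <- (1 / rmse_vals) / sum(1 / rmse_vals)
weights
# The ensemble prediction is then the weighted sum of component predictions:
# weights["rf"] * rf_pred + weights["lgb"] * lgb_pred + weights["svm"] * svm_pred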

Call:
lm(formula = agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)

Residuals:
     Min       1Q   Median       3Q      Max 
-134.264  -20.966   -0.685   15.633  217.642 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -2.349e+00  5.795e-01  -4.054 5.05e-05 ***
rf_pred                    1.976e-01  8.567e-02   2.307  0.02109 *  
lgb_pred                   7.129e-01  8.905e-02   8.006 1.24e-15 ***
svm_pred                   1.945e-01  4.961e-02   3.922 8.82e-05 ***
rf_pred:lgb_pred          -5.341e-04  3.425e-04  -1.560  0.11888    
rf_pred:svm_pred           3.308e-03  6.158e-04   5.371 7.90e-08 ***
lgb_pred:svm_pred         -3.968e-03  6.189e-04  -6.412 1.47e-10 ***
rf_pred:lgb_pred:svm_pred  3.916e-06  1.285e-06   3.047  0.00231 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 39.84 on 22992 degrees of freedom
Multiple R-squared:  0.7472,    Adjusted R-squared:  0.7471 
F-statistic:  9709 on 7 and 22992 DF,  p-value: < 2.2e-16
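
Refitting this stacking regression and applying it to new component-model predictions is a single predict() call (a sketch; the prediction values below are illustrative):

# Refit the stacking regression shown above, then predict AGB from new
# component-model outputs (the numbers here are illustrative).
stack <- lm(agb_mgha ~ rf_pred * lgb_pred * svm_pred, data = pred_values)
predict(
  stack,
  newdata = data.frame(rf_pred = 100, lgb_pred = 95, svm_pred = 105)
)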

Coverages

n and p

Component Models

Random forest:

$num.trees
[1] 1000

$mtry
[1] 18

$min.node.size
[1] 7

$sample.fraction
[1] 0.2

$splitrule
[1] "variance"

$replace
[1] TRUE

$formula
agb_mgha ~ .
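
Those settings map directly onto a ranger call; a sketch, assuming training and testing are the data frames summarised earlier:

library(ranger)
# Fit the random forest with the tuned hyperparameters listed above.
rf_model <- ranger(
  agb_mgha ~ .,
  data = training,
  num.trees = 1000,
  mtry = 18,
  min.node.size = 7,
  sample.fraction = 0.2,
  splitrule = "variance",
  replace = TRUE
)
rf_pred <- predict(rf_model, data = testing)$predictions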

LGB:

$learning_rate
[1] 0.05

$nrounds
[1] 100

$num_leaves
[1] 5

$max_depth
[1] 2

$extra_trees
[1] TRUE

$min_data_in_leaf
[1] 10

$bagging_fraction
[1] 0.3

$bagging_freq
[1] 1

$feature_fraction
[1] 0.4

$min_data_in_bin
[1] 8

$lambda_l1
[1] 5

$lambda_l2
[1] 1

$force_col_wise
[1] TRUE
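
In the lightgbm R package, these settings go into the params list; a sketch, where the regression objective and the feature-matrix construction are assumptions:

library(lightgbm)
# Fit the GBM with the tuned hyperparameters; lightgbm() expects a numeric
# matrix of predictors, so split the response out of the training frame.
x_train <- as.matrix(training[setdiff(names(training), "agb_mgha")])
lgb_model <- lightgbm(
  data = x_train,
  label = training$agb_mgha,
  nrounds = 100,
  params = list(
    objective = "regression",  # assumed; not stated above
    learning_rate = 0.05,
    num_leaves = 5,
    max_depth = 2,
    extra_trees = TRUE,
    min_data_in_leaf = 10,
    bagging_fraction = 0.3,
    bagging_freq = 1,
    feature_fraction = 0.4,
    min_data_in_bin = 8,
    lambda_l1 = 5,
    lambda_l2 = 1,
    force_col_wise = TRUE
  )
)
x_test <- as.matrix(testing[setdiff(names(testing), "agb_mgha")])
lgb_pred <- predict(lgb_model, x_test)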

SVM:

$x
agb_mgha ~ .

$kernel
[1] "laplacedot"

$type
[1] "eps-svr"

$kpar
$kpar$sigma
[1] 0.0078125


$C
[1] 12

$epsilon
[1] 1.525879e-05
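
And the SVM, via kernlab's ksvm (same assumptions about training and testing):

library(kernlab)
# Fit the eps-regression SVM with a Laplacian kernel and the tuned
# hyperparameters listed above.
svm_model <- ksvm(
  agb_mgha ~ .,
  data = training,
  type = "eps-svr",
  kernel = "laplacedot",
  kpar = list(sigma = 0.0078125),
  C = 12,
  epsilon = 1.525879e-05
)
svm_pred <- predict(svm_model, testing)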

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Citation

For attribution, please cite this work as

Mahoney (2021, Feb. 3). CAFRI Labs: The Big Tune. Retrieved from https://cafri-labs.github.io/acceptable-growing-stock/posts/the-big-tune/

BibTeX citation

@misc{mahoney2021the,
  author = {Mahoney, Mike},
  title = {CAFRI Labs: The Big Tune},
  url = {https://cafri-labs.github.io/acceptable-growing-stock/posts/the-big-tune/},
  year = {2021}
}