First draft of graphics and tables for shrubland paper

A living document with set pieces for the shrubland paper.

Mike Mahoney
2022-01-23

Figures

NY Population

Figure 1: Population per square kilometer for each census block group in New York State, as reported in the 2019 5-year American Community Survey.

Probability by height

Figure 2: Predicted probability of shrubland, from the logistic ensemble model, as a function of pixel height. Each point represents one of 1,000,000 pixels randomly sampled from all LiDAR coverage areas. The shaded area between the vertical lines marks the 1 to 5 meter height range used to define “shrubland” for this study. The trendline is a generalized additive model fit using penalized cubic regression splines.

Probability by class

Figure 3: Smoothed kernel density estimates of the predicted probability of shrubland for shrubland and non-shrubland pixels, calculated from a random sample of 1,000,000 pixels drawn from the LiDAR patchwork prediction surface of the logistic ensemble model. Vertical lines mark the four probability thresholds used to classify pixels. Colors indicate the true class of each pixel.

LiDAR Shrub Map

Figure 4: Shrubland areas identified within each available LiDAR coverage. Shrubland was defined at 1 meter resolution as any area within a vegetated LCPRI land cover class, below 1067 meters elevation, and with a LiDAR-derived height between 1 and 5 meters. The 30 meter pixels used for analysis and modeling were then defined as shrubland if more than 50% of the 1 meter pixels they contain were classified as shrubland. In total, approximately 2.5% of 30 meter pixels were classified as shrubland.
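
The two-step definition in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual pipeline: the function names, the toy 60 x 60 arrays, and the choice to treat the 1 and 5 meter bounds as inclusive are all assumptions made for the example.

```python
import numpy as np

def shrub_mask_1m(height_m, elevation_m, vegetated):
    """1 m shrubland mask: vegetated, below 1067 m elevation, height of 1 to 5 m.

    Treating the height bounds as inclusive is an assumption of this sketch.
    """
    return vegetated & (elevation_m < 1067) & (height_m >= 1) & (height_m <= 5)

def aggregate_to_30m(mask_1m, factor=30):
    """Mark a coarse pixel as shrubland when more than 50% of its 1 m pixels are."""
    rows, cols = mask_1m.shape
    # Drop partial blocks at the edges so the array tiles evenly.
    rows -= rows % factor
    cols -= cols % factor
    blocks = mask_1m[:rows, :cols].reshape(rows // factor, factor, cols // factor, factor)
    fraction = blocks.mean(axis=(1, 3))  # share of shrubland 1 m pixels per block
    return fraction > 0.5

# Toy inputs (hypothetical): a 60 x 60 m tile, i.e. a 2 x 2 grid of 30 m pixels.
height = np.full((60, 60), 0.5)
height[:30, :] = 3.0            # top half: shrub-height vegetation
height[30:45, 30:] = 2.0        # bottom-right block: exactly half shrub height
elevation = np.full((60, 60), 500.0)
elevation[:30, 30:] = 1200.0    # top-right block: above the elevation cutoff
vegetated = np.ones((60, 60), dtype=bool)

coarse = aggregate_to_30m(shrub_mask_1m(height, elevation, vegetated))
print(coarse)
```

Only the top-left 30 meter pixel comes out as shrubland here. The bottom-right block sits at exactly 50% cover, which does not satisfy the "more than 50%" rule.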

LiDAR probabilities (modeled)

Figure 5: Predicted probability of shrubland within the boundaries of all LiDAR coverages used, from the logistic ensemble model. Predictions were made using data reflecting the same year as LiDAR acquisition; the map therefore represents a temporal patchwork of predictions. Pixels in non-vegetated LCPRI land cover classes (developed, water, ice/snow, and barren) or above 1067 meters in elevation were not mapped and are shown in white.

LiDAR Classifications

Figure 6: Predicted shrubland locations within each LiDAR coverage, from the logistic ensemble model. Predicted pixel probabilities were classified using either the Youden-optimal threshold (which maximizes the sum of sensitivity and specificity) or a threshold chosen to target a certain level of specificity, with all thresholds derived from the validation data set. Predictions were made using data reflecting the same year as LiDAR acquisition; the map therefore represents a temporal patchwork of predictions. Pixels in non-vegetated LCPRI land cover classes (developed, water, ice/snow, and barren) or above 1067 meters in elevation were not mapped and are shown in white.
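
As a rough sketch of how thresholds like these might be derived from a validation set: the Youden J statistic is sensitivity + specificity - 1, and a specificity-targeted threshold is the lowest cutoff whose specificity reaches the target. Everything below is assumed for illustration, not taken from the project's code: the helper names, the candidate grid, and the synthetic beta-distributed scores standing in for validation predictions.

```python
import numpy as np

def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called shrubland."""
    predicted = scores >= threshold
    positive = labels == 1
    sensitivity = (predicted & positive).sum() / positive.sum()
    specificity = (~predicted & ~positive).sum() / (~positive).sum()
    return sensitivity, specificity

def youden_threshold(scores, labels, candidates):
    """Candidate threshold that maximizes J = sensitivity + specificity - 1."""
    j = [sum(sens_spec(scores, labels, t)) - 1 for t in candidates]
    return candidates[int(np.argmax(j))]

def specificity_threshold(scores, labels, candidates, target):
    """Lowest candidate threshold whose specificity is at least the target."""
    for t in np.sort(candidates):
        if sens_spec(scores, labels, t)[1] >= target:
            return t
    return None  # no candidate reaches the target

# Synthetic stand-in for a validation set (hypothetical data).
rng = np.random.default_rng(42)
labels = np.repeat([0, 1], 5000)
scores = np.concatenate([rng.beta(2, 5, 5000), rng.beta(5, 2, 5000)])
candidates = np.linspace(0, 1, 1001)

t_youden = youden_threshold(scores, labels, candidates)
t_spec95 = specificity_threshold(scores, labels, candidates, 0.95)
print(t_youden, t_spec95)
```

Because specificity can only rise as the cutoff rises, the first candidate that meets the target is also the one that gives up the least sensitivity.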

2019 Probability Map

Figure 7: Predicted probability of shrubland for 2019 across all mapped areas within New York State, from the logistic ensemble model. Pixels in non-vegetated LCPRI land cover classes (developed, water, ice/snow, and barren) or above 1067 meters in elevation were not mapped and are shown in white.

2019 Classifications

Figure 8: Predicted shrubland locations across the state for 2019, from the logistic ensemble model. Predicted pixel probabilities were classified using either the Youden-optimal threshold (which maximizes the sum of sensitivity and specificity) or a threshold chosen to target a certain level of specificity, with all thresholds derived from the validation data set. Pixels in non-vegetated LCPRI land cover classes (developed, water, ice/snow, and barren) or above 1067 meters in elevation were not mapped and are shown in white.

Coverages Map

Figure 9: Boundaries for all LiDAR coverages used in this project, colored by year of data acquisition. More information about each coverage is included as Supplementary Materials S1.

Tables

This table looks better in the PDF, I swear.

Table 1: Model accuracy metrics for logistic ensemble model with predictions classified using various thresholds, calculated using both the balanced test set and the LiDAR patchwork surface. AUC for the LiDAR patchwork was calculated using a random sample of 1,000,000 pixels, while all other metrics used all predicted pixels. Thresholds were selected using a separate validation set, using values chosen to maximize the Youden J statistic (“Youden optimal”) or to target a certain minimum specificity (“% specificity”).
                    Threshold  Sensitivity  Specificity  Precision  F1
Test set (AUC: 0.893)
  Youden optimal    0.489      0.842        0.780        0.791      0.816
  90% specificity   0.755      0.659        0.900        0.867      0.807
  95% specificity   0.840      0.496        0.949        0.906      0.641
  99% specificity   0.907      0.218        0.989        0.952      0.355
LiDAR patchwork (AUC: 0.904)
  Youden optimal    0.489      0.858        0.783        0.094      0.169
  90% specificity   0.755      0.689        0.896        0.149      0.245
  95% specificity   0.840      0.514        0.951        0.219      0.307
  99% specificity   0.907      0.247        0.989        0.376      0.298
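
Sensitivity and specificity are similar in the two halves of the table, yet precision is far lower on the LiDAR patchwork. A short sketch suggests class prevalence accounts for most of that gap. The prevalences used here are assumptions: 0.5 for the "balanced" test set, and the approximate 2.5% shrubland share quoted in the Figure 4 caption for the patchwork. The results therefore only roughly match the table.

```python
def precision_from_rates(sensitivity, specificity, prevalence):
    """Precision implied by sensitivity, specificity, and class prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def f1_score(precision, sensitivity):
    """F1 is the harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)

# Youden-optimal rows from Table 1; prevalences are assumed, as noted above.
balanced = precision_from_rates(0.842, 0.780, prevalence=0.5)     # about 0.79 (table: 0.791)
patchwork = precision_from_rates(0.858, 0.783, prevalence=0.025)  # about 0.09 (table: 0.094)

# F1 recomputed from the patchwork row's reported precision and sensitivity.
patchwork_f1 = f1_score(0.094, 0.858)  # about 0.169, as in the table

print(round(balanced, 3), round(patchwork, 3), round(patchwork_f1, 3))
```

With shrubland this rare, even a specificity near 0.78 leaves false positives outnumbering true positives by roughly ten to one, which is why precision improves so sharply under the stricter specificity-targeted thresholds.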

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Citation

For attribution, please cite this work as

Mahoney (2022, Jan. 23). CAFRI Labs: First draft of graphics and tables for shrubland paper. Retrieved from https://cafri-labs.github.io/acceptable-growing-stock/posts/first-draft-of-graphics-and-tables-for-shrubland-paper/

BibTeX citation

@misc{mahoney2022first,
  author = {Mahoney, Mike},
  title = {CAFRI Labs: First draft of graphics and tables for shrubland paper},
  url = {https://cafri-labs.github.io/acceptable-growing-stock/posts/first-draft-of-graphics-and-tables-for-shrubland-paper/},
  year = {2022}
}