Modeling aboveground biomass from geospatial and FIA field data with the Gradient Nearest Neighbor (GNN) method.
Gradient Nearest Neighbor (GNN) is an imputation method due to Janet L. Ohmann and Matthew J. Gregory[2]. The method combines canonical correspondence analysis (CCA) with k-nearest neighbor (kNN) imputation.
CCA generates ordination axes from two distinct sets of predictor variables, e.g. ‘species’ and ‘environmental’ variables. In GNN, the two sets are typically ‘geospatial data’, e.g. topography or Landsat imagery, and ‘field data’, e.g. species abundance.
Once the CCA is conducted, the kNN imputation can then be performed in the CCA (ordination) space as opposed to the original predictor variable space.
GNN imputation was originally applied to species composition mapping in coastal Oregon[2]. More recently, John J. Battles et al.[1] used GNN imputation to estimate aboveground biomass (AGB) in California at the regional level. Here, we attempt to replicate their approach for New York using similar predictor data.
For GNN-AGB 0.0.1, we used Canonical Correlation Analysis to generate the ordination axes, as opposed to Canonical Correspondence Analysis, which has historically been used for GNN.
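As a reminder of what canonical correlation analysis computes, here is a brief sketch in illustrative notation (the symbols \(X\), \(Y\), \(a\), \(b\) and \(\rho\) are our own shorthand, not taken from the cited works). Let \(X\) be the centered matrix of geospatial predictors and \(Y\) the centered species matrix, with one row per plot. The first pair of canonical weight vectors maximizes the correlation between linear combinations of the two sets:

\[
(a_1, b_1) = \arg\max_{a,\,b} \operatorname{corr}(Xa,\, Yb), \qquad \rho_1 = \operatorname{corr}(Xa_1,\, Yb_1)
\]

Each subsequent pair \((a_j, b_j)\) maximizes the same correlation subject to its canonical variates being uncorrelated with those of the earlier pairs. The resulting canonical variates play the role of the ordination axes.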
Geospatial data were obtained at the locations of 1,977 FIA plots in New York State. Geospatial predictors from Battles et al. (2018) and GNN-AGB 0.0.1 are listed below for comparison.
Category | Battles et al. (2018) | GNN-AGB 0.0.1 |
---|---|---|
Topography | ASPTR: Cosine transformation of aspect | ASPTR |
Topography | DEM: Elevation from a digital elevation map (m) | DEM |
Topography | PRR: Potential relative radiation (unitless) | |
Topography | SLPPCT: Slope (%) | SLOPE |
Topography | TPI450: Topographic position index | TWI: Topographic wetness index |
Climate | ANNPRE: Mean annual precipitation (ln[mm]) | PRECIP: 30-year normal precipitation (in) |
Climate | ANNTMP: Mean annual temperature (\(^\circ\)C) | |
Climate | AUGMAXT: Mean maximum temperature of August (\(^\circ\)C) | TMAX: Mean maximum annual temperature (\(^\circ\)C) |
Climate | DECMINT: Mean minimum temperature of December (\(^\circ\)C) | TMIN: Mean minimum annual temperature (\(^\circ\)C) |
Climate | SMRTP: Ratio of mean temperature (\(^\circ\)C) to precipitation (ln[mm]) of May-Sept. | |
Landsat | TC1: Brightness (i.e., axis 1 of the tassel cap transformation) | TCB |
Landsat | TC2: Greenness (i.e., axis 2 of the tassel cap transformation) | TCG |
Landsat | TC3: Wetness (i.e., axis 3 of the tassel cap transformation) | TCW |
Landsat | NBR: Normalized burn ratio (unitless) | NBR |
Change | \(\Delta\) TC1: Mean change in TC1 during previous 6 years | \(\Delta\) TCB: Mean change in TCB during previous year |
Change | \(\Delta\) TC2: Mean change in TC2 during previous 6 years | \(\Delta\) TCG: Mean change in TCG during previous year |
Change | \(\Delta\) TC3: Mean change in TC3 during previous 6 years | \(\Delta\) TCW: Mean change in TCW during previous year |
Change | \(\Delta\) NBR: Mean change in NBR during previous 6 years | \(\Delta\) NBR: Mean change in NBR during previous year |
Geology/Soils | ROCKDEPTH: Rock depth (cm) | |
Geology/Soils | BD_30: Bulk density of soils 0 cm - 30 cm (g cm\(^{-3}\)) | |
Geology/Soils | PERM_30: Permeability of soils 0 cm - 30 cm (m\(^2\)) | |
Geology/Soils | PH_30: Mean pH of soils 0 cm - 30 cm | |
Geology/Soils | RVOL_30: Rock volume of soils 0 cm - 30 cm (cm\(^3\)) | |
Location | COASTPROX: Distance to the Pacific Ocean (km) | |
Location | LAT: Latitude (\(^\circ\)) | |
Location | LON: Longitude (\(^\circ\)) | |
In both Battles et al. (2018) and GNN-AGB 0.0.1, tree species matrices were obtained from FIA data. These matrices constitute the ‘field data’ category of predictors.
Canonical Correlation Analysis was conducted in R with the CCA::cc() function. Of the sixteen axes it generated, we retained the first ten, each of which was significant under the Wilks’ Lambda test at \(\alpha = 0.05\).
CCA Axis | Wilks’ Lambda | p-value |
---|---|---|
1 | 0.0340950 | 0.0000000 |
2 | 0.0969175 | 0.0000000 |
3 | 0.2485243 | 0.0000000 |
4 | 0.3791973 | 0.0000000 |
5 | 0.4544100 | 0.0000000 |
6 | 0.5435356 | 0.0000000 |
7 | 0.6109630 | 0.0000000 |
8 | 0.6802785 | 0.0000000 |
9 | 0.7413935 | 0.0000019 |
10 | 0.7928830 | 0.0008046 |
11 | 0.8424386 | 0.0534617 |
12 | 0.8909325 | 0.5819887 |
13 | 0.9266623 | 0.9189674 |
14 | 0.9530076 | 0.9788676 |
15 | 0.9710587 | 0.9692588 |
16 | 0.9861286 | 0.9033297 |
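The retention rule can be sketched in a few lines of Python (the analysis itself was run in R; this is only an illustration). The sketch assumes the table reports the usual sequential Wilks’ Lambda, where the statistic for axis k is the product of \(1 - \rho_i^2\) over axes k through the last, and that axes are retained in order until the first non-significant test. The function names `sequential_wilks` and `count_retained` are hypothetical.

```python
from math import prod


def sequential_wilks(canonical_correlations):
    """Wilks' Lambda for axes k..s, for each k: product of (1 - rho_i^2), i >= k.

    Assumes the standard sequential form of the test; the input is a list of
    canonical correlations ordered from the first axis to the last.
    """
    terms = [1.0 - r * r for r in canonical_correlations]
    return [prod(terms[k:]) for k in range(len(terms))]


def count_retained(p_values, alpha=0.05):
    """Count leading axes whose sequential test is significant at level alpha."""
    retained = 0
    for p in p_values:
        if p >= alpha:
            break  # stop at the first non-significant axis
        retained += 1
    return retained


# p-values copied from the table above (axes 1 to 16).
P_VALUES = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0000019, 0.0008046,
            0.0534617, 0.5819887, 0.9189674, 0.9788676, 0.9692588, 0.9033297]

n_axes = count_retained(P_VALUES)  # number of leading axes with p < 0.05
```

Applied to the p-values in the table, `count_retained` stops at axis 11 (p = 0.0534617), which matches the ten axes retained.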
We evaluated the GNN model with a 30% holdout set. For each FIA plot in the holdout set, AGB was estimated by distance-weighted kNN imputation from the remaining 70% of plots, with neighbors found in the 10-dimensional ordination space derived from the CCA. We then compared the imputed AGB values against the AGB estimates derived from FIA field data for the same holdout plots. We repeated this process for each value of k from 1 to 100.
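The evaluation procedure can be illustrated with a minimal Python sketch. It is not the code used for GNN-AGB 0.0.1: the function names, the random seed, and the specific weighting (inverse Euclidean distance, with a small `eps` to avoid division by zero) are assumptions, since the text above specifies only that the imputation is distance-weighted.

```python
import math
import random


def split_holdout(n_plots, holdout_fraction=0.3, seed=0):
    """Shuffle plot indices and split them into (reference, holdout) lists."""
    indices = list(range(n_plots))
    random.Random(seed).shuffle(indices)
    n_holdout = round(n_plots * holdout_fraction)
    return indices[n_holdout:], indices[:n_holdout]


def knn_impute(target, ref_scores, ref_agb, k, eps=1e-12):
    """Inverse-distance-weighted mean AGB of the k nearest reference plots.

    target     : ordination scores of the plot to impute (sequence of floats)
    ref_scores : ordination scores of the reference plots
    ref_agb    : AGB values of the reference plots, in the same order
    """
    # Pair each reference plot's distance to the target with its AGB value,
    # then keep the k closest.
    nearest = sorted(
        (math.dist(target, score), agb) for score, agb in zip(ref_scores, ref_agb)
    )[:k]
    # Assumed weighting: 1 / distance (eps guards against a zero distance).
    weights = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * agb for w, (_, agb) in zip(weights, nearest)) / sum(weights)
```

For a sweep over k, one would call `knn_impute` for every holdout plot at each k from 1 to 100 and score the resulting estimates against the field-derived AGB values.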
Metric | Battles: k = 1 | Battles: k = 10 | GNN-AGB: k = 1 | GNN-AGB: k = 10 | GNN-AGB: k = 30 | GNN-AGB: k = 60 |
---|---|---|---|---|---|---|
NRMSE | 0.765 | 0.693 | 0.579 | 0.425 | 0.421 | 0.425 |
R\(^2\) | 0.461 | 0.557 | 0.100 | 0.249 | 0.277 | 0.292 |
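For readers who want to compute comparable metrics, the following Python sketch shows one common definition of each. Both definitions are assumptions: the text above does not state how NRMSE was normalized (here, by the mean observed value) or which form of R\(^2\) was used (here, the coefficient of determination). The function names are hypothetical.

```python
import math


def nrmse(observed, predicted):
    """RMSE divided by the mean observed value (one common normalization)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (sum(observed) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Other conventions exist, such as normalizing RMSE by the range of the observations or reporting the squared Pearson correlation as R\(^2\), so values are comparable across studies only when the same definitions are used.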
[1] Battles, J. et al. (2018). Innovations in measuring and managing forest carbon stocks in California. A Report for: California’s Fourth Climate Change Assessment, 99.
[2] Ohmann, J. L., & Gregory, M. J. (2002). Predictive mapping of forest composition and structure with direct gradient analysis and nearest-neighbor imputation in coastal Oregon, USA. Canadian Journal of Forest Research, 32(4), 725-741.
For attribution, please cite this work as
Gordon (2022, Aug. 31). CAFRI Labs: GNN-AGB 0.0.1. Retrieved from https://cafri-labs.github.io/acceptable-growing-stock/posts/gnn-agb-001/
BibTeX citation
@misc{gordon2022gnn-agb, author = {Gordon, Sam}, title = {CAFRI Labs: GNN-AGB 0.0.1}, url = {https://cafri-labs.github.io/acceptable-growing-stock/posts/gnn-agb-001/}, year = {2022} }