Tree height prediction in Brazilian Khaya ivorensis stands Predicción de la altura en plantaciones brasileñas de Khaya ivorensis

Tree height measurement is one of the most difficult activities in forest inventory data gathering, although it is a fundamental variable to support forest management, since it is an input for modelling growth and yield. To overcome this obstacle and ensure that the heights of trees are estimated accurately, hypsometric relationships are used. Therefore, the objective of this study was to compare different fitting strategies (i.e. nonlinear least squares and mixed-effects) to predict tree height in African mahogany Brazilian plantations using well know local (using only tree height and diameter) and generalized (using height, diameter and plot level variables) models. Data were gathered on 149 permanent plots sampled in different Brazilian regions and ages, totaling 4,201 height-diameter pairs. Different models were evaluated and the best method to estimate the height-diameter relationship was based on statistical and graphical criteria. A local model using mixed-effects with correction of heteroscedasticity was efficient and superior to other models evaluated. However, when using an independent data base, the generalized model fitted by nonlinear least squares generates adequate results that are scaled to the plots’ productivity, since the inclusion of dominant height into the model helps to predict height locally.


INTRODUCTION
Tree height measurement is one of the most difficult, time consuming and expensive activities in forest inventories data gathering (Ribeiro et al. 2010, Sharma andBreidenbach 2015); although it is a fundamental variable to support forest management since growth and yield models rely on accurate estimates of total tree height (Sharma 2016).In many situations, foresters save time and effort by measuring just a few trees inside the plot and predicting the other tree heights using a mathematical equation, highlighting the importance and widespread application of these models in forestry (Salas et al. 2010).
The height-diameter relationship (also known as hypsometric relationship) can be classified as local and regional/generalized (Trincado andLeal 2006, Paulo et al. 2011).Local models predict height variation using only one variable, commonly the diameter and are, thus, fitted by plot or stand.Regional models add, besides diameter, other stand variables that help explain height variation (e.g.dominant height, basal area, age) or even inclusion of random effects models and dummy variables for ecoregion (Temesgen et al. 2014).Generalized models are fitted with larger databases than the ones used for local models and consequently are able to predict heights in diverse stand conditions.Huang et al. (2009), Robinson and Hamann (2011), and Zang et al. (2016), among others, report that it is very common for data from natural resources to violate basic statistical assumptions for application of regression analysis via ordinary least squares method (OLS).This is because biological data may have temporal, spatial and hierarchical structure with dependence between observations, leading to biased estimate of the parameters' estimated variance.In mixed models, random effects are introduced in the model coefficients at different levels, such as region, site, plot and tree (Ou et al. 2015, Zang et al. 2016) to represent such dependences.Thus, studies using mixed-effects modeling have shown significant gains to predict tree height (Calama and Montero 2004, Shawn et al. 2009, Paulo et al. 2011).
African mahogany (Khaya ivorensis A. Chev.) cultivation is recent outside the species' countries of origin.In Brazil, the activity began in 1976 in the state of Pará and since then its cultivation has been an alternative to the native mahogany (Swietenia macrophylla G. King.) exploitation, since it is in threat of extinction (França et al. 2016) and restricted in homogeneous plantations due to attacks by the Meliaceae shoot borer Hypsipyla grandella (Zeller) (Lepidoptera: Pyralidae).
Due to the recent domestication of the species, few studies are conducted related to tree height prediction, es-pecially in Brazil (Silva et al. 2016, Ribeiro et al. 2017).Therefore, the objective of this paper is to compare different fitting strategies to predict tree height in African mahogany Brazilian plantations using well know local and regional models fitted by: i) nonlinear least squares; ii) mixed-effects and iii) mixed-effects with correction of heteroscedasticity modelled by power-variance function.As a hypothesis, we expect that the modelling approach that most details the estimated values residual errors (i.e.mixed-effects with correction of heteroscedasticity modelled by a power-variance function) will yield the best results; also, we believe that regional models will provide a simple means for users to apply the models generated here to their own databases.

METHODS
The data from the African mahogany plantations used in this research had similar forest management and genetic bases (Ribeiro et al. 2017); with age ranged from 2 to 14 years old and at least two re-measurements (table 1).The stands have most of soils classes belonging to latossoil (Ribeiro 2017).In the state of Minas Gerais, four distinct types of climate predominate: Cwb, Cwa, Aw and BSw, according to Köppen climatic classification, with annual precipitations between 750 mm and 1,800 mm.In the state of Goias the climate is tropical with dry season in the winter (Aw) and in the state of Pará the climate is tropical humid or superhumid (Af) or subhumid (Am) (Embrapa 2016).
N N 0 500 1000 1500 2000 km Dominant height (hdom) was defined as being the mean height of the 30 thickest trees per hectare (Ribeiro et al. 2016), and not of the 100 thickest trees per hectare, as usual, due to the low density of the plantations in Brazil (spacing used at most plantations is 6 x 6 m, and 12 x 12 m for the oldest).Similar methodology was adopted by Paulo et al. (2011) for Quercus suber studies in Portugal and Danquah (2012) for African mahogany species planted in Ghana, where they selected 25 and 40 thickest trees per hectare, respectively, to determinate mean dominant height.The same procedure was done to determine dominant diameter (ddom), defined as being the mean diameter of the 30 thickest trees per hectare.We did not consider the effect of plot size and tree spatial distribution on dominant height and diameter estimation, though we expect any possible bias to arise from this to be neglectable.García and Batho (2005) reported mean bias values of 42 cm, given that the stands are homogeneous, and this effect is expected to be more important in more variable stands (García 1998).
A graphic of analysis of the data was performed for detection and exclusion of extreme observations, attributed to measurement errors, trees that were dead, damaged and presenting a broken top or trunk.A summary of the descriptive statistics of the data set used in this study is presented in table 2.
Model fitting and selection.In general terms, the regression analysis aims at representing the distribution of a response variable (Y) subject to values of a predictor variable of known values (X), f(Y|X 1 , …, X i ), as shown in [1].[1] where: Y t represents the response variable, X t represents the predictor variable, ε t represents the error term at the tth data point, and θ is composed by a vector of the model's parameters.
Several models are used to represent the relationship between height and diameter in forest data and studies have emphasized the superiority of non-linear models (Huang et al. 2009, Mehtätalo et al. 2015).The mathematical expressions tested in this study are presented in table 3. A 1.3 constant was used to avoid the prediction of a height inferior to 1.3 meters when d is small.(n = 4,201), where SD = standard deviation.

Model
In table 3, models 1 to 8 are considered local, in which the height variation is estimated only based in the tree's diameter.Models 1 to 8 are traditionally used to describe the height-diameter relationship and were obtained from Mehtätalo et al. (2015).Models 9 to 15 are considered regional and insert additional stand variables to describe height variation.Models 9 to 15 were obtained from Scolforo (1990).
We compared local and regional models separately in this work.For each group of models (local and regional), a three-step fitting strategy was followed.The first step consisted of model fitting without specifying any random effects, fitting a basic model by nonlinear least squares (NLS) techniques.All statistical inferences were made using the program R (R Core Team 2016) with the nls function performing a nonlinear regression analysis via Gauss Newton algorithm.
The second step (NLME) involved inclusion of random effects in the coefficients of the best models chosen in step 1, initially inserting random effects in all the coefficients of the models, as suggested by Pinheiro and Bates (2000), using the nlme package (Pinheiro et al. 2016)  Coefficient estimation was based on the maximum likelihood and comparison of nested models tests were made based on the likelihood ratio (random part) and conditional F tests (fixed part).When mixed models are used, the goal is to predict values for Y from a continuous predictor variable X and add a categorical variable for each stipulated group.Following Calama and Monteiro (2004), Sharma and Parton (2007) and Pinheiro and Bates (2000), a general expression for a nonlinear mixed-effects model can be defined as [2].[2] where: Y ij is the jth observation (tree) of the response variable taken from the ith sampling unit (plot), X ij is the jth measurement from the predictor variable at the ith plot, θ i is a parameter vector, specific for each sampling unit, f is a nonlinear function of the predictor variables and the parameter vector, and ε ij is the residual error term.In vector form, this mixed-effects model can be expressed as [3].[3] where: Y i is the observation vector and X i is the predictor vector for the ith plot, ε i is a vector of the residuals, β is a vector of fixed population parameters, b i is the vector of random-effects associated with the ith plot and A i and Z i are design matrices for the fixed-and random-effects specific to each plot, respectively.
We used as a random effect the combination of the local, plot and measurement occasion, totaling 380 groups.More detailed statistical notation and explanation of mixed modeling process can be found in Pinheiro and Bates (2000), Robinson and Hamann (2011) and Mehtätalo et al. (2015).
The third step (WNLME) was made when we verified violations of assumption of constant variance (homoscedasticity) in steps 1 and 2. If the plot of the standardized residuals versus the fitted values showed that the error variance was heterogeneous, a variance power function (Var (ε)= σ 2 (d) 2δ ) was used to model the variance error structure within groups using covariates with an exponential parameter delta (δ) estimated by iterative processes.We chose initial δ value = 0.5, which implies a linear relationship between the variance and tree diameter (Mehtätalo et al. 2015).A similar procedure was performed by Paulo et al. (2011) and Mehtätalo et al. (2015) to fit height-diameter models.
The models were chosen according to the goodness-offit, predictive ability, biological sense (e.g.positive height values, larger heights for larger diameter trees) and fulfillment of the error assumptions (homoscedasticity, lack of autocorrelation and normality of residuals).The goodnessof-fit of the functions was analyzed through the root mean squared error (RMSE,[4]) and value of the Akaike information criterion (AIC). [4] Where Y = observed response variable, Ŷ = estimated response variable and n = number of observations .

RESULTS
Local models.Statistical criteria and visual plot analysis of residuals versus fitted values for each fitted model (table 4) showed that the models 1, 4 and 7 had the best results considering the local models fitted by nonlinear regression using least squares method (NLS).

𝛽𝛽 ̂ 𝛽𝛽 ̂ 𝛽𝛽 ̂
The goodness-of-fit criteria for all equations were similar (table 4), with a slight superiority for model 7, followed by models 4, 8 and 1. Model 8 presented more coefficients than those shown by the others and was discarded in favor of a more parsimonious model.The residual plot for model 4 was biased, overestimating the predicted heights below 5 meters and all models showed trends of non-normality for higher values of prediction (figure 2).
The distribution of residuals for models 1 and 7 was similar, as was their goodness-of-fit, being model 1 chosen for the other two fitting strategies, since it has less parameters, and presented better fit for the higher height values (larger than 25 m).Proceeding to the second step of the fitting process, model 1 was fitted as a mixed-effects model with random effect inserted in all coefficients [5].
The coefficients estimated for model 1 with the NLME and WNLME fitting strategies, the variance estimates for the random effects in the mixed model and the statistical criteria are presented in table 5.
The residual plots (figure 3) show a tendency of heterogeneity of the variance for the NLME method with inclusion of random effects on the parameters, and this was corrected when using a power type variance function (WNLME) into the regression.Normality was not guaranteed for the extreme values of height prediction for both methodologies (figure 3).

Generalized models.
As for the local model fitting, generalized models were also first fit using the NLS method without hierarchy, resulting in the following best models: 10, 11 and 13 (table 6) and residuals plot shown in figure 4.
We chose the more parsimonious model (model 10) for fitting of the two other fitting strategies.The inclusion of the random effects in the model coefficient presented the best results, resulting in the following final model [6]: [6] The coefficients estimated for model 10 with the NLME and WNLME fitting strategies, the variance parameters for the random effects in the mixed model and the statistical criteria are presented in table 7.
When fitting [6] with the inclusion of random effects, a minor improvement in the statistics used as selection criteria was observed, although the residual distribution presented similarity to the model without hierarchy (figure 4), with a slight bias for height prediction for trees under 5 meters (figure 5).Valores residuales versus valores ajustados y Q-Q normal gráfico para los mejores modelos generalizados ajustados.
Table 7.Estimated coefficients and random effects, variance components for random effects for model 10, where SE = standard error and SD = standard deviation.

DISCUSSION
The main objective of the present study was to develop equations that adequately predicted height for African mahogany stands in Brazil, respecting statistical assumptions and parsimony.While some studies have reported growth parameters and wood quality for Khaya ivorensis plantations considering limited stand variations (e.g.Silva et al. 2016, Ribeiro et al. 2016), to our knowledge, this study is the first to provide height-diameter models for a large scale database in Brazil and beyond.Care must be taken when applying the models outside the sampled database range (for other parts of the world or for ages over 14 years), especially considering the peculiarities of Brazilian African mahogany silviculture (intensive management practices and wide spacing).
It is expected that a model including stand variables (i.e.generalized models) provide a better predictive equation for height, as noted by Trincado and Leal (2006).It was clear in this work that when the models were fitted by NLS method, a predictive improvement of a half meter error comparing local model 1 with generalized model 10 occurred, besides the lowest AIC value for the last equation.
Mixed-effects modeling is one alternative to deal with correlated observations, in which the variability between the sampling units can be explained by including random effects, which are estimated at the same time as the model coefficients (Calama and Montero 2004).Sharma and Parton, (2007) developing a h-d equation for species of boreal forest in Canada, found that the inclusion of random effects on the selected model coefficients resulted in a lower value of AIC and an improvement in the models' predictive ability.Temesgen et al. (2014), fitting the ratio of height and diameter for various species in China, used the mixed-effects method to correct the hierarchical data structure and generate a robust predictive equation.Our results confirmed this trend, where the mixed-effects models provided better results compared to the NLS techniques.
For all selected models non-normality for extreme values occurred.Zang et al. (2016) affirmed that since the height-diameter relationship is influenced by numerous factors, it may be difficult to model using normal parametric models and limitation of least-squares methods (e.g.normally distributed errors) may present problems, especially in the case of generalized h-d equations.Crecente-Campo et al. (2010) suggest the use of weighting factors to balance error variance, to account for non-normality and to take into account unequal selection probabilities.Although the impact of the weighting procedure was minimal in their work, the parameter estimates and approximate standard errors showed the same magnitude, the goodness-of-fit statistics was also similar, with slightly better values for the model fitted using unequal selection probabilities.
For the selected generalized model (model 10), the inclusion of a random effect did not result in explicit improvement of residual distribution (figure 5), with slight improvement on statistics values (table 7) compared with the NLS method.That was expected since the inclusion of a stand variable into the model works as a plot level control, improving the predictions in local scale.The small effect of the random component for the generalized model was confirmed by its low value of standard deviation, 0.04 m, compared with a residual standard deviation of 1.14 m for the NLME fit (table 7).The random component's standard deviation values were much more expressive for the local models, reaching 0.61 m, compared with the residual standard deviation of 1.14 m for the NLME fit (table 5).However, when heteroscedasticity was corrected, residuals were less biased and the values of the statistics were higher than those for the other fitting strategies.The relationship defined between the standardized residuals and the tree height estimates did not suggest the presence of heteroscedasticity associated with the error term for WNLME approach in the local and generalized model selected, although non-normality still existed.Calegario et al. (2005), estimating the height growth of clonal Eucalyptus trees, obtained significant gains when modeling the heterogeneity of variance, where the distribution of residues was significantly improved.We also arrived at the same conclusion when we applied the variance power function on the selected models.
In the present study, the gain in the use of a generalized model (using dominant height) compared with local models including random effects on the parameters was not very significant.It is known that the dominant height is a variable that reflects local productivity, being correlated with the total height of the trees; hence, the inclusion of the same in hypsometric designs results in improvement of height predictions.
We plotted the different height-diameter selected models in varying silvicultural scenarios (one older plot with large spacing and one younger plot with dense spacing) to illustrate the models' predictive ability, and also to highlight the errors of selecting equations from literature without any calibration (figure 6).In figure 6 we see that the most general model from our study (Model 1_NLS) is able to adequately describe the mean behavior of the height-diameter relationship, nonetheless it is not able to distinguish between sites showing different productivity.Considering the strategies that take into account the plots' productivity (Model 10 (gen.)NLS and Model 1 (local) WNLME), we see that they are able to better describe particular differences in the height-diameter relationship in different plots (represented as black dots in figure 6), with superiority for the mixed-effects approach.
Thus, the authors suggest that when a large-scale database is available to fit height-diameter models for Khaya ivorensis, model 1 using random location parameters and correction of the heteroscedasticity of the residuals should Diferentes predicciones de altura para el conjunto de datos de caoba africano, donde los puntos grises muestran los datos originales y los puntos oscuros muestran las alturas observadas para dos parcelas.
be the preferred method to estimate the height of unsampled trees.However, model users that do not have these data can use the generalized model (model 10), since inclusion of the dominant height into the model helps to predict height locally.For Salas et al. (2016), in general, models are good for the purpose they were built for and it is very difficult to find a model that works well for all purposes.
Finally, to illustrate the need for calibration of the height-diameter relationship and to adhere to the range of the sampled database, we also plotted a model from Silva et al. (2016) in figure 6.These authors presented a local equation fitted with NLS method to predict height in 4-year-old African mahogany stands geographically close to some of the stands used in this study.Here we see that while Silva's model predicted the height variation reasonably well for trees with diameters with 10 to 20 cm, it provided large underestimation outside this range, especially for larger trees (d > 40 cm).

CONCLUSIONS
We concluded that the modelling approach that most details the estimated height values residual errors (mixed-effects considering each plot measurement occasion as the random effect and with correction of heteroscedasticity modelled by power-variance function) yielded the best results.However, given the fact that users will not always have a large database at their disposal, the generalized model 10 fitted using the plot dominant height can be applied to successfully estimate tree height of other Khaya ivorensis stands.

Figure 1 .
Figure 1.Stands location (represented by a star, A) and scatterplot between total tree height (h) and diameter at breast height (d) (n = 4,201, B).

Figure 2 .
Figure 2. Residual versus fitted values and normal Q-Q plot for the best local fitted models.Valores residuales versus valores ajustados y Q-Q normal gráfico para los mejores modelos locales ajustados. ℎ

Figure 4 .
Figure 4. Residual versus fitted values and normal Q-Q plot for the best generalized models fitted.

Figure 5 .
Figure 5. Graphical relationship between the standardized residuals and fitted values for model 10 with different fitting strategies.

Figure 6 .
Figure 6.Different height predictions for African mahogany dataset, where the gray dots show the original data and the dark dots show the observed heights for two sample plots.

Table 1 .
Characterization of the data set.

Table 2 .
Summary of statistics for the African mahogany plantations hypsometric data set

Table 3 .
Functions applied for height estimation*.
of R.

Table 4 .
Coefficients and goodness-of-fit of the local height diameter functions using least squares method (NLS)*.Coeficientes y bondades de ajuste de las funciones locales de altura y diámetro utilizando el método de los mínimos cuadrados (NLS)*.
*All estimated parameters are significant at P level < 0.05.

Table 5 .
Estimated model coefficients and random effects, variance components for the random effects for Model 1, where SE = standard error and SD = standard deviation.Estimativa de los coeficientes de lo modelo, efectos aleatorios, componentes de varianza para los efectos aleatorios para el modelo 1, donde SE = error estándar y SD = desviación estándar.
Figure 3. Residual versus fitted values and normal Q-Q plot for model 1 with different fitting strategies.Valores residuales versus valores ajustados y Q-Q normal gráfico para el modelo 1 con diferentes estrategias de ajuste.

Table 6 .
Coefficients and goodness-of-fit of the generalized height diameter functions*.Coeficientes y bondades de ajuste de las funciones generalizadas de altura y diámetro*.ns = estimated parameters not significant at P level > 0.05. *