Height to Crown Base Modelling for the Main Tree Species in an Even-Aged Pedunculate Oak Forest : A Case Study from Central Croatia

(1) föra forest technologies SLL, Campus Duques de Soria s/n, ES-42004 Soria, Spain; (2) Universidade de Santiago de Compostela, Departamento de Enxeñaría Agroforestal, Biodiversity-LaboraTe-IBADER, ES-27001 Lugo, Spain; (3) Croatian Forest Research Institute, Division for Forest Management and Forestry Economics, Trnjanska cesta 35, HR-10000 Zagreb, Croatia; (4) Sustainable Forest Management Research Institute, University of Valladolid-INIA, Campus Duques de Soria s/n, ES-42004 Soria, Spain; (5) Croatian Forest Research Institute, Research Centre for Urban and Private Forests, Vilka Novaka 50c, HR-42000 Varaždin, Croatia


INTRODUCTION
Tree crowns are involved in many key physiological processes, such as photosynthesis (Ma et al. 2017, Liu et al. 2019, respiration (Plain et al. 2009, Drake et al. 2016, and transpiration (Köstner et al. 1992, Gupta et al. 2018, and they also influence soil water storage (Meinzer et al. 1999, Zhang et al. 2007) and the interception of radiation (Stenberg et al. 1994, Chen et al. 2017, being the most important part of the tree.
The crown structure also affects the biomass productivity (Valentine et al. 1994, Russell andWeiskittel 2011), stem form, and wood quality (Maguire et al. 1991, Hein et al. 2008) and individual tree competition status (Kershaw et al. 1990, Cameron et al. 2020. The crown structure includes decisive parameters to assess the crown fuel characteristics that influence potential crown fire behaviour (Cruz et al. 2003, Ruiz-González and Álvarez-González 2011, Jiménez et al. 2013.
Therefore, these metrics are critically important for informing forest management (Li et al. 2020). However, crown measurements require an appropriate abstraction of crown size and shape; however, direct measurement of the crown shape or profile is difficult (Cameron et al. 2 SEEFOR 12(1): early view 2020). In addition, the labour-intensive and time-and money-consuming endeavour of measuring descriptive crown parameters hinders the wide application to forest management (Hasenauer and Monserud 1996, Fu et al. 2017, Luo et al. 2018) particularly in mixed stands.
This has motivated many studies to focus their efforts on modelling crown parameters, such as tree height to crown base (hcb). hcb is an important variable often included as a predictor in various forest models that serve as the fundamental tools for decision-making in forestry ). hcb reports important information to assess the social position of the trees and the competition they suffer, closely related to the crown size and may affect the tree vigour (Hasenauer andMonserud 1996, Fu et al. 2013). Although traditional field-based methods can obtain reliable and accurate hcb measurements (Dean et al. 2009, S. Luo et al. 2016, it is rarely measured in the field and it is often predicted using static models (Rijal et al. 2012).
hcb models for many tree species have been developed from data at the tree and plot level using: (1) basic models methods (Rijal et al. 2012, Fu et al. 2017, Jia and Chen 2019, Pan et al. 2020) to obtain high precision models with low field inventory cost, and (2) mixed-effects models (Fu et al. 2017, Jia and Chen 2019, Pan et al. 2020, which have been widely used in recent years as they show an increase prediction accuracy to an extent (Jia andChen 2019, Pan et al. 2020).
The variability of hcb has been evaluated through three groups of variables: (1) tree size metrics referred to as treelevel variables, (2) variables of the size and vigour of the plot or stand denominated as plot-level variables, and (3) variables of the site quality and distance-independent competition indexes. For instance, the dominant height is usually used to describe the site quality and the stand development age, since it is strongly correlated with the growth condition of the dominant trees (Fu et al. 2017, Li et al. 2020, Pan et al. 2020, and the total basal area of all trees with a diameter larger than the target tree is usually used to assess competition effects (Fu et al. 2017). These variables have been used combined or separately in many studies (Rijal et al. 2012, Fu et al. 2017, Pan et al. 2020. In this work, hcb is defined as the height from the ground to the base of the first living branch that is a part of the crown, ranging from 0 to the height of the tree (Rijal et al. 2012, Fu et al. 2017). This must be differentiated from the concept of the Canopy Base Height, a parameter used in the study of forest fires, which represents the mean vertical distance, in meters, between the ground and the beginning of the living canopy at which there is enough fuel to spread the fire vertically through the canopy (Scott and Reinhardt 2001).
Although hcb is often assessed in research or monitoring (e.g., International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects on Forests (ICP Forests)) projects, in general, hcb modelling was not a research focus in Croatia. Such models were often not validated. The objectives of this study were (i) to develop species-specific hcb logistic models for the main tree species in the mixed, lowland pedunculate oak (Quercus robur L.) forests in Central Croatia; and (ii) to study which variables best explain the variability of hcb for each species and how each group of variables (tree-, plot-, and competition-distance index variables) affects the accuracy of the models. Mixed pedunculate oak forests are among the most important forest ecosystems in Croatia with exceptional ecological, economical, and sociological significance (Novotny et al. 2013). Pedunculate oak is also a key tree species of European forests (Čater 2015). Despite these facts and to the best of authors knowledge, there were no previous studies dealing with hcb modelling in pedunculate oak forest ecosystems.

Study Area
The study area is located in the continental part of Croatia, near Jastrebarsko and 35 km southwest of Zagreb, within the Pokupsko Basin lowland forest complex (≈12,000 ha, Figure 1). The forests of the study area consist of evenaged pedunculate oak stands mixed with other tree species: common hornbeam (Carpinus betulus L.), black alder (Alnus glutinosa (L.) Gaertn.), and narrow-leaved ash (Fraxinus angustifolia Vahl.). Other tree species that sporadically occur are Ulmus laevis Pall., Tilia sp., Populus sp., Acer campestre L., Betula pendula Roth., Pyrus pyraster (L.) Burgsd., etc. These stands commonly have a two-layered structure (a dominant and suppressed tree layer) with pedunculate oak in the dominant crown class layer and other tree species in the suppressed crown class layer. The understorey is dominated by two shrub species: hazel (Corylus avellana L.) and common hawthorn (Crataegus monogyna Jacq.), while alder buckthorn (Frangula alnus Mill.) occurs sporadically. The terrain of the study area is mostly flat (105-118 m a.s.l.). Soils are hydromorphic (hypogley, eugley, and pseudogley) on a clay parent material. The climate is warm temperate with the mean annual temperature of 10.6 °C and precipitation of 962 mm•y -1 (Ostrogović Sever et al. 2019).
The pedunculate oak forests of the study area are state-owned, and they are actively managed for sustained timber in 140 year-long rotations, ending with two or three regeneration fellings during the last 10 years of the rotation. This management approach provides sustainability in terms of the yield, biodiversity, and ecosystem stability (Klepac and Fabijanić 1996). It has been applied for all even-aged pedunculate oak forests in Croatia where they cover an area of 225.000 ha or 9% of the total forest area (Croatian Forests Ltd. 2016). Further detailed information on sustainable management of pedunculate oak forests in Croatia can be found in Ostrogović Sever et al. (2019).

Field Measurements
Field data were collected from a total of 164 circular sample plots during 2017. The plots were systematically distributed throughout the study area, i.e., within the 30 stands of different age classes and with a total area of 576.32 ha. The radius of plots (8, 12, and 15 m)  At each plot, the tree species and its location were recorded, and the diameters at breast height (dbh) were measured for all trees with dbh ≥ 10 cm. The position of each tree in the plot was recorded by measuring the distance and azimuth from the plot centre to each tree using a Vertex III hypsometer and Haglöf compass, respectively. Out of 4953 trees sampled within the 164 plots, the tree height was measured for 3247 trees using a Vertex III hypsometer, while the heights of the rest of the trees were estimated using the species-specific dbh-height models fitted with Michailloff's function (Michailoff 1943). In addition to the total tree height, the height from the bottom of the tree to the first live branch, i.e., the hcb, was measured for 3163 trees that were subsampled from trees with height measurements. A summary of the main forest attributes for the entire sample of 164 plots used for developing hcb models at tree-level is presented in Table 1. The plot-level volume values in Table 1 refers to the merchantable tree volume comprising of stemwood and branchwood up to a diameter of 7 cm overbark. It was calculated for each sampled tree using the Schumacher-Hall equation (Schumacher and Hall 1933) and parameters from Croatian two-entry (dbh and height) volume tables (Špiranec 1975, Cestar and Kovačić 1981, 1982, and summarized at plot-level.

Development and Validation of Tree Height to Crown Base (hcb) Models
Species-specific hcb models were developed for all species present in the dataset consisting of 164 sample plots. A model was adjusted for all species that had at least 20 trees measured in the field. The models were fitted with the outliers removed (Montgomery 2009 where ht is the total height of the tree, a i are model parameters to be estimated, and x i are plot-, tree-level, or competition explanatory variables ( Table 2). The hcb values in this equation are constrained to be less than the total height and greater or equal to zero (Ritchie and   (Ritchie and Hann 1987). In addition to field-measured tree-level variables (e.g., the dbh and total tree height), various plot-level metrics and distance-independent competition indexes (Table 2) were calculated from field data and considered in hcb modelling. A total of 17 potential explanatory variables were included in the statistical modelling.
Variable selection was performed in R 3.6.2 (R Core Team 2019) through a stepwise procedure with the linearized form of Equation 1 using the step function from the stats package (R Core Team 2019). This function eliminates predictors one at a time, until it finds the optimum Akaike Information Criterion (AIC) value. Next, the selected independent variables were included in the original form of the equation, and the model was fitted by nonlinear least squares with the 'nls' function from the stats package (R Core Team 2019).
The models were selected and inspected graphically and evaluated numerically through the root mean squared error (RMSE), both absolute and relative; mean error (BIAS) in percentage and its p-value to assess whether it is significantly different from 0; EF: the model efficiency in percentage; and coefficient of correlation (R 2 ) of the observed versus predicted results in percentage. Finally, to avoid overfitting, the 'vif' function of the package car (Fox and Weisberg 2019) was applied to select the most parsimonious model.
Leave-one-out cross-validation (LOOCV) was used to assess the predictive value of each regression model (Picard andCook 1984, Varma andSimon 2006). The LOOCV was chosen because certain tree species had a smaller number of field measurements, which disabled dividing them into modelling and validation subsets. For this purpose, the root mean squared error for cross-validation (RMSE CV ): absolute (m), and relative (%); mean error (BIAS CV ); and its p-value were used to assess whether it the LOOCV significantly different from 0.

RESULTS AND DISCUSSION
The species-specific hcb models (Table 3) were developed for seven tree species (A. campestre, A. glutinosa, C. betulus, F. angustifolia, Q. robur, Tilia sp., and U. laevis) that had at least 20 trees measured in the field. Other species (Populus sp., B. pendula, and P. pyraster), that had fewer measurements, were not included in any model. Table 3 displays a statistical summary of all the predictor variables per model. The crown base heights to be predicted varied remarkably among species, motivated by their different morphologies and bioecologies, with the minimum inventoried being 0.30 m for A. glutinosa and the maximum of 32.10 m for Q. robur. The independent parameters that are repeated in different models varied considerably among species, which gives an idea of the great variability present in the study area. N is the variable with the highest dispersion for the two models in which it has been used as a predictor, followed by AGE and bal in the third place. The maximum ht values show how Q. robur, F. angustifolia, and C. betulus are the species that reached the highest heights and, therefore, the dominant species in the study area.
ht and dbh predictors have been used as the main predictor in many hcb modelling studies (Fu et al. 2017, Pan et al. 2020. Instead, we only used ht as the main predictor in the base model (see Equation 1), and dbh did not enter any of the evaluated models. The hcb models used ht plot-level variables with tree-level variables and distance-independent competition indexes. Table 4 incorporates the hcb model performances. The results showed accuracies between 67.55% (C. betulus) and 92.42% (U. laevis) of R 2 and p.BIAS between 0.7115 (U. laevis) and 0.9488 (F. angustifolia). The RMSE ranged from 12.02% for A. glutinosa to 34.05% for C. betulus. One of the reasons for the greater variation among the models' performances may be the intraspecific morphology diversity since some species present higher heterogeneity among their crowns than others. Namely, the number of sampled trees differ from species to species; they were collected from forest stands of different ages where certain species may have different status (e.g., dominant, co-dominant, intermediate, suppressed layers) and consequently form different crown shapes and dimensions. Thus, the C.  Table 3. Statistical summary of the dependent and predictor variables (parameters) of the hcb models. See acronyms in Table 2.
betulus hcb model clearly performed worse than the rest of the species and had the highest relative RMSE associated. In view of these results, the models developed were unbiased, and they adequately explained the variations in the height to crown base. While it is true that the models offer good results, some models have been built with small samples and should be applied carefully, particularly A. campestre (27 trees), which shows a narrow interval of stand density . Nevertheless, in the case of U. laevis (21 trees), the range of variation of both the predicted and explanatory variables is quite wide, which makes the fitted equation suitable for a broad spectrum of stands.

SEEFOR 12(1): early view
In this study, a high correlation was found among plot variables, the distance-independent competition index, and the hcb for each species. The plot variables gave rise to the most accurate models, since hcb is a dimensional measurement of a tree and since it is more strongly correlated with other dimensional measures (Rijal et al. 2012), whereas competition measures better describe competitive interactions among species (Petritan et al. 2007, Thorpe et al. 2010).
Distance-independent competition indexes originate models whose accuracies are conditioned by intra-and inter-species competition and stages of development, originating more variable precisions that depend on the intimate mixture of trees present in each plot. Several studies have shown that competition variables and stand density have a remarkable influence on hcb values (Garber et al. 2008, Russell et al. 2014. Therefore, it is not surprising that the most repeated parameters in the models were dhratio and AGE. Table 4. Regression coefficients and goodness of fit parameters for predicting the crown base height by species. Trees with measured hcb in field (M) and used in the models (U); root mean squared error (RMSE): absolute (m) and relative (%); BIAS: in percentage; p.BIAS: BIAS p-value; R2: coefficient of correlation of observed versus predicted (%), EF: Model Efficiency (%); Parameter: independent variable; Est: parameter estimates; VIF: Variance Inflation Factor of parameter estimates; p-value: significant codes: ***: 0, **: 0.001, and *: 0.01.  Pan et al. (2020) observed that additional explanatory variables can significantly improve the prediction accuracy of an hcb model; however, introducing too many variables into a model can lead to overparameterization, which can produce a biased prediction (Fu et al. 2017) and increased forest inventory costs by requiring more measurements in the field (Pan et al. 2020). There are models with a high number of variables, e.g., six for C. betulus, but the Variance Inflation Factor (VIF) remains low.
Logistic models are considered by some authors to be those that provide the most accurate predictions for many tree species (Fu et al. 2017. Rijal et al. (2012) observed that they significantly reduced bias and eliminated convergence problems.
In this study, the best accuracies were obtained for the U. laevis (21 used trees and R 2 =92.42%) and Tilia sp (139 used trees and R 2 =89.07%) models. On the other hand, the main species C. betulus (871 used trees and R 2 =67.55%) and Q. robur (1538 used trees and R 2 =80.57%) models had the lowest accuracies. These two species appeared in both the main and secondary vegetation layers, increasing the variability of hcb, which explains the low precision of their models. An alternative to reduce the variability of the hcb models would be to create models by species for each layer considered.
As expected, the correspondence between the field hcb and estimated hcb increased with outlier removal in all models in which outliers were detected. No outliers were found in A. campestre and U. laevismodels. In Tilia sp. models, only one outlier was found; two in A. glutinosa and F. angustifolia models; four for C. betulus; and 11 for Quercus robur. The larger the database, the larger the outliers found.
Our accuracies are moderately higher than those found in similar investigations. Rijal et al. (2012) evaluated the hcb in thirty species of the North American Acadian Region using a logistic equation with plot-level variables and competition variables, obtaining performances ranging from 47% to 79% of R 2 . Fu et al. (2017) studied the hcb of Mongolian oak natural forest located in the Northeast China. They obtained accuracies of 49.72% of R 2 with logistic models using ht and dbh as the only independent variables. Sharma et al. (2017) applied logistic models to estimate the hcb for Picea abies and Fagus sylvatica in the Czech Republic, with 68% and 67% explained variances, respectively. They employed tree-level (ht), plot-level (Ho), and competition variables (the bal and basal area proportion of a species of interest) in their models. Jia and Chen (2019) applied different methods to Larix olensis hcb modelling in Northeast China. Among them, the logistic model offered an accuracy of 73.04% R 2 using plot-level variables. Pan et al. (2020) studied the hcb of L. olensis in Northeast China, and they obtained 70.63% of R 2 adj applying logistic models. As final variables, tree size measures and the spacing index were selected. Li et al. (2020) worked with Pinus massoniana in Southern China, obtaining an 88.03% R 2 adj by applying logistic equations in hcb modelling with ht and dbh as predictive variables. Figure 2 represents the hcb residual distribution of fit plots against the predictions obtained. The distribution of the errors is homogeneous throughout all the predictions, thus, ruling out the presence of heteroscedasticity in the models.
Validation results of the eight hcb models are shown in Table 5. After LOOCV, it can be seen how the values of the validation statistics increased slightly compared to those obtained in the fitting procedure, as in Fu et al. (2017), demonstrating the robustness of the models. None of them presented significant bias, and the RMSE ranged from 12.27% for A. glutinosa to 34.24% for C. betulus. The most abundant species (Q. robur) showed an error of 17.04%.ercentage; and p.BIAS CV : BIAS p-value.

CONCLUSIONS
In this work, seven main species without previous hcb models were fitted. These proposed hcb models have high importance in their novelty, but foremost by their management implications. hcb is a tree structural measure and is, therefore, related to other structural variables. However, it is a characteristic strongly conditioned by the competence and social position of each tree. This makes it essential to consider the factors that describe the competition to which each tree is subjected. Therefore, the combination of tree and plot-level variables with the site   index significantly contributed to the prediction accuracy of the proposed models. The distance-independent competition indexes explained the inter-and intraspecific critical relationships among trees to predict the hcb. These logistic models demonstrated their great versatility as they showed good precision in a wide range of species. The leave-one-out cross-validation further confirmed the high robustness of our logistic models.

Author Contributions
Conceptualization: RAP; methodology: RAP, SMG; processing: SMG; validation: SMG, RAP, IB; formal analysis: RAP, IB; investigation: SMG; resources and data curation: IB, LJ, IL; writing the original draft: SMG, IB, KI; review and editing the manuscript: RAP, IB; supervision: RAP; research funding: IB, IL. All authors have read and agreed to the published version of the manuscript.

Funding
This research was funded by the Department of Economy, Industry and Competitiveness, Spanish Government (DI-16-08446, Saray Martín-García), by the European Commission through the project "MySustainableForest" (H2020-EO-2017; 776045) and by the Croatian Science Foundation under the project IP-2016-06-7686 "Retrieval of Information from Different Optical 3D Remote Sensing Sources for use in Forest Inventory (3D-FORINVENT)". The work of doctoral student Luka Jurjević has been fully supported by the "Young researchers' career development project -training of doctoral students" of the Croatian Science Foundation funded by the European Union from the European Social Fund.