SEEFOR 10 (2): 137-144
Article ID: 274
ORIGINAL SCIENTIFIC PAPER
Modelling Stand Variables of Beech Coppice Forest Using Spectral Sentinel-2A Data and the Machine Learning Approach
Azra Čabaravdić1*, Besim Balić1
(1) University of Sarajevo, Faculty of Forestry, Zagrebačka 20, BA-71000 Sarajevo, Bosnia and Herzegovina
* Correspondence: e-mail: firstname.lastname@example.org
Citation: ČABARAVDIĆ A, BALIĆ B 2019 Modelling Stand Variables of Beech Coppice Forest Using Spectral Sentinel-2A Data and the Machine Learning Approach. South-east Eur for 10 (2): 137-144. DOI: https://doi.org/10.15177/seefor.19-21
Received: 16 Jul 2019; Revised: 24 Sep 2019; Accepted: 3 Oct 2019; Published online: 3 Nov 2019
Background and Purpose: Coppice forests have a particular socio-economic and ecological role in forestry and environmental management. Their production sustainability and spatial stability have become imperative for the forestry sector as well as for local and global communities. Recently, integrated forest inventory and remotely sensed data analysed with non-parametric statistical methods have enabled more detailed insight into forest structural characteristics. The aim of this research was to estimate forest attributes of beech coppice forest stands in the Sarajevo Canton through the integration of inventory and Sentinel S2A satellite data using machine learning methods.
Materials and Methods: Basal area, mean stand diameter, growing stock and total volume data were determined from the forest inventory designed for represented stands of coppice forests. Spectral data were collected from bands of a Sentinel S2A satellite image, vegetation indices (difference, normalized difference and ratio vegetation index) and biophysical variables (fraction of absorbed photosynthetically active radiation, leaf area index, fraction of vegetation cover, chlorophyll content in the leaf and canopy water content). The machine learning rule-based M5 model tree (M5P) and random forest (RF) methods were used for forest attribute estimation. Predictor subsets were selected with a wrapper approach using the M5P and RF learning schemes. Models were developed on training data subsets (402 sample plots) and evaluated on validation data subsets (207 sample plots). Model performance was evaluated by the root mean squared error expressed as a percentage of the mean value (rRMSE) and by the square of the correlation coefficient between the observed and estimated stand variables.
Results and Conclusions: Predictor subset selection yielded a different number of predictors for each forest attribute and method, with larger subsets for RF (between 8 and 11 predictors). Spectral biophysical variables dominated in the subsets. RF resulted in smaller errors than M5P in the training sets for all attributes, while both methods delivered very high errors in the validation sets (rRMSE above 50%). The lowest rRMSE of 50% was obtained for stand basal area. The observed variability explained by the M5P and RF models in the training subsets was about 30% and 95% respectively; in the validation subsets these values were much lower (below 12%) but still significant. Differences between the sample and modelled forest attribute means were not significant, while modelled variability for all forest attributes was significantly lower (p<0.01). Additional information appears to be needed to increase prediction accuracy, so stand information (management classes, site class, soil type, canopy closure and others), a new sampling strategy and new spectral products could be integrated and examined in further, more complex modelling of forest attributes.
Keywords: Coppice Forest, Inventory Data, Spectral Biophysical Variables, M5 Model Tree, Random Forest Regression
Coppice forests have multiple roles related to their production in forest management, as well as their social, ecological and economic importance for local communities. Their contribution is recognized and emphasized in rural livelihoods, low-carbon bio-economy, in protective functions, sharing economy, provision and enrichment.
Several studies have analyzed the structure, functions, silvicultural measures and other aspects of coppice forests in Europe and the Balkan region [1-8]. Authors [3-8] from South Eastern Europe (SEE) concluded that degradation and inappropriate treatments in high forests in the 20th century resulted in degradation and the appearance of coppice forests on larger areas. Stajić et al. described past and recent coppice forest management in some regions of SEE in relation to their characteristics. Višnjić et al. investigated ecological and silvicultural characteristics of coppice forests in Bosnia and Herzegovina (B&H). In B&H coppice forests occupy around 23% of the forested area according to data from the second national forest inventory. Different silvicultural treatments (conversion, thinning, reforestation and others) for the improvement of their production and other forest functions were examined and analyzed [9, 10]. Recent intensive studies of coppice forests were conducted in the Sarajevo Canton [11-14]. Balić presented research on productivity, structural characteristics and models of growth and increment of coppice beech forests in the Sarajevo Canton, based on forest inventory data and a parametric statistical approach.
For management planning purposes it is important to estimate stand productivity variables (basal area, stand diameter, wood volume, growing stock and others) and their spatial distributions, especially where different management regimes are recommended. Therefore, apart from forest inventory data, forest management planning should consider all available information about the forest status and stand conditions. Remote sensing data from different satellite programs combined with forest inventory have been used as a source for additional research about forest characteristics since the middle of the 20th century. Landsat and Sentinel satellite images have been used most frequently for forest type classification [15-17], as well as for the estimation of forest productivity attributes [18, 19]. Rapid information technology development resulted in continuous improvements of remote sensing capabilities (satellite and aerial imagery, lidar), offering innovative possibilities of research on forest vegetation [19-21]. Statistical classification and estimation methods supported by this development have become more efficient and promising for the spatial characterization of forest attributes over forested areas [22-24]. Recently, high forests and artificial stands have been analyzed frequently using the machine learning rule-based approach. The research focus here is therefore directed to coppice forests, for which there is wide interest in further characterization.
The aim of this paper is to evaluate beech coppice forest stand variable estimates based on machine learning rule-based methods: M5 model tree and random forest regression using inventory and Sentinel S2A spectral data.
MATERIALS AND METHODS
The study was conducted in the Sarajevo Canton (about 1277 km²), which lies between latitudes 43°47' and 43°53' N and longitudes 18°16' and 18°27' E in central Bosnia and Herzegovina (Figure 1). Forest stands of state-owned beech (Fagus sylvatica L.) coppice forests surrounding the capital city of Sarajevo were selected as study areas. The selected beech coppice stands are situated on flat and hilly terrain at altitudes of 550 to 1700 m, mostly (about 60%) below 1000 m. About 80% of forest stands are situated on humid aspects with deeper and moist soils. More than 65% of forest stands are located on slopes steeper than 20°, while less than 15% are on flat terrain. The study area is influenced by a moderate continental climate with subalpine character at higher altitudes.
Field measurements were acquired for geo-referenced field plots located at the intersections of a 200×200 m grid. Trees were selected in circular plots with different radii based on probability proportional to size, using a diameter at breast height threshold of 7 cm. The most important forest stand attributes, including the basal area, stand mean diameter, total volume and growing stock, were calculated and used in this research (Table 1). Volume of individual trees was calculated using regression models and then scaled to a per unit area basis (m³·ha⁻¹). In this research 609 sample plots in 185 stands were used for modelling. Descriptive statistics of forest attributes and rank correlations with predictor variables were calculated for the sample dataset.
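The scaling of plot measurements to a per-hectare basis can be illustrated with a minimal sketch. It assumes that each tallied tree is expanded by the area of the circle on which it was measured; the function names, the example diameters and the radii are hypothetical and are not taken from the inventory design.

```python
import math

def basal_area_m2(dbh_cm):
    """Cross-sectional area of one stem at breast height, in m^2."""
    return math.pi / 4.0 * (dbh_cm / 100.0) ** 2

def per_hectare(trees):
    """Scale per-tree values to a per-hectare total.

    `trees` is a list of (value, plot_radius_m) pairs. Each tree is
    expanded by 10 000 m^2 divided by the area of the circle on which
    it was tallied (an assumption about the concentric plot design).
    """
    total = 0.0
    for value, radius_m in trees:
        plot_area_m2 = math.pi * radius_m ** 2
        total += value * 10_000.0 / plot_area_m2
    return total

# Hypothetical plot: thin stems tallied on a small circle, a thick stem on a larger one.
plot = [(basal_area_m2(8.0), 3.0), (basal_area_m2(12.5), 3.0), (basal_area_m2(24.0), 7.0)]
G = per_hectare(plot)  # basal area in m^2 per hectare
```

The same function scales tree volumes to m³·ha⁻¹ when volumes are passed instead of basal areas.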
Sentinel S2A Data
One cloud-free Sentinel-2 scene acquired on 17th October 2018 was used in this study. The spectral data were obtained from the Copernicus Open Access Hub as Level-1C data with Top of Atmosphere (TOA) reflectance. Characteristics of the spectral bands of the Sentinel-2 MSI (Multi-Spectral Instrument) sensor and the subset of used bands are presented in Table 2.
The atmospheric correction of Level-1C input data was performed using the Sen2Cor plug-in for Sentinel-2 Toolbox and SNAP software provided by ESA (version 6.0.0, Brockmann Consult, Geesthacht, Germany). Corrected data were resampled to 20 m resolution, and vegetation indices and biophysical variables were calculated.
Then, three spectral vegetation indices were calculated: difference vegetation index (DVI), ratio vegetation index (RVI), and normalized difference vegetation index (NDVI). In addition, the biophysical variables were calculated in SNAP with its biophysical processor, which uses a neural network algorithm based on the PROSPECT+SAIL (PROSAIL) radiative transfer model. Five biophysical variables were determined: fraction of absorbed photosynthetically active radiation (fapar), leaf area index (LAI), fraction of vegetation cover (FCOVER), chlorophyll content in the leaf (CHC), and canopy water content (CWC).
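The three vegetation indices follow their usual definitions from near-infrared and red reflectance. The sketch below assumes these standard formulas; which Sentinel-2 bands were used is not stated in the text, so taking B8 as near infrared and B4 as red is an assumption, and the reflectance values are invented.

```python
def vegetation_indices(nir, red):
    """Return DVI, RVI and NDVI for one pixel from surface reflectance.

    nir, red: reflectance values (assumed here to come from bands B8 and B4).
    Ratios with a zero denominator are returned as NaN.
    """
    nan = float("nan")
    dvi = nir - red                                   # difference vegetation index
    rvi = nir / red if red != 0 else nan              # ratio vegetation index
    total = nir + red
    ndvi = (nir - red) / total if total != 0 else nan  # normalized difference
    return {"DVI": dvi, "RVI": rvi, "NDVI": ndvi}

# Hypothetical pixel over dense broadleaved canopy.
indices = vegetation_indices(nir=0.40, red=0.10)
```

NDVI is bounded between -1 and 1, whereas DVI and RVI depend on the absolute reflectance level, which is one reason the three indices can carry different information for a regression model.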
Machine Learning Algorithms
The machine learning approach refers to analytical model building that learns automatically from the data itself. Here two different rule-based machine learning algorithms for regression were applied: M5P and RF. M5P is a machine learning technique introduced as a reconstruction of Quinlan’s M5 algorithm for tree-based regression modelling. It creates a decision tree with linear regression functions at the nodes, using a splitting criterion that minimizes the intra-subset variation. The RF regression model is an ensemble of tree predictors constructed from bootstrapped training data. For both algorithms, parameter tuning is related to the number of regression trees and the number of features (explanatory variables). Here the default rules for the number of trees in Weka software were applied. Feature selection has an important influence on the results of the applied rule-based algorithms. Here the wrapper method was used, which selects the set of features most suitable for a particular algorithm. Datasets were randomly separated into reference (66%) and validation (33%) subsets. Accuracy was assessed using the mean absolute error (MAE), root mean square error (RMSE) and relative RMSE (RMSE%), calculated using the following equations:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$\mathrm{RMSE\%} = 100 \cdot \frac{\mathrm{RMSE}}{\bar{y}}$$

where $y_i$ is the observed forest attribute of data point $i$, $\hat{y}_i$ is the estimated forest attribute of $i$, $n$ is the number of validation data and $\bar{y}$ is the mean of the observed forest attribute. Then, the coefficient of determination was used to examine relationships between observed and estimated values.
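The accuracy measures named above can be computed as in the following minimal sketch. It assumes the standard definitions of MAE, RMSE and relative RMSE, and takes the coefficient of determination as the squared Pearson correlation between observed and estimated values, as described in the abstract. The function name and the numbers are illustrative only.

```python
import math

def accuracy_metrics(observed, predicted):
    """Return MAE, RMSE, relative RMSE (%) and squared correlation."""
    n = len(observed)
    errors = [y - y_hat for y, y_hat in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_obs = sum(observed) / n
    rel_rmse = 100.0 * rmse / mean_obs

    # Squared Pearson correlation between observed and predicted values.
    mean_pred = sum(predicted) / n
    cov = sum((y - mean_obs) * (p - mean_pred) for y, p in zip(observed, predicted))
    ss_obs = sum((y - mean_obs) ** 2 for y in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r2 = cov * cov / (ss_obs * ss_pred)
    return {"MAE": mae, "RMSE": rmse, "RMSE%": rel_rmse, "R2": r2}

# Invented basal area values (m^2/ha) for three validation plots.
metrics = accuracy_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

Because RMSE% divides by the observed mean, it allows attributes with different units (basal area, diameter, volume) to be compared on one scale.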
The finalized machine learning models were used to make predictions for measured and non-measured geo-positions on pixel level in the study area. Input data were extracted from raster layers for each pixel geo-positioned on determined x and y coordinates.
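Extracting predictor values for a pixel at given x and y coordinates can be sketched as below. The sketch assumes north-up raster layers that share one grid, with the origin at the upper-left corner and the 20 m pixel size mentioned earlier; the layer names, coordinates and values are hypothetical.

```python
import math

def pixel_index(x, y, x_origin, y_origin, pixel_size):
    """Return (row, col) of the pixel containing map coordinate (x, y).

    Assumes a north-up grid whose origin is the upper-left corner,
    so rows increase as y decreases.
    """
    col = math.floor((x - x_origin) / pixel_size)
    row = math.floor((y_origin - y) / pixel_size)
    return row, col

def extract_predictors(layers, x, y, x_origin, y_origin, pixel_size=20.0):
    """Read one value per raster layer at the pixel covering (x, y)."""
    row, col = pixel_index(x, y, x_origin, y_origin, pixel_size)
    return {name: grid[row][col] for name, grid in layers.items()}

# Two tiny hypothetical 2x2 layers on the same grid.
layers = {
    "NDVI": [[0.61, 0.58], [0.70, 0.73]],
    "LAI": [[2.1, 1.9], [3.0, 3.4]],
}
row_values = extract_predictors(layers, x=1030.0, y=1970.0, x_origin=1000.0, y_origin=2000.0)
```

The resulting dictionary is one input row for a fitted model, whether the location is a measured plot or an unmeasured pixel.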
The described method has been applied to estimate forest attributes from inventory and Sentinel S2A spectral data in similar studies [19, 22, 23].
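The splitting step of a model tree, described above as minimizing the intra-subset variation, is commonly expressed as maximizing the standard deviation reduction (SDR) of the target variable. The following sketch searches one predictor for the best threshold under that criterion. It is an illustration of the idea, not the Weka M5P implementation, and the `min_leaf` parameter and the data are assumptions.

```python
import statistics

def sd(values):
    """Population standard deviation; zero for fewer than two values."""
    return statistics.pstdev(values) if len(values) > 1 else 0.0

def best_split(x, y, min_leaf=2):
    """Return (threshold, sdr) of the split on x that most reduces sd of y.

    SDR = sd(all) - sum(|subset| / |all| * sd(subset)) over both subsets.
    Returns (None, 0.0) when no split reduces the standard deviation.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    total_sd = sd([target for _, target in pairs])
    best = (None, 0.0)
    for i in range(min_leaf, n - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        sdr = total_sd - (len(left) / n * sd(left) + len(right) / n * sd(right))
        if sdr > best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2.0, sdr)
    return best

# Invented predictor (e.g. LAI) and target (e.g. basal area) values.
threshold, sdr = best_split([1, 2, 3, 4, 5, 6], [10, 11, 10, 30, 31, 30])
```

In a full model tree this search is repeated over all predictors and applied recursively, and a linear regression is then fitted in each resulting subset.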
RESULTS AND DISCUSSION
Spearman’s rank correlation between forest attributes, spectral data, vegetation indices, biophysical variables and altitude is shown in Table 3. All forest attributes are significantly correlated with most auxiliary variables, namely B2 (blue), B3 (green), B5, B6, B7 (three vegetation red edges), B8 (near infrared), B8A (narrow near-infrared band), all vegetation indices (DVI, NDVI and RVI) and a set of four biophysical variables (LAI, fapar, FCOVER and CWC). Shortwave infra-red bands B11 and B12 have low but significant correlation with the total volume and growing stock only.
Growing stock was significantly correlated with all auxiliary variables, with the highest correlation for vegetation red edge B6 (-0.24). All correlations were very low, indicating generally weak relationships. Astola et al. reported higher correlations between V0, Dg, BA in boreal broadleaved forests and Sentinel S2A digital numbers (-0.74, -0.75 and -0.69, respectively).
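Spearman’s rank correlation used above is the Pearson correlation of the ranked values. The sketch below computes it with average ranks for ties. It returns only the coefficient and does not cover the significance test, and the example values are invented.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    ss_a = sum((u - mean_a) ** 2 for u in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(ss_a * ss_b)

def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))

# Invented growing stock values and red-edge reflectance for five plots.
rho = spearman([120.0, 95.0, 210.0, 160.0, 80.0], [0.21, 0.24, 0.15, 0.19, 0.22])
```

A negative coefficient, as in this invented example, means that plots with higher growing stock tend to have lower reflectance in that band.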
Predictor selection based on the wrapper method resulted in the subsets presented in Figure 2. The number of selected predictors varied between four and ten per forest attribute. Vegetation indices and biophysical variables were selected more frequently than the original spectral data.
Original spectral bands participated in smaller numbers than in similar research related to regression tree modelling for boreal broadleaved forests.
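The idea of wrapper-based predictor selection can be sketched as a search over predictor subsets in which each candidate subset is scored by the learning scheme itself. The example below uses a greedy forward search, which is an assumption, since the search strategy used in Weka is not stated in the text. The `score` callback, the predictor names and the toy error function are hypothetical stand-ins for training M5P or RF and measuring the validation error.

```python
def forward_wrapper(features, score):
    """Greedy forward selection.

    score(subset) must return the error of the learner trained on
    `subset` (lower is better). Predictors are added one at a time
    until no addition lowers the error.
    """
    selected = []
    remaining = list(features)
    best_error = float("inf")
    while remaining:
        candidate = min(remaining, key=lambda f: score(selected + [f]))
        error = score(selected + [candidate])
        if error >= best_error:
            break  # no remaining predictor improves the model
        selected.append(candidate)
        remaining.remove(candidate)
        best_error = error
    return selected, best_error

# Toy error function: each predictor removes some error, each added
# predictor costs a small penalty (a stand-in for a real learner).
gain = {"NDVI": 5.0, "LAI": 3.0, "B2": 0.0}

def toy_error(subset):
    return 10.0 - sum(gain[f] for f in subset) + 0.5 * len(subset)

subset, error = forward_wrapper(["B2", "LAI", "NDVI"], toy_error)
```

Because the subset is tuned to one learner, the same data can yield different subsets for M5P and RF, which is consistent with the varying subset sizes reported here.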
The differences between the sample and modelled forest attribute means were not significant, while modelled variability for all forest attributes was significantly lower (p<0.01). Model evaluations for reference and validation subsets are presented in Table 4. Relative RMSEs for M5P in reference sets ranged from 47.4% for G to 63.1% for GS, while values for RF varied from 17.9% to 23.6% for the same forest attributes respectively. Higher relative RMSEs for both algorithms were obtained for validation sets, ranging between approximately 51% and 68%. Higher relative RMSEs related to regression tree modelling were found for G, Dg and V0 in boreal broadleaved forests.
In terms of correlations between the observed and predicted values, RF performed better in the reference set for all forest attributes, while correlations in validation subsets were higher for M5P for all attributes (Figure 2).
Systematic errors were present for both algorithms and for all attributes in a consistent way. Figure 3 presents the relationships between the observed and predicted values in reference and validation sets for basal area; the relationships are similar for the other forest attributes. The better performance of RF is visible in the reference set, while predictions in validation sets were biased and weak for both algorithms across the range of observed basal area values.
Over-fitting in reference sets occurred in all models, with weak adjustment in validation sets. The RF predictions at the measured locations achieved reliable values, while surrounding pixel-based estimates deviated from ground truth. It seems that the chosen feature selection method and the algorithms’ specifications perform poorly in validation sets, so further research on reliable pixel-level estimates is needed.
Mapping of Basal Area for Beech Coppice Stands
The mapping of forest attributes has become a contributing component of forest management across forested areas [23, 34]. Recent research on coppice forests [12-14, 35] pointed out the importance of the spatial distribution of forest production attributes for the planning of ameliorative measures (restoration, reforestation), especially in beech stands.
We found that the applied machine learning-based estimation mapping could give insight into spatial distribution of forest attributes with better preservation of ranges of ground and RF estimated values.
Figure 4 shows, as an example, the spatial distribution of basal area modelled by RF, together with two details with stands from Google Satellite (above) and estimated basal area (below).
The estimated spatial distribution of highly correlated forest attributes is consistent over the forested area. This consistency could contribute to the analysis of coppice forest functions, considering their productive and protective roles. With regard to productivity, RF spatial estimates better indicate areas with low values of forest attributes, pointing to how adequately stand potentials are used. Also, machine learning-based estimates near stand boundaries indicate forest quantity coverage related to the preservation of forest soil and protection from erosion and drying. We found that these indications could contribute to forest planning considering management and silvicultural measures aiming to improve coppice forest quantity and quality potentials.
Conclusions that can be drawn from this study are:
- There are significant rank correlations between spectral Sentinel S2A data, vegetation indices, biophysical variables, altitude and the main beech coppice forest attributes (G, Dg, V0, GS).
- NDVI, LAI and altitude participated most frequently in selected variable subsets.
- Machine learning modelling based on M5P and RF resulted in different efficiency for all forest attributes. RF estimates in reference sets (RMSE% below 24%) were better than M5P estimates (RMSE% below 63%). In both modelling processes over-fitting in reference sets occurred, while estimates had high relative RMSEs in validation sets.
- The machine learning approach compiled with Sentinel S2A spectral data is promising for the estimation and mapping of spatial distribution of forest attributes in beech coppice stands.
Further research is needed related to machine learning algorithm specifications, more intensive and representative ground sample, spatial correlations and other scientific and technical possibilities.
1. UNRAU A, BECKER G, SPINELLI R, LADZINA D, MAGAGNOTTI N, NICOLESCU VN, BUCKLEY P, BARTLETT D, KOFMAN PD (eds) 2018 Coppice forest in Europe. Albert Ludwig University of Freiburg, Freiburg, Germany
2. MAIROTA P, MANETTI MC, AMORINI E, PELLERI F, TERRADURA M, FRATTEGIANI M, SAVINI P, GROHMANN F, MORI P, TERZUOLO PG, PIUSSI P 2016 Opportunities for coppice management at the landscape level: the Italian experience. iForest (9): 775-782. DOI: https://doi.org/10.3832/ifor1865-009
3. STAJIĆ B, ZLATANOV T, VELICHKOV I, DUBRAVAC T, TRAJKOV P 2009 Past and recent coppice forest management in some regions of South eastern Europe. Silva Balcanica 10 (1): 9-19
4. ZENELI G, KOLA H 2017 Coppice Forests: Can Traditional Coppice Forest Management Help the Western Balkan Region? Forest Research 6 (3). DOI: https://doi.org/10.4172/2168-9776.1000212
5. DEKANIĆ S, LEXER MJ, STAJIĆ B, ZLATANOV T, TRAJKOV P 2009 European forest types for coppice forests in Croatia. Silva Balcanica 10 (1): 47-62
6. PANTIĆ D, KRSTIĆ M, DANILOVIĆ M, MATOVIĆ B, MARKOVIĆ N 2003 Tree development and productivity of beech coppice stands in the Crni Vrh region (in Serbian with English summary). Glasnik Šumarskog fakulteta 87: 175-186. DOI: https://doi.org/10.2298/GSF0387175P
7. STOJANOVIĆ LJ, KRSTIĆ M, RADOVANOVIĆ T 2004 Proposition of optimal silvicultural operations in coppice beech forests on mt. Ozren (in Serbian with English summary). Časopis za šumarstvo, preradu drveta, pejsažnu arhitekturu, i zaštitu od erozija 3: 105-238
8. VIŠNJIĆ Ć, MEKIĆ F, VOJNIKOVIĆ S, BALIĆ B, BALLIAN D, IVOJEVIĆ S 2010 Ecological and silvicultural characteristics of beech coppice forests in Bosnia and Herzegovina (in Bosnian with English summary). Monograph, Faculty Forestry University Sarajevo, Bosnia and Herzegovina, 154 p
9. PINTARIĆ K 2002 Problem of the conversion of beech coppices into high forests of beech (in Croatian with English summary). Šumar list 126 (3-4): 119-128
10. KORIČIĆ Š 2005 Biološki, ekološki i ekonomski pokazatelji uspješnosti proreda u panjačama bukve (in Bosnian with English summary). PhD thesis, University of Sarajevo, Faculty of Forestry, Sarajevo, Bosnia and Herzegovina, 220 p
11. BALIĆ B 2013 Productivity, structural characteristics and models of growth and increment coppice beech forest in the Sarajevo Canton (in Bosnian with English summary). PhD thesis, University of Sarajevo, Faculty of Forestry, Sarajevo, Bosnia and Herzegovina, 215 p
12. BALIĆ B, VIŠNJIĆ Ć, VOJNIKOVIĆ S, IBRAHIMSPAHIĆ A, LOJO A, AVDAGIĆ A 2016 Ecological, productive and silvicultural categorisation of coppice beech stands in the area of Sarajevo Canton. Works of the Faculty of Forestry University of Sarajevo 45 (2): 83-99
13. VIŠNJIĆ Ć, BALIĆ B, VOJNIKOVIĆ S, MEKIĆ F 2017 Ameliorative categorisation of beech coppice forests at the territory of Canton Sarajevo (in Bosnian with English summary). Monograph, Faculty Forestry University Sarajevo, Bosnia and Herzegovina, 115 p
14. LOJO A, MUSIĆ J, BALIĆ B, BAJRIĆ M, SOKOLOVIĆ DŽ, IBRAHIMSPAHIĆ A, AVDAGIĆ A 2017 Analysis of the current state and long-term projection of use and conversion of state-owned coppice forests in order to improve of wood production and state of forest in the Sarajevo Canton (in Bosnian with English summary). Naše šume (46-47): 12-29
15. IMMITZER M, VUOLO F, ATZBERGER C 2016 First Experience with Sentinel-2 Data for Crop and Tree Species Classifications in Central Europe. Remote Sens 8 (3): 166. DOI: https://doi.org/10.3390/rs8030166
16. BERTA A 2018 Tree species classification in forests of Central Croatia using Sentinel 2 data and data mining. Conference: Natural Resources, Green Technology & Sustainable Development /3, Zagreb, Croatia
17. GAŠPAROVIĆ M, ZRINJSKI M, GUDELJ M 2019 Automatic cost-effective method for land cover classification (ALCC). Computers, Environment and Urban Systems 76: 1-10. DOI: https://doi.org/10.1016/j.compenvurbsys.2019.03.001
18. ASTOLA H, HÄME T, SIRRO L, MOLINIER M, KILPI J 2019 Comparison of Sentinel-2 and Landsat 8 imagery for forest variable prediction in boreal region. Remote Sens Environ 223: 257-273. DOI: https://doi.org/10.1016/j.rse.2019.01.019
19. WITTKE S, YU X, KARJALAINEN M, HYYPPÄ J, PUTTONEN E 2019 Comparison of two-dimensional multitemporal Sentinel-2 data with three-dimensional remote sensing data sources for forest inventory parameter estimation over a boreal forest. Int J Appl Earth Obs 76: 167-178. DOI: https://doi.org/10.1016/j.jag.2018.11.009
20. CHEN L, REN C, ZHANG B, WANG Z, XI Y 2018 Estimation of Forest Above-Ground Biomass by Geographically Weighted Regression and Machine Learning with Sentinel Imagery. Forests 9 (10): 528-582. DOI: https://doi.org/10.3390/f9100582
21. BALENOVIĆ I, ALBERTI G, MARJANOVIĆ H 2013 Airborne Laser Scanning - the Status and Perspectives for the Application in the South-East European Forestry. South-east Eur for 4 (2): 59-79. DOI: https://doi.org/10.15177/seefor.13-07
22. SAFARI A, SOHRABI H, POWELL SL 2018 Comparison of satellite-based estimates of aboveground biomass in coppice oak forests using parametric, semiparametric and non-parametric modeling methods. Journal of Applied Remote Sensing 12 (4): 046026. DOI: https://doi.org/10.1117/1.JRS.12.046026
23. CHEN L, WANG Y, REN C, ZHANG B, WANG Z 2019 Optimal combination of predictors and algorithms for forest above-ground biomass mapping from Sentinel and SRTM data. Remote Sens 11 (4): 414. DOI: https://doi.org/10.3390/rs11040414
24. NOI PT, KAPPAS M 2018 Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery. Sensors 18 (1): 18. DOI: https://doi.org/10.3390/s18010018
25. STOJANOVIĆ O, DRINIĆ P 1974 Istraživanje veličine koncentričnih kružnih površina za taksacionu procjenu šuma (in Bosnian with English summary). Works of the Faculty of Forestry and Institute of Forestry Sarajevo 34 p
26. BALIĆ B, MUSIĆ J, LOJO A, ČABARAVDIĆ A, IBRAHIMSPAHIĆ A 2010 Creation of volume and sortiment tables for beech coppice forests as scientific base for management planning for beech coppice forests in Federation B&H (in Bosnian with English summary). Federal Ministry of Agriculture, Forestry and Water Management, Sarajevo, 80 p
27. EUROPEAN SPACE AGENCY 2019 Copernicus Open Access Hub. URL: https://scihub.copernicus.eu/dhus/#/home (12 February 2019)
28. EUROPEAN SPACE AGENCY 2019 Sentinel-2 Spectral Response Functions (S2-SRF). URL: https://sentinel.esa.int/web/sentinel/document-library/content/-/article/sentinel-2a-spectral-responses (23 July 2019)
29. RICHARDSON AJ, WIEGAND CL 1977 Distinguishing Vegetation From Soil Background Information. Photogramm Eng Rem S 43 (12): 1541-1552
30. ROUSE JW JR, HAAS RH, DEERING DW, SCHELL JA, HARLAN JC 1974 Monitoring the Vernal Advancement and Retrogradation (Green Wave Effect) of Natural Vegetation, NASA/GSFC Type III Final Report, Greenbelt, MD, USA, 371 p
31. JACQUEMOUD S, VERHOEF W, BARET F, BACOUR C, ZARCO-TEJADA PJ, ASNER GP, FRANÇOIS C, USTIN SL 2009 PROSPECT + SAIL models: A review of use for vegetation characterization. Remote Sens Environ 113: 56-66
32. QUINLAN RJ 1991 Learning with Continuous Classes. In: Australian Joint Conference of Artificial Intelligence, World Scientific, Singapore, pp 343-348
33. HALL M, FRANK E, HOLMES G, PFAHRINGER B, REUTEMANN P, WITTEN IH 2009 The WEKA data mining software: an update. SIGKDD Explorations 11 (1). URL: https://www.cs.waikato.ac.nz/ml/weka/ (23 September 2019)
34. CORONA P 2010 Integration of forest mapping and inventory to support forest management. iForest 3 (3): 59-64. DOI: https://doi.org/10.3832/ifor0531-003
35. CHIRICI G, GIULIARELLI D, BISCONTINI, TONTI D, MATTIOLI W, MARCHETTI M, CORONA P 2011 Large-scale monitoring of coppice forest clearcut by multitemporal very high resolution satellite imagery. Case study from central Italy. Remote Sens Environ 115 (4): 1025-1033. DOI: https://doi.org/10.1016/j.rse.2010.12.007
© 2019 by the Croatian Forest Research Institute. This is an Open Access paper distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0).