Mean height or dominant height – what to prefer for modelling the site index of Estonian forests?


 The availability of a large amount of data from reliable sources is important for forest growth modelling. A permanent plot where trees are repeatedly measured provides a clearer picture of stand alterations. Various factors, including forest management, affect forest growth and accuracy of its assessment. In Estonia, mean height as a regression height prediction at mean square diameter is commonly used in forest management practice. Alternatively, dominant height can be used. The main advantage of using dominant height instead of mean height is that the growth of dominant trees is not so strongly affected by stand density (thinning). The aim of our research was to investigate the difference between mean height and dominant height when used as stand height. The research was based on the Estonian Network of Forest Research Plots (ENFRP). As a result, we found that the average mean height change was significantly greater in the case of thinning when compared to undisturbed stand development, whereas, the average dominant height change in the case of thinning compared to undisturbed development was less significant. As a side result, we developed a regression model that can be used for calculating the dominant height of the main tree species using stand attributes (mean height, quadratic mean diameter and density) with a residual standard deviation of 0.466 m.


Introduction
Quantitative information on stands and trees is needed for assessment of available forest resources and forest management planning (Burkhart & Tomé, 2012). A consistent and reliable flow of forest growth information is necessary for the development of an accurate mathematical model (Kiviste & Hordo, 2002). Mathematical models will provide valuable insight for resource estimations, management option exploration and silvicultural alternatives (Vanclay, 1997). Information in the form of raw data can be collected from forest sample plots, which can be either permanent or temporary (Räty & Kangas, 2019). Permanent plots are repeatedly measured with intervals while temporary plots are measured once. Modelling forest growth using temporal sample plots can give distorted results, as it does not provide information on each individual tree over time (Gadow & Hui, 1999). Permanent sample plots where trees are individually and permanently marked for repeated measurements are a better choice for forest growth modelling (Picard et al., 2010). However, establishing and maintaining permanent plots with continuous measurements is expensive (Allen, 1993).
Stand development is a continuously evolving process that requires more realistic and functional prediction methods in order to estimate sufficiently close to real values. Various factors, including forest management, can change forest growth and accuracy of its assessment. Roughly 40% of Estonian forests belong to the Estonian state and are largely maintained, grown and managed (thinned) by the State Forest Management Centre. Thinning is used in forest management as a tool for regulating tree competition and give an advantage to the select remaining trees (Pretzsch, 2009). Thinning studies have consistently shown an increase in individual tree diameter and volume growth with decreasing stand density (Mäkinen & Isomäki, 2004;Kim et al., 2016). The main reason for the widespread use of thinning in forestry management is its economic benefit.
Forest stand height, which is an important variable in forest growth modelling, can be defined either as mean or dominant stand height. Mean height can be estimated as an arithmetic average of all trees in a stand or as mean height weighted in proportion to their basal area (also known as Lorey's mean height) (Lorey, 1878). However, in this study we calculated mean height as a regression height prediction at mean square diameter, as it is the most commonly used method in Estonia (Forest Management Act, 2009;Padari et al., 2009). Dominant stand height, which is suggested to be a better measure of site productivity, is widely used in other countries (van Laar & Akça, 1997;Burkhart & Tomé, 2012) and could also be an alternative for Estonia. A primary prerequisite in using dominant height as a site index productivity measurement, instead of mean height, is that the height growth of dominant trees is not so greatly influenced by stand density (Weiskittel et al., 2009). Therefore, the use of dominant height should be a more precise way of evaluating stand growth (Burkhart & Tomé, 2012).
In literature, different names and definitions of dominant height have been proposed (West, 2009). One of the most common definition is top height (sometimes used as a synonym for dominant height), which is defined as the average height of a fixed number of trees per unit area (stand) with the largest diameters at breast height (commonly over bark) (Elfving & Kiviste, 1997;van Laar & Akca, 1997;Sharma et al., 2002;West, 2009;Burkhart & Tomé, 2012). A second definition, predominant height is defined as the average height of a fixed number of the tallest trees per unit area (stand) (West, 2009). Trees with a large dia meter are often easier to measure, hence the use of the thickest trees is more practical than the use of the tallest trees (Burkhart & Tomé, 2012).
The current study uses dominant height, which is defined by the IUFRO (International Union of Forest Research Organizations) as the average height of the 100 largest trees per hectare (dominant trees) (Tomé et al., 2019). The IUFRO's forest terminology prescription does not specify whether height or diameter should be used as the largest tree identifier. Since both identifiers are allowed, the following research defines dominant height as the average height of the 100 trees with the largest diameters at breast height per hectare, as it is more appropriate for the research. In the current study, we limited the comparison of mean and dominant height only to the main tree species on each plot. The term cohort will be used in our study as a grouping method, which combines tree species and storeys into subgroups.
The aim of the research is to evaluate whether there is a difference in the accuracy of stand height estimation, when comparing the mean height model and dominant height model. We hypothesize that thinning has no statistically significant effect on dominant height and therefore dominant height should be more accurate for site index productivity measurement.

Material and Methods
This study is based on a dataset of repeated measurements from the Estonian Network of Forest Research Plots (ENFRP). The ENFRP was established in 1995 and is active and maintained to this day. When the ENFRP was founded, its main purpose was to collect data for the construction of individual tree growth models in Estonia (Kiviste & Hordo, 2002). Nowadays, the ENFRP has become even more relevant as it is an important database for the national research infrastructure (Kiviste et al., 2015).

ENFRP measurements
The ENFRP consist of circular sample plots with radiuses of 15, 20, 25 or 30 meters. The size of the radius depends mainly on the density of the stand and as a rule, EN-FRP plots include at least 100 first storey trees. If a sample plot is thinned between two consecutive measurements, the plot is enlarged to follow the 100-tree rule, by increasing the radius. The re-measurements are carried out at 5-year intervals. The sample plots are systematically distributed over Estonia and are mostly established in a group of three plots. According to the field measurement protocol, the stem diameter at breast height (measured at 1.3 m above the root collar) is recorded in two perpendicular directions for each individual tree. Damage and the cause of dead tree mortality are examined and recorded in addition (Kiviste et al., 2015).

Research data
This study is based on a dataset which was extracted from the ENFRP database on 11.03.2020. In our study, 753 sample plots were used, which were measured a total of 3,132 times (33 plots were measured once, 41 plots twice, 39 plots three times, 305 plots four times, 335 plots five times, and 5 plots six times). These sample plots included 139,125 measurements of different individual trees, which were measured a total of 412,604 times. The main tree species of the ENFRP plots in this study were Scots pine (Pinus sylvestris L.) -347 plots, Norway spruce (Picea abies (L.) H. Karst.) -210 plots, silver birch (Betula pendula Roth) on mineral soil or silver birch in mix with downy birch (Betula pubescens Ehrh.) on wet sites -143 plots, European aspen (Populus tremula L.) -34 plots, common alder (Alnus glutinosa (L.) Gaertn.) -6 plots, grey alder (Alnus incana (L.) Moench) -6 plots, goat willow (Salix caprea L.) -6 plots and small-leaved lime (Tilia cordata Mill.) -1 plot. According to the forest site typology (Lõhmus, 1984), the ENFRP plots were classified into 11 forest type groups, the most frequent of which were mesotrophic, meso-eutrophic and nemoral forests (Figure 1).

Relation between dominant and main tree species
A considerable share of Estonian forests has a semi-natural status, which means that even commercially planted forests contain some trees of natural regeneration in the first storey (Lõhmus et al., 2013). Figure 2 shows that sample plots where Scots pine is the main species are mostly dominated by Scots pine trees. However, sample plots with Norway spruce and other deciduous species (European aspen, common alder, grey alder, goat willow, small-leaved lime) have a more mixed composition in the cohort of dominant trees. Thus, according to our calculation, trees of the main tree species are not always dominant trees.

Mean square diameter and dominant diameter calculation
The diameter of each individual tree is recorded at each sample plot measurement according to the ENFRP field measurement protocol. However, only 30.5% of  the tree heights were recorded during the measurements, as the methodology stipulates that only a part of the tree heights is recorded during the sample plot measurements. Since a height measurement was required for each individual tree, we estimated the parameters of the height curve from height-diameter data obtained at sample plot measurement. First, we calculated the mean square dia meter (root mean square) d g for the main tree species of the first storey for each plot measurement. Second, we calculated the dominant diameter d g dom. However, in order to calculate the dominant diameter d g dom, we had to find a way to distinguish dominant trees from individual trees in a sample plot. We decided to calculate the number of dominant trees to be included per sample plot from the radius of the plot. For example, sample plots with a radius of 15, 20, or 25 had an area of 0.071, 0.126, or 0.196 hectares and therefore, 7, 13, or 20 trees with the largest diameter at sample plot measurement were used as dominant trees. In total, 38,960 tree measurements were treated as dominant tree measurements. The dominant diameter d g dom was calculated with the same formula (1) as the mean square diameter, but only dominant trees were used instead of all trees. (1) where d g -mean square diameter (calculated separately for each sample plot); d i -breast height diameter of an individual tree (cm); n -number of trees in a cohort.

Height curve selection
There is a substantial number of mathematical functions (height curves), which can be used to approximate the heightdia meter relationship of trees in forest stands (eg. Padari, 1994;van Laar & Akça, 1997;Sharma, 2009;Mehtätalo et al., 2015). In order to find the best solution for the dominant height estimation, we tested five different models which have been used for modelling the height-diameter relationship for Estonian forests in recent years. Three models (2, 3, 5) were based on Näslund's (1937) curve and two models (4, 6) were based on Nilson's (2002) transformation of the Hossfeld forest growth function (Hossfeld, 1822;Peschel, 1938). In the growth model calculation, we only used one cohort at each sample plot, which is the main trees species of the first storey. Tree-specific coeffi cients of the growth curves are presented in Table 1.
where h -tree height (m); d -tree diameter (cm); a and b -model parameters estimated for each cohort using nonlinear regression (Näslund, 1937).
where h -tree height (m); d -tree diameter (cm); a and b -model parameters estimated for each cohort using nonlinear regression (Näslund, 1937).
where h -tree height (m); d -tree diameter (cm); d g -quadratic mean diameter; H -model parameter (Nilson, 2002;Kiviste et al., 2003) estimated for each height-diameter cohort; c sp -tree-specific coefficient (Nilson, 2002;Kiviste et al., 2003); a sp and b sp -tree-specific coefficients (ForMIS 2020, model 206, estimated by A. Sims from EN-FRP data). The calculations with the set of height curve test data were performed as follows: 1. We joined consecutive tree measurements pairwise, which were made at five-year intervals. We calculated the annual increase in diameter and height (id and ih) for each period. The annual increment was calculated only when the diameter and height of the tree were recorded in the database at both the beginning and the end of the period. Only the main tree species of the first storey was included in the calculations. 2. A mixed effect model with a random intercept (Pinheiro & Bates, 2000) id kj = β 1 + β 2 · d kj + β 3 · rd kj + b k + ε kj was created where id kj denotes diameter annual increment of tree j on plot k. Part β 1 + β 2 · d kj + β 3 · rd kj is the fixed part and part b k + ε kj is the random part of the model. The fixed variables d kj and rd kj denote diameter and relative diameter (ratio of tree diameter and sample plot dominant diameter) of tree j on plot k.
The random effect of plot b k and tree ε kj is assumed to be independent and have normal distribution b k ~ N(0, σ b 2 ) and ε jk ~ N(0, σ 2 ). Values β 1 , β 2 , β 3 , σ b 2 and σ 2 are model parameters estimated with R package lme4 function lmer. For excluding diameter increment outliers, threshold values were determined as 0.025 and 0.975 quantiles of the id model residuals ( Figure 3). A similar approach was applied for excluding height increment outliers. It was done in order to avoid human mistakes during forest measurements and data inputs. 3. After excluding id and ih outliers, plot measurements with at least 16 main species height-diameter measurements were included in the height curve test dataset. The test dataset consisted of 64,281 height-diameter records for 6,581 dominant trees at 1,798 plot measurements.
In order to characterize the suitability of the height growth models (2-6), we calculated estimates of their parameters for each height-diameter cohort using the test dataset and predicted height for each main species dominant tree with all five models. Table 2 presents the calculated statistics for h measured and h predicted , which characterize the systematic and random error of the models. It also shows t-test results and the decrease in the height curve. The specifications of Table 2 are given in Table 3.
We calculated dominant heights h g dom with five different height curve models (2-6) at dominant diameter d g dom for each cohort. In addition, we calculated the arithmetic mean height of the main  species dominant trees (if possible) for each cohort for comparison. Figure 4 shows the difference between the estimated dominant height h g dom and the empirical arithmetic mean height of the 100 thickest trees per hectare. The difference boxplots of all five candidate height curve models are presented in the same figure.
Model 4 (Nilson, 2002) fits our dominant height test data most accurately, but because it contains two parameters and the elevation curve may be descending for certain plots, it was not the best choice. We chose the single-parameter height curve model (6) instead. According to Table 2, model (6) had a slightly larger height prediction error (sde = 1.45 m) compared to model (4), which had an estimation error of 1.34 m. However, model (6) maintained the basic height curve requirement (non-decreasing) for all height-diameter cohorts of the main species in the test dataset.

Mean height and dominant height relationship
We applied the one-parameter height curve function (6) on height-diameter cohorts of the main tree species for all the ENFRP plot measurements. Due to possible outliers in height-diameter measurement data, we estimated function (6) parameter H with a robust approach as the median estimate. Using the model (6) with the median-estimated parameter H, we calculated the mean height h g as the model prediction at the mean square diameter d g and the dominant height h g dom as the model prediction for the dominant desc -despite excluding outliers and a moderate number of observations (n ≥ 16) in each cohort, the height curve was descending (parameter a < 0 for models 2 and 3, parameter b < 0 for model 4) in the case of some cohorts. diameter d g dom of the main tree species of each plot measurement. We studied the relation ship between the dominant height and other stand variab les with the multiple regression method.

Thinning and mortality effects on mean and dominant stand height
A dataset of pairwise consecutive tree measurements (317,588 measurement pairs) were compiled in order to calculate the basal area change due to thinning and mortality. The aggregated thinning and mortality data were merged with other stand (plot) variables (age, mean square diameter, mean and dominant height etc.) into pairwise consecutive plot data. The purpose of the new dataset was to investigate whether thinning and mortality have an effect on the mean and dominant height. Figure 5 shows the share of the thinned basal area in relation to the basal area of the main species between two consecutive plot measurements depending on stand age. We introduced a binary variable "TH", referring to "thinned", when at least 5% of the basal area of the main tree species was cut at 5-year intervals and "NTH" (not thinned) otherwise. There were 364 measurement intervals with thinning above the 5% threshold percentage (thinned) and 1,959 measurement intervals without thinning or below the 5% threshold (not thinned). The effect of thinning on changes in the mean height and dominant height during the measurement interval was studied by comparing means with confidence intervals.
In order to study the effect of mortality on mean and dominant height, we introduced a continuous variable, which expresses the share of the dead tree basal area in relation to the basal area of the main species in two consecutive plot measurements depending on stand age ( Figure 6).

Evaluation of height growth models on mean height and dominant height growth predictions
In order to study the accuracy of mean and dominant height as stand height in forest growth predictions, two models were tested and compared. The Estonian difference model (Kiviste, 1997) was used to calculate the predicted mean height growth. The model is based on the 1984-1993 Estonian State Forests Inventory data (Kiviste, 1997). Mean height growth was predicted using the following equations: where H 2 -estimated mean height growth (m); H 1 -mean height at the beginning of the period (m); A 1 -stand age at the beginning of the period (year); A 2 -age of stand at the end of the period (year); OH -thickness of the organic layer of the soil (cm); Kstand formation (uncultivated, K = 0, or cultivated, K = 1); X, Y, Z -auxiliary variables. As a second model, the Swedish Scots pine growth model (Elfving & Kiviste, 1997) was used to predict the dominant height. The model is based on repeated measurements of the permanent plots measured by the Swedish University of Agriculture and uses dominant height in its calculation. Dominant height growth was predicted using the following equations: where H 2 -dominant height at the end of the period (m); H 1 -dominant height at the beginning of the period (m); A 1 -stand age at the beginning of the period (year); A 2age of stand at end of the period (year); r -auxiliary variable. The results of the mean and dominant height models were compared with the values of the evaluated mean and dominant height in order to validate the accuracy of model estimates. All the calculations were made in R version 3.6.3.

Relationship between dominant height and mean height
The difference between the mean height h g and the dominant height h g dom is presented in Figure 7. Scatterplots only include the main tree species of each ENFRP sample plot. The results show a nonlinear pattern of the relationship between mean and dominant height, which indicates the possible influence of several stand variables (age, mean square diameter, mean and dominant height etc.).
After trying several variable combinations, we developed a regression model (9). The model can be used to calculate the dominant height of the main tree species with stand attributes (mean height, quadratic mean diameter and density). Species-specific coefficients of the model are presented in Table 4.  height curve at dominant diameter d g dom (cm); h g -mean height (m) of the main tree species with height curve d g (cm); d g -mean square diameter (cm) of the main tree species, N -number of upper-layer trees per ha, a 0 , a 1 , a 2 -species-specific coefficients (Table 4).
The residual standard deviation of the regression equation (9) was 0.466 m and the coefficient of determination R 2 = 0.995. All coefficients were highly significant (p-value < 0 .0001).

Effect of thinning on mean height and dominant height
The difference in the thinning effect on the 5-year increment of mean height and dominant height is shown in Figure 8, where the average change in mean height and dominant height with 95% confidence intervals is presented. The dataset containing 5-year measurements was split by thinning ("TH" or "NTH") and by tree species. Confidence intervals for mean height and dominant height were calculated on the same dataset. Figure 8 shows that for commercially important tree species (Scots pine, Norway spruce, silver birch), the change in the average mean height is significantly greater in the case of thinning when compared to undisturbed stand development. Whereas the change in the average dominant height in the case of thinning compared to undisturbed development was not significant. This is evidence of the Estonian forest management practice of applying thinning from below. The other deciduous tree species include species that are cut out at a thinning event, as economically beneficial species (Scots pine, Norway spruce, and Silver birch) are preferred. Figure 9 shows the increase in dominant height between the beginning and the end of a five-year period, however, effect of thinning is not apparent. Figure 10 shows height growth model errors for the mean height model (7) and the dominant height model (8). The figure shows that both models underestimate the height growth on forest permanent plots. If thinning occurs between two measurements then mean height growth model error is higher than dominant height growth model error. However, when thinning do not occur between the measurements then obvious difference between the forest growth model errors is not evident.

Discussion
The dominant height can be calculated in different ways (West, 2009). We calculated dominant height as the model prediction at the dominant diameter of each plot measurement. The advantage of our approach is that the dominant height can be calculated even if there are no dominant trees of the main tree species. The disadvantage of this approach is that we are extending the height curve of the main tree species to all tree species. This might cause systematic errors as the height curve is calculated for main tree species trees and not for all trees.
As an interesting additional result, we developed a regression model that can be used to estimate the dominant height of the main tree species by stand attributes (mean height, quadratic mean diameter and density) with a residual standard deviation of 0.466. The outcome is significant as the prediction precision was more than twice as accurate when compared to the empirical dominant height prediction of model (6), which had a residual standard deviation of 1.45 m. This means that the dominant height could be calculated easily from stand attributes as a new stand variable without additional measurement work. The average change in the mean height and dominant height with a 95% confidence interval. The sample plots were measured at a 5-year interval. Red dots represent thinned sample plots and blue dots represent unthinned sample plots. The difference in the dominant height between the beginning and the end of the fiveyear period. Red dots represent thinned sample plots and blue dots represent unthinned sample plots.
Both height growth models (7, 8) estimated the height growth to be greater when compared to the subsequent stand plot measurement data ( Figure 10). The dominant height model estimated the height growth more accurately when thinning took place between the measurements. The results suggest that the dominant height could be used in Estonia to get more accurate forest growth estimation results as a large portion of Estonian forests are managed. Our study corresponds with other studies (Amaro et al., 2003, Medeiros et al., 2017, del Río et al., 2017, which have also found that thinning has no statistically significant effect on accuracy of forest growth predicting, when dominant height is used as stand height. However, as our results are based on sample plots and not on managed forest stands, further research is advised. Information on the thinning effect is essential for a more accurate forest growth model and provides insight into forest management (West, 2009). From an economic point of view, the main purpose of quantifying the thinning response is to determine the range of stocking density that maximizes the growth potential of timber. It is also essential to find the minimum stock necessary to obtain the largest individual tree size without reducing the volume of the stand (Skovsgaard, 2008).
In addition to artificial thinning, natural thinning occurs and should be considered while interpreting the results. Stand density and site conditions have a significant effect on the number of dead and dying trees (Zhang et al., 2008). Tree mortality is one of the most important factors in understanding forest dynamics (Sims et al., 2014). Research on Estonian forests has found that individual tree death occurs mainly due to growth-dependent reasons (45%), while fungi (23%) and wind damage (16%) play also an essential part in tree mortality . The accuracy of the forest growth model largely depends on the precision of the tree survival prediction . For an accurate forest growth model, it is important to understand all the processes that may affect forest growth, such as regeneration, growth, mortality, forest dynamics, and natural and artificial thinning.

Conclusion
Based on the results of this study, we can conclude that using the mean height instead of the dominant height would help with accuracy of forest growth predicting when thinning takes place between measurements.
If there was no thinning between subsequent measurements, the accuracy of the predictions of the dominant height forest growth model was approximately the same as that of the mean height growth model. Dominant height can be calculated using regression equation from other stand attributes (mean height, quadratic mean diameter and density).