Genetic Diversity of Pedunculate Oak ( Quercus robur L . ) in Clonal Seed Orchards in Croatia , Assessed by Nuclear and Chloroplast Microsatellites

background and Purpose: Natural stands of pedunculate oak in Croatia have been delineated in seed areas, zones and regions. The current bylaw recommends that the transfer of reproductive material remains limited within zones, but that it is permitted within areas. Clonal seed orchards (CSOs) of pedunculate oak were established to increase genetic quality of seed and to acquire a more regular seed yield than in natural stands. In total 150 plus trees were selected within three seed regions. The selection included a number of favourable traits of tree size and stem quality. Three CSOs were planted with grafted plus-trees. We aimed to establish whether these orchards encompass enough genetic diversity to potentially produce genetically improved and sufficiently diverse reproductive material. We also wanted to assess neutral genetic differentiation between these orchards and compare it with the genetic diversity obtained from chloroplast DNA markers, depicting conserved lineages from recolonization routes. We wanted to investigate spatial genetic structure in the area of our research and provide additional information on the transfer of forest reproductive material. Materials and Methods: Leaves were collected from all clones in the CSOs. Total genomic DNA was extracted and clones were analysed with eight nuclear and ten chloroplast microsatellite markers. Spatial autocorrelation analysis was performed with nuclear microsatellite data and original plus trees’ coordinates, for each CSO separately, to determine whether shared favourable traits among the selected plus trees in smaller distances are the results of relatedness, which the sampling strategy tried to avoid. results: We found 28 chloroplast haplotypes belonging to four maternal lineages, and significant differentiation between CSOs, indicating origin from different refuges. Nuclear microsatellites’ diversity in the CSOs is quite high and comparable to diversity found within a recent study of Croatian natural populations. Nuclear microsatellites did not show genetic differentiation between CSOs, i.e. between the seed regions and seed zones they represent. No genetic differentiation was found with nuclear microsatellites among haplotypic lineages. We found no genetic structure within the analysed regions. Conclusions: Lack of differentiation between CSOs found with nuclear microsatellites confirms the permission for transfer of reproductive material between zones within the seed area 1 Lowland Forests. If original differentiation between chloroplast haplotypic lineages was present after recolonization, it was erased by subsequent gene flow. Lack of genetic structure, with nuclear microsatellites within regions indicates successful sampling strategy.


iNtrODUCtiON
Pedunculate oak (Quercus robur L.) is one of the ecologically and economically most important tree species in Croatia.Based on various ecological and management criteria, as well as research results, natural stands of pedunculate oak in Croatia were delineated in seed management units.According to the current bylaw on forest tree species provenances [1], they are divided into three seed areas, four seed zones and eight seed regions.To increase genetic quality of reproductive material and overcome irregular periodicity of masting in natural stands, a breeding programme for pedunculate oak improvement started in the 1980s [2].Plus trees were selected in the seed area 1 -Lowland Forests, within three seed regions 1.1.2,1.2.1, and 1.2.3, which belong to two seed zones, 1.1.and 1.2.(Table 1), based on multiple criteria of tree size and stem quality.Three clonal seed orchards (CSOs) were established with their vegetative copies (grafts).
The knowledge of the species' genetic diversity and the variability of different traits in various environmental conditions becomes especially important in the context of climate change.This particularly refers to the production of well-adapted forest reproductive material.Current research of quantitative traits' genetic variability of pedunculate oak in Croatia has shown that dominant component lies in intrapopulation variability [3][4][5][6][7][8][9].
Similar pattern of genetic variability (high intrapopulation and low interpopulation variability) and weak spatial structure of populations in greater distances can also be expected for nuclear microsatellites research on pedunculate oak, as well as sessile oak, when investigated populations belong to large uninterrupted complexes of these species [10,11].This was confirmed in recent research of natural pedunculate oak populations in Croatia with nuclear microsatellites, since only a few populations were significantly genetically differentiated and they did not belong to a more uniform complex with other populations [9].In case of more fragmented populations, especially under strong anthropogenic influence, intrapopulation differentiation becomes higher, although still not as high as intrapopulation variability [12,13].
Regarding plant species in general, biological factors such as limited gene flow by seed or pollen can lead to genetic differentiation even in continuous populations of species.In such cases, fertilization is not random, but rather conditioned by the distance between individuals [14].Although the majority of other investigated plant species show pronounced genetic structures in short distances, this is rarely found in forest tree species.In these species the structure is weak or non-existent, sometimes described as almost random [15].Interestingly, weak genetic structure is noticed in case of forest tree species with different reproductive systems and history, which can be attributed to their joint characteristics: long lifespan and stronger gene flow compared to other plant species.Selection can also contribute to creating genetic structure patterns and enhance or diminish genetic differentiation.Selection caused by diversity of microecologic conditions in the stand can trigger the creation of spatial patterns, while alternatively, in case of inbreeding depression, selection can enhance the dying off of inbred individuals and lead to weakening of genetic structure [16].
Regarding pedunculate oak, there is pronounced asymmetry of gene flow by seed and pollen.Pollen is spread by wind, while seeds are large and heavy and can be carried to larger distances only by animals, humans or along river flows.Research shows that in pedunculate oak gene migration by pollen is approximately two hundred times stronger then migration by seed.This difference is manifested in weaker differentiation of populations when analysed with biparently inherited nuclear markers, while at the same time differentiation by uniparentaly inherited cytoplasmic (in case of broadleaves chloroplast) markers can be substantial.Presumably, limited gene flow by seed primarily causes spatial patterns, but they are subsequently erased by constant and strong gene flow by pollen [15,17,18].
Research on correlation of chloroplast markers' diversity with nuclear molecular markers and quantitative traits showed complete incoherence of chloroplast lineages with quantitative traits, and weak concordance with allozyme frequencies [19].It led to the following conclusions: (1) During the last glacial deciduous oaks were probably limited to three main refuges, which led to genetic differentiation of both nuclear and chloroplast genomes; (2) At the end of the glacial oaks migrated to the north, at which point spatial pattern of chloroplast diversity was established, which has remained highly conserved until today; (3) As oaks progressed in colonization towards central and northern Europe, gene flow by pollen reinstated communication between populations originating from western, or eastern refuges, which led to gradual loss of original differentiation in parts of nuclear genomes.During that process, differentiation in chloroplast genomes remained approximately the same; (4) Local selection pressures influencing the newly established populations gradually led to differentiation of parts of nuclear genomes whose new patterns were completely different from those immediately following the recolonization.During that whole period, chloroplast diversity remains conserved since it is not connected to quantitative traits.All that remains is a weak connection to nuclear markers that are not under strong influence of selection.
The aim of our research is the evaluation of two types of genetic diversities within and among three CSOs, established with grafts of plus trees selected within three legally delineated seed regions, belonging to two seed zones.The first type is genetic diversity of nuclear microsatellite markers, presumably neutral, highly polymorphic, biparently inherited markers that point to gene flow by both seed and pollen.The second type is diversity of maternally inherited chloroplast microsatellite markers, pointing to gene flow only by seed.
Genetic diversity analysis of clones in CSOs will demonstrate whether the selected parental populations, i.e.CSOs, consist of enough diverse and unrelated genotypes to potentially produce genetically diverse and improved seed for filling the natural stands and aiding their regeneration.Genotyping by nuclear microsatellites also lays ground for parentage analysis of crops produced in the CSOs, to determine the true efficacy of the CSOs in producing genetically diverse progeny.Although nuclear microsatellite markers in this study are selectively neutral, high genetic diversity of CSOs for these markers would also point to potentially wider adaptation potential, which is especially important in the context of climate and habitat change [20].
Considering the known positions of original plus trees in the stands, one of the aims is to analyse spatial distribution of nuclear microsatellite genetic diversity within each region.We want to investigate whether there is geographically influenced genetic structure at certain classes of distance, detected by a kinship degree of the selected trees within a region.This way we want to eliminate the possibility that multiple favourable traits shared among geographically close plus trees result from their kinship, which the sampling strategy of 50 m distance between the selected trees tried to avoid [2].
By using chloroplast microsatellites, which reflect the genetic diversity established by postglacial recolonization routes, we will analyse the diversity of haplotypes within and among regions.The results will demonstrate lesser or greater homogeneity of individual regions and possible anthropogenic influence in haplotype diversity of certain regions.
By comparing the analysis with nuclear and chloroplast markers we will try to establish whether the original differentiation by recolonization routes remained detectable in the analysed loci of the nuclear genome.

Study Area
The leaves for DNA analysis were collected in three CSOs.Detailed information on CSOs, their position and the list of forest districts where plus trees were selected is shown in Table 1, while the position of the CSOs is shown in Figure 1.

DNA extraction and Microsatellite Analysis
In spring we collected young leaves from one ramet of all clones in the three CSOs.The leaves were kept at -60 o C until DNA extraction.We extracted total genomic DNA with GenElute Plant Genomic DNA Miniprep Kit (Sigma®).
Detailed description of the primers, multiplexes and PCR protocols can be found in Deguilloux et al. [23]..The 5' primers were labelled with different fluorescent dyes (WellRED Oligos, Proligo).All PCRs were carried out by PTC-100 thermal cycler (MJ Research®).Fragment separation was performed by capillary electrophoresis on CEQTM 8000, Genetic Analysis System (Beckman & Coulter), and alleles were scored with CEQTM 8000 Fragment Analysis Software.

A) Nuclear Microsatellites (nSSrs) a1) Genetic Diversity and Differentiation
The total number of alleles per locus (Na), the observed HO and HE, expected heterozygosity and Polymorphism Information Content (PIC; [26]) for each microsatellite locus, as well as the average number of alleles (Navg), HO and HE in each region were calculated using PowerMarker v3.25 software [27].The allelic richness (Nar) as the measure of the number of alleles per locus independent of sample size was calculated by FSTAT v.2.9.3.2 programme package [28], while the number of private alleles (Npr) per region was assessed by MICROSAT [29].The estimates of Nar, HO and HE among regions were compared using the Kruskal-Wallis test in SAS Release 8.02 (SAS Institute, 2004).
To establish whether the analysed regions belong to Hardy-Weinberg equilibrium we used Wright's F statistics [30].If there are deviations, the possible reasons are inbreeding within the regions or genetic differentiation between the regions.Entry data are allele frequencies of analysed markers in sampled regions.Potential heterozygote deficiencies, as a result of inbreeding, is described by fixation indices; F IS , on individual level, and F ST , on population (in this case regional) level.
To calculate F IS [31] for each locus in each region and to test congruence between the observed and expected genotypic frequencies under Hardy-Weinberg equilibrium we used GENEPOP v. 3.4 [32].All probability tests are based on Markov chain method [33,34], using 10000 steps of repeated memory, 100 data groups and 5000 repetitions per group.To correct it for multiple testing we used sequential Bonferroni adjustments [35,36] implemented in the SAS Release 8.02 programme (SAS Institute, 2004).
Genetic differentiation among regions was estimated by FST using Weir and Cockerham method [31] as implemented in FSTAT.The significance of FST estimates was assessed by randomization.We calculated Nei's standard genetic distances [37] between pairs of regions with GENDIST programme implemented in PHYLIP ver 3.6b software [38].

a2) Analysis of Molecular Variance (AMOVA)
The proportion-of-shared-alleles distance [39] between pairs of individuals was calculated using MICROSAT and the distance matrix was subjected to the analysis of molecular variance (AMOVA; [40]) using ARLEQUIN version 2.000 [41].AMOVA was used to partition the total microsatellite diversity into among-region and within-region components.The variance components were statistically tested by non-parametric randomisation tests using 10,000 permutations.

a3) Spatial Autocorrelation Analysis
Spatial autocorrelation analysis demonstrates relationships between specific measures of autocorrelation in different geographical distance classes, by means of correlograms, i.e. distograms.It is also possible to test significance of autocorrelation measure for each individual distance class.Since we analysed nuclear microsatellite markers, we used kinship coefficient for co-dominant markers by Loiselle et al. [42], described in detail in SPAGeDi 1.2g programme manual [43].
When plus trees were selected in the stands, their positions were mapped on sub-compartment management maps.Based on those sketches, we transferred their position on digitised management maps in ArcGis 9.2.(ESRI, 2007) programme and retrieved Gauss-Krüger coordinates for spatial analysis.
We performed spatial autocorrelation analysis with SPAGeDi 1.2g programme.Geographical distance between individuals for each region separately was calculated based on Gauss-Krüger coordinates of plus trees.Pairs of plus trees were classified into ten distance classes, containing approximately the same number of trees in each class.Based on nuclear marker analysis we calculated a matrix of kinship coefficient for co-dominant markers.The significance level of average kinship coefficients of individual distance classes was determined by 1000 permutations.Genetic distances between haplotypes were calculated in MICROSAT [29] using absolute distance (D AD ) metric based on the sum of the number of repeat differences between cpSSR haplotypes (described in detail in Deguilloux et al. [23]).
Cluster analysis based on distance matrix was performed using the unweighted pair-group method (UPGMA) a s implemented in NEIGHBOR programme of the PHYLIP ver.3.6b software [38].The reliability of the UPGMA topology was assessed via bootstrapping [44] of over 1000 replicates generated by MICROSAT [29] and subsequently used in NEIGHBOR and CONSENSE programmes in PHYLIP.

b1.2) Median -Joining Network of Haplotypes
A Median-Joining (MJ) network [45] was constructed based on cpSSR haplotypes using the Network 4.5.1.6. programme [46].This method is used in discovering very complex patterns of haplotypes.It uses parsimony criteria for finding median vectors, i.e. consensus sequences or mutually close marker sequences that present a biological equivalent of possible non-sampled or extinct ancestral haplotypes.

b2) Genetic Diversity and Differentiation
Genetic diversity was estimated in each population and in the metapopulation as the number of haplotypes (n h ), the number of haplotypes per the number of individuals (n h /n), the number of private haplotypes (n ph ; i.e. haplotypes found in a single population), the effective number of haplotypes (n E ), haplotype richness (n hr ), haplotype diversity (H E ), and the frequency of the most common haplotype (f h ).[47] Haplotype richness (n hr ) as the measure of the number of haplotypes per population independent of sample size was calculated by rarefaction method [47] as implemented in Contrib [48].
An unbiased estimate of the haplotype diversity (H E ) was calculated according to Nei [49].
Genetic differentiation was analysed by three different methods described in Pons and Petit [50,51], using the Permut programme [52]: a) Unbiased estimation of G ST [49] according to Pons and Petit [50], considering only haplotype frequencies (unordered haplotypes).Unbiased estimation of G ST involved the calculation of the diversity of each region (h k ), average within-region diversity (h s ), and total diverstiy (h t ) b) The estimation of N ST [51] for which the genetic similarities between haplotypes (proportion of shared fragments) were taken into consideration (ordered haplotypes).The eof N ST involved the calculation of the diversity of each region (v k ) by taking into account similarities between the haplotypes, average withinregion diversity (v s ), and total diversity (v t ).c) The estimation of R ST [53] for which the genetic distances at individual SSR loci were taken into consideration according to a microsatellite stepwise mutation model (SMM).The estimation of R ST involved calculations similar to those used for the estimation of N ST except that the distance between two haplotypes was the sum (across all loci) of squared difference in the number of repeats [53].This method was specifically designed for microsatellites by assuming the stepwise mutation model (SMM), while the N ST method makes fewer assumptions about the nature and complexity of mutational patterns detected by fragment length analysis [51].Detailed calculations of all methods can be found in Deguilloux et al. [23].
A comparison of differentiation and diversity measures for ordered (N ST and R ST ) vs. unordered (G ST ) haplotypes was carried out according to Pons and Petit [51].Significance was tested on the basis of 1,000 random permutations using the Permut programme [52].

b3) Analysis of Molecular Variance (AMOVA)
The partition of cpSSR diversity within and among regions was studied using the Analysis of Molecular Variance (AMOVA, [40]) as implemented in ARLEQUIN [41], while the significance of AMOVA was tested based on 10,000 permutations of haplotypes among populations.Again, two genetic distance measures between pairs of haplotypes were used based on (1) the number of different alleles (F ST ), as well as on (2) the sum of squared size differences (R ST ).

AMOVA of Haplotypic Lineages, with nSSrs
Based on the results of cluster analysis the haplotypes were grouped into lineages.
The partition of nSSR diversity within and among lineages (based on cpSSR) was tested using AMOVA in ARLEQUIN [41].The variance components were tested statistically by non-parametric randomisation tests using 10,000 permutations.

A) Nuclear Microsatellites a1) Genetic Diversity and Differentiation
Parameters of genetic diversity of analysed nuclear microsatellite loci are shown in Table 2.The values of PIC were in the range of 0.280 to 0.915, with an average of 0.739.Five out of eight markers had PIC values higher than the average, which makes them highly efficient in assessing genetic diversity of these regions.Markers ssrQrZAG96, ssrQrZAG108 and ssrQrZAG112 had lower levels of PIC, of which ssrQrZAG96 was least polymorphic.
The greatest number of alleles was found in markers ssrQrZAG112, ssrQrZAG31 and ssrQrZAG87, and the lowest in ssrQrZAG96, with an average number of 20.25 across all loci.
Expected heterozygosity (H E ) ranged from low 0.288 in ssrQrZAG96 to 0.915 in ssrQrZAG31, with an average of 0.750 across all loci.Observed heterozygosity (H O ) in most loci ranged between the expected values, with an exception of locus ssrQrZAG31, and minor deviations in loci ssrQrZAG108 and ssrQrZAG87.In loci ssrQpZAG9 and ssrQpZAG36 H O was greater than H E .
F ST values showed significant differentiation between regions only for loci ssrQrZAG96 and ssrQrZAG108.On average, there was no significant differentiation between regions.
The values of inbreeding coefficient F IS for regions and loci are shown in Table 3. F IS values were significant for the following markers within regions: ssrQrZAG31 values point to significant deviations from Hardy-Weinberg equilibrium (HWE) in all three regions.Values for ssrQrZAG87 and ssrQrZAG112 are also significant for region represented by CSO Kosovac, while ssrQrZAG108 is only significant for CSO Plešćice.
Parameters of genetic diversity of the analysed regions (CSOs) are shown in Table 4.The regions are quiet uniform across all parameters, with most private alleles belonging to CSO Petkovac, and least to CSO Kosovac.Allelic richness (N ar ), corrected for different sized samples with rarefaction index, does not show significant differentiation between regions, nor do H O and H E .
Genetic distance parameters are shown in Table 5. Nei's standard genetic distances between regions are almost equal, and differentiation between regions is not significant.

a2) Analysis of Molecular Variance (AMOVA)
The results of AMOVA between regions are shown in Table 6.There is no significant differentiation between regions, with 99.96% of diversity lying at the within-region level, meaning that the majority of diversity is caused by differences between individual trees of all three regions.

a3) Spatial Autocorrelation Analysis
Spatial autocorrelation analysis was conducted separately for three regions, based on the analysis with nSSRs and Gauss-Krüger coordinates of original plus trees.The results are shown in Figure 2. Geographical distances were divided in ten classes in which pairs of trees were distributed, according to their mutual geographical distance.Average kinship coefficients for co-dominant markers between all pairs in each class [42] were calculated.Figure 2 shows upper and lower boundaries of 95% confidence intervals.Those intervals represent the boundaries above or bellow which, for a certain geographical distance, pairs of individuals are more or less related then pairs of randomly chosen individuals.The Figure shows that in the studied regions, based on our sample of selected plus trees, we found no geographical genetic structure.There was no class of geographic distance detected, containing pairs of individuals significantly more or less related then randomly chosen pairs of individuals.

b) Chloroplast Microsatellites (cpSSrs) b1) Distribution of Haplotypes
Table 7 shows the definition of cpSSR haplotypes found in three regions, based on nine cpSSR markers.Marker µkk3 was monomorphic and was therefore excluded from further analysis.The numbers represent allele size in base pairs (bp).Table 7 also shows numbers of individuals within regions, belonging to different haplotypes.In haplotypes H07, H14, H15, H17 and H18 a specific long allele for µdt4 was found.The authors in [23,54,55] did not mention encountering such an allele in their research, but it was definitely found in the inventory of Austrian oak populations [56].When this allele is included in statistical analysis in its full size, it strongly influences calculations, especially those based on SMM models.In Provan et al. [57], the authors  also encountered an unusually long allele, sequenced it and determined that it was due to an indel of a sequence close to the microsatellite, so they treated this entire indel as a single mutation step.We suspect that something similar occurred in the sequence of µdt4 marker, so following the example from Provan et al. [57] we also corrected the size of this allele, taking the difference of 18bp between allele 148 and 166 as a single mutation step in all further statistical analysis.

b2) Lineages b2.1) Cluster Analysis (UPGMA)
Figure 3 shows UPGMA dendrogram of cpSSR haplotypes based on absolute distances (D AD ).Based on bootstrap values it is visible that maternal lineages L3 and L4 create separate branches with strong statistical support, and that lineages L1 and L2 are more related.
The distribution of individuals across the lineages for each region is shown in Table 8.All but three individuals from CSO Petkovac belong to L1, while clones from CSO Kosovac are divided between lineages L1 and L2.Most clones from CSO Plešćice belong to the separate lineage L3, with some clones belonging to L1 and L2.For each CSO there is at least one individual belonging to L4, with the unusually long allele for µdt4.Spatial distribution of individual plus trees marked with the colour of lineage they belong to is shown in Figure 4. Trees belonging to different lineages are not randomly distributed in space, but rather grouped by lineages.For example, all plus trees from CSO Plešćice, belonging to lineages L1 and L2 were selected in Forest District "Garešnica", Management Unit "Međuvođe -Ilovski lug".L2 Trees included in CSO Kosovac are distributed evenly across the entire region's selection area; in the eastern part together with L4 tress, in the middle part with L1 trees, while L1 trees dominate over the western part.Almost all trees selected for CSO Petkovac belong to L1 lineage, with only individual L2 and L4 trees and a single tree with H12 haplotype, which does not belong to any lineages.In two CSOs where there is more than one L4 tree, they were also localized to a smaller area.

b2.2) Median-Joining Network of Haplotypes
Figure 5 shows median-joining network of haplotypes.The sizes of the circles that mark haplotypes are proportional to the numbers of individuals belonging to that haplotype, and the colours in the circles mark the regions (CSOs) and are described in the legend.The size of the circle sector in a certain colour is proportional to the number of individuals of the corresponding region belonging to that haplotype.Lines connecting the haplotypes mark the number of mutation steps leading to the next haplotype.The colours of the lines mark cpSSR markers in which the change of size (bp) in that mutation step occurred, leading to the next haplotype.The colours are also described in the legend.In case when in a certain cpSSR the change in size bigger then 1bp occurred between two haplotypes, a number above the line marks the difference in bp.Haplotypes belonging to lineages defined by UPGMA algorithm are framed by blue dotted lines for comparison, and marked by numbers 1-4.also shows median vectors (mv 1-4), representing biological equivalents of possible non-sampled or extinct ancestral haplotypes, mostly found between groupings of sampled haplotypes.

b3) Genetic Diversity and Differentiation
Parameters of genetic diversity in the studied regions for cpSSRs are shown in Table 9. Greatest haplotype diversity presented as the number of haplotypes per the number of individuals (n h /n) is observed in CSO Plešćice (0.432), followed by CSO Kosovac (0.270), while CSO Petkovac was most uniform with 0.180 haplotypes per individual.Similar situation is found in other parameters: CSO Plešćice has the greatest number of private haplotypes (n ph ), double then other CSOs, as well as the highest effective number of haplotypes (n E ), haplotype richness (n hr ) and unbiased estimate of haplotype diversity (H E ).Frequency of the most frequent haplotype was almost equal for CSOs Plešćice and Kosovac and somewhat higher for CSO Petkovac.In the metapopulation, the most frequent haplotype was H01, the overall n h /n was 0.226 and n E was 7.719.
Parameters of average within-region diversity (h S, v S ), total diversity (h T, v T ) and genetic differentiation (G ST , N ST and R ST ) are shown in Table 10, as well as their comparison, considering three different methods for calculating genetic differentiation: fiGUre 5. Median-joining haplotype network.H01-H28; haplotypes, mv1-mv4; median vectors.(SMM) (taking into consideration genetic distances in individual cpSSR loci) -R ST .There were no significant differences between methods (A) and (B) and (A) and (C) for the calculation of parameters of average within-region diversity (h S and v S ) and total diversity (h T and v T ).However, there are significant differences between N ST and G ST , as well as between R ST and G ST , which points to the existence of significant genetic differentiation between these regions, for cpSSR loci.

a2) Analysis of Molecular Variance (AMOVA)
The results of AMOVA based on cpSSRs are shown in Table 11.Two methods were used for variance components partition: (1) The number of different alleles (F ST ) and ( 2) the sum of squared size differences (R ST ).
In both versions, the between-region component of diversity is significant, confirming that the regions are differentiated regarding cpSSR loci.Values obtained by both methods are similar and highly significant.

AMOVA of Haplotypic Lineages, with nSSrs
The results of AMOVA for partition of nSSR diversity in components within and between UPGMA haplotypic lineages are shown in Table 12.
In the within-lineage component lies 99.29% of diversity.The between-lineage component is not statistically significant.

Analysis with Nuclear Microsatellites (nSSrs) Genetic Diversity and Differentiation
Table 13 provides a comparison of H O and H E values of nSSR loci from this study with values from various studies of Quercus robur natural populations.
The values of genetic diversity parameters for most commonly used markers ssrQpZAG9 and ssrQpZAG36 are similar to those from other studies in Croatia [9] and all around Europe [15,[58][59][60][61][62][63].Marker ssrQrZAG87 was also in concordance with other studies, with slightly bigger difference between H O and H E than in Lepais et al. [64], but lesser than in Neophytou et al. [58].Marker ssrQpZAG110 was lower than in other studies a nd ssrQrZAG112 was higher, but in both markers in this study, H O and H E were   2).This was especially pronounced in this study, although the values were quiet low in other studies as well.This locus also showed significant differentiation between regions in this study (Table 2).The specificity of this marker is that it differentiates between pedunculate and sessile oak based on H O and H E on the population level.These values are systematically low for pedunculate oak and high for sessile oak populations from the same area [59].Marker ssrQrZAG112 shows similar patterns, but vice versa: high for pedunculate and low for sessile oak.In this study ssrQrZAG112 values were very high, but same as the average in other Croatian populations [9].Since plus trees were all selected in lowland areas, in a typical pedunculate oak habitat, with not much possibility for hybridization with sessile oak, very low ssrQrZAG96 and very high ssrQrZAG112 values are not surprising.Markers ssrQrZAG31 and ssrQrZAG108 show significantly lower H O then H E in this study.Similar result, although not so pronounced, was shown by ssrQrZAG31 in the study of twelve populations from Portugal, Spain, Ireland, Austria, France, Italy, Slovenia and Serbia [59].In the same study, ssrQrZAG108 had even bigger differences between H O and H E then in this study.F ST value for this locus also demonstrates significant differentiation between the regions, although in lesser extent then in a similar study of Bosnian populations [65].In that research all the loci showed significant differentiation between populations (F ST ) and inbreeding coefficients (F IS ), which the authors explained by intensive anthropogenic influence and populations' fragmentation throughout history, but they also presumed that the pattern of diversity was partially influenced by different ecological conditions and remnants of historical migration processes.In our study, only two loci are significantly differentiated between regions, while F ST values for other loci, as well as the overall F ST, are not significant (Table 2).Lack of differentiation usually points to effects of balancing evolutionary factors, like gene flow and balancing selection (similar selection pressures acting in all studied populations).It can also result from common descent, in absence of divergent evolutionary factors like random drift or divergent selection (adaptive differentiation due to different selection pressures acting in the studied populations).
According to Greogorius et al. [66], generally speaking it is expected that the effect of gene flow, as a balancing factor, is manifested in a greater number of loci, as compared to balancing selection, which usually affects only a few loci.The effect of common descent is also mainly visible in a smaller number of loci, because it is not likely for large parts of the genome to stay intacted through a great number of generations.In this study, regions were analysed with eight nSSR loci, most of which demonstrate lack of differentiation (Table 2).Regions were not significantly differentiated for allelic richness, nor for H E and H O .The AMOVA betweenregion component was very small and not significant (Table 6), Nei's distances between regions were almost equal, and F ST values between pairs of regions were not significant (Table 5).Taking all that into consideration, we can conclude that between these regions gene flow is the prevailing balancing factor, especially because in this whole area, the distribution of pedunculate oak is continuous enough to provide a corridor for gene migration.This is confirmed by recent research of 17 Croatian natural oak populations, with 10 nSSRs [9], where also no differentiation between natural populations from this area was found.Studies of quantitative traits in open pollinated progeny trials from these CSOs show a similar pattern: no significant differentiation between regions [4][5][6]8].
Inbreeding coefficients (F IS ) show significant deviations from HWE for two loci in CSO Plešćice, three loci for CSO Kosovac and one locus in CSO Petkovac (Table 3).It led to overall significant deviations from HWE in all three regions.F IS values for AG31 [59] are very high for all three CSOs, corresponding to great differences between H O and H E in this, but also in other studies [59].High F IS values point to inbreeding only in case when they are significant in most loci, which was the case in Bosnian populations [65].Otherwise, high F IS at only certain loci can point to the selection for these loci, selection for genes linked to these loci or the occurrence of null alleles.We used the MICRO-CHECKER programme [67] to check for genotyping mistakes and it showed a possibility that high F IS values for ssrQrZAG31, ssrQrZAG87 and ssrQrZAG108 were caused by null alleles, especially in ssrQrZAG31, where estimated values for null allele frequencies rise up to 35% in all three regions.Therefore, in this locus obviously null alleles were the main reason for significant F IS values.Estimated frequencies for locus ssrQrZAG87 and ssrQrZAG108 are much smaller, but there is reason to suspect that null alleles influenced F IS values for these loci as well.
Another possible reason for significant F IS values only on individual loci is, as mentioned above, the selection on those loci or the genes to which the locus is linked.Locus ssrQrZAG87 is situated in linkage group LG2 of Quercus robur map, in one of the genome regions coding for bud burst.Considering its position, it might be possible that high F IS value for this locus in CSO Kosovac comes from the uniformity of phenological forms in that CSO (almost all clones are early leafing [68][69][70]), as opposed to other two CSOs with a much wider range of leafing forms.However, in this study it is much more likely that null alleles caused those significant F IS values.In other studies, significant F IS values were also often accompanied by significant frequencies of null alleles [15,58].Additionally in this study, most F IS values are positive, i.e. there is a slight, although for most loci not significant, homozygote surplus.Considering that for each CSO plus trees were selected on the level of seed region, i.e. a wider range that covers more populations (stands within forest districts and management units), joint analysis of all the trees from a CSO as one population (called region in this study) can lead to the apparent surplus of homozygotes, caused by subpopulation structure, or the socalled Wahlund effect.Overall genetic diversity for nSSR loci encompassed in these three CSOs is similar to the diversity of natural pedunculate oak populations of this area.

Spatial Structure of nSSr Genetic Diversity
Genetic diversity spatial patterns in plants are mostly determined by their reproduction system and life form.
Weakest geographical structure, assessed with biparentlyinherited markers, is found in forest tree species, which are mostly outcrossing and have a long lifespan [71].Considering that oaks belong to that group, studies with nSSRs in oaks have confirmed weak spatial structure of their diversity, usually confined to 30-50 m distances between trees [15,61,[71][72][73].Weak spatial structure in oaks is probably the result of two opposing forces (not considering selection, as an additional factor forming genetic diversity).First, limited natural seed-flow (by gravity and animals) results in groups of related individuals.On the other hand, strong pollen-flow has a reverse balancing effect and disrupts the creation of related groups, resulting from limited seed-flow.Computer simulations [73] confirmed this scenario.In another study [15], it is further confirmed that weak geographic structure in oaks also results from weak survival of inbred individuals, due to inbreeding depression [15,60].Considering all that, the results of spatial analysis are as expected, especially since one of the selection criteria was the distance of minimum 50 m between the selected trees.Nonetheless, since there is significant pollen-flow in pedunculate oak, and since the trees were not sampled randomly, but selected based on multiple shared favourable traits, we wanted to confirm that similar phenotypes of trees from closer distance classes are not the result of their relatedness.The results confirmed that pairs of individuals within any distance class are not significantly more related that pairs of randomly sampled individuals (Figure 2).

Chloroplast Microsatellites (cpSSr)
In angiosperms, chloroplast genomes are inherited maternally and represent a useful tool for analysis and differentiation of populations considering postglacial recolonization routes.
A comprehensive research of chloroplast diversity of main European oak species from 2613 populations was conducted with PCR-RFLP analysis.Thirty-two cpRFLP variants were found grouped in six main maternal lineages.Their distribution, along with some results of palynological studies, were used to reconstruct recolonization routes from refuges after the last glacial [17,18].Some Croatian populations were also included in the study by Bordács et al. [74], and analysed in detail by Slade et al. [75].In the area of Croatia, Bosnia and Herzegovina, Montenegro, Albania and Kosovo they found seven cpRFLP haplotypes, four belonging to Balkan refuge lineage, two to Apennine, and one to Pyrenees or Southern Alpes lineage.In general, total diversity, as well as within-population chloroplast diversity of the Balkan area, is one of the highest in Europe [18].
Chloroplast markers can be of great use in tracking geographical origin of individuals and populations and therefore useful for certification of seed sources, control of wood and timber market, forensics etc., especially for species of great economic value like pedunculate oak [76,77].Since cpRFLP analysis is complicated, laborious, demands high quality DNA and sometimes has replicability issues, there was a need for a simpler method.The cpSSR markers were developed and comparatively tested with cpRFLP markers in the same populations [23,55].It has been found that the results are abutting, if not more precise with cpSSRs, with resolution increasing with added cpSSR loci.
Total genetic diversity (h T ) of these three regions, assessed with nine cpSSR markers in 124 individuals is very high (h T =0.95), the same as in the analysis of overall Croatian natural populations [9] (340 individuals from 17 populations).The h T of French populations with 10 cpSSRs was 0.65, but based only on 30 individuals, targeted to include certain cpRFLP haplotypes.In a study of Grivet et al. [54] on 367 individuals from 92 populations from France and the Iberian Peninsula, with six cpSSR markers, eleven haplotypes were found.If we had used only these six markers, we would have gotten 22 haplotypes for our three regions and 124 individuals.This points to great cpSSR diversity of our studied area.Since Croatia is situated on the crossroad of different recolonization routes [75], relatively close to some main refuges (Balkan and Apennine peninsula), it is not surprising to find great and well-preserved haplotypic diversity.The reason for differences between AMOVA partition of diversity between and within regions in this study (within-region diversity greater then between-region) and the study of overall Croatian populations [9] is in the sampling.We had many plus trees sampled in three rather close regions, compared to a smaller number of trees in more separate populations in that study, so it is expected that between population the component in Morić [9] will be higher.Nevertheless, both in this study and in Morić [9], differentiation is highly significant.
Considering a great number of analysed individuals per region and nine analysed cpSSRs, in this study resolution is high, so 28 haplotypes were grouped by UPGMA in four maternal lineages.Comparing our results with populations from study of Slade et al. [75], belonging to the same seed regions (population Zdenački Gaj for CSO Plešćice, Slatina and Đurđenovac for CSO Kosovac and Vrbanja and Gunja for CSO Petkovac) we presume that more closely related cpSSR lineages L1 and L2 belong to cpRFLP lineage 5, which is not clearly delineated and originates either from Balkan or Apennine refuge.In both studies (this study and study of Slade et al. [75]) most diverse region is the one were trees were selected for CSO Plešćice, were lineages from all three refuges come in contact.The region for CSO Kosovac was somewhat poorer in haplotypes, while CSO Petkovac region was the most uniform.Lineages L3 and L4 are strongly differentiated both from L1 and L2 and between each other, so it is quite possible that they originate from other refuges and belong to cpRFLP lineages 2 (Apennine refuge) or 7 (Pyrenees or Southern Alpes refuge).Lineage L3 is mostly found in northwestern part of Croatia (CSO Plešćice) where, according to Slade et al. [75] both cpRFLP 2 and cpRFLP 7 lineage is found, with 7 being a bit more frequent.
In spite of the correction made in this study for the long allele (see Material and Methods), haplotypes of L4 lineage were strongly grouped together, pointing to their different origin from L1 and L2 lineages, as well as L3.
The cpSSR haplotype distribution can be formed not only by recolonization routes, but also by reproductive material transfer.For example, in the CSO Plešćice region in management unit "Međuvođe -Ilovski lug", we found haplotypes of L1 and L2 lineages, common in a more eastern forest complex.Also in forest district "Slatina" (CSO Kosovac) almost all selected trees belong to L1, although this forest district is situated further west from the dominant complex of this lineage.Moreover, in the same region to the east, which is closer to the L1 dominant complex, L2 individuals are prevailing.While it is possible that such distribution is caused by recolonization route admixture or single longdistance dispersal events [78], we cannot exclude the possibility of seed transfer from eastern to further western regions, especially because L1 individuals are mostly found within a complex known for pedunculate oak stem quality, for example in Spačva Basin.Reproductive material from that area was exported to other states for its quality.For example, Slavonian pedunculate oak was intensively planted in Germany at the end of the 19th century and entire stands of that origin are still present in that area, recognizable by stem quality and late flushing.Therefore, it is quite possible that reproductive material from that area was transferred also within Croatia.Based on cpRLFP analysis across Europe, anthropogenic influence on the distribution of genetic diversity of pedunculate oak is stronger than on any other oak species because of its economic value [17,18].

AMOVA of Haplotypic Lineages, with nSSrs
In an extensive overview of studies of different genomes [79], the authors compared diversities of chloroplast, mitochondrial and nuclear markers in different plant species and found that in most cases differentiation found with cytoplasmic markers does not correspond to the one found with nuclear markers.Also, in oak species, gene flow by pollen is approximately 200 times greater than by seed.This difference is manifested in weak differentiation of populations by nuclear, biparentally inherited markers, as compared to maternally inherited cytoplasmic markers.It is presumed that primarily nuclear diversity corresponded to cytoplasmic markers, but was later erased by strong gene flow by pollen and rearranged by selection pressures.
According to Kremer et al. [19], parallel analysis with nuclear and cytoplasmic markers eventually shows weak concordance when diversity of nuclear markers is not strongly influenced by selection.With this in mind, we wanted to compare cpSSR diversity with presumably neutral nSSR loci.The assessment of genetic differentiation is hugely influenced by the choice of markers and by sampling [79].In our study, considering the small number of sampled regions and many sampled trees per region, differentiation for nSSR was virtually non-existing.As mentioned before, there is a corridor for gene flow between these regions, which obviously led to balancing genetic diversity of nSSRs in the entire studied area.Considering weaker possibilities of seed transfer between regions compared to pollen, there was significant differentiation for cpSSRs.AMOVA for partitioning nSSR diversity between UPGMA chloroplast lineages showed that only about 1% is explained by between-lineage components and it was not significant.Obviously, strong gene flow by pollen erased recolonization patterns of nuclear diversity.

CONCLUSiONS
Molecular analysis with nuclear and chloroplast microsatellite markers of plus trees selected in three regions (upper Posavina, lower Posavina and Pokuplje, and central Podravina) within two seed zones (first two regions -seed zone Posavina, central Croatia and Pokuplje, third regionseed zone Podravina and Podunavlje) and included in three clonal seed orchards of pedunculate oak led to the following findings: 1. Analysis with eight nSSRs shows homogeneity of presumably neutral genetic diversity in the entire studied area, with no genetic differentiation between regions.The regions are not differentiated by genetic parameters, and AMOVA between-region component was very small and not significant.The results point to the existence of strong gene flow between regions, as prevailing balancing factor in this area.2. Some nSSR loci (AG31, ssrQrZAG108 and AG87) displayed deviations from Hardy-Weinberg equilibrium in some or all CSOs.The most probable cause of increased homozygosity in these loci are null-alleles.Additional reason for positive F IS values for these and other loci is probably the sampling method.The analysis was conducted on the level of seed regions, which are themselves composed of more populations, which probably resulted in substructure of regions, i.e.Wahlund effect.

Spatial autocorrelation analysis, with nSSRs and
Gauss-Krüger coordinates of original plus trees showed that for no class of mutual distance between the selected plus trees within regions pairs of trees were more related then randomly selected pairs of trees.This points to good sampling strategy and successful avoidance of selecting related genotypes.4. Molecular analysis with nine polymorphic cpSSRs resulted in 28 haplotypes, belonging to four maternal lineages, by UPGMA algorithm.Lineages L1 and L2 are more related and lineages L3 and L4 are each grouped separately, with greater support.With these markers, significant differentiation was found between regions.Greatest haplotypic diversity was found in CSO Plešćice, smaller in CSO Kosovac and the smallest in CSO Petkovac.Probable cause is the distribution and admixture of recolonization lines from different refuges, which is more pronounced in western parts of Croatia, and possible reproductive material transfer.Analysis with these markers provides good basis for tracking the origins of reproductive material.These findings lead to the following conclusions and recommendations.
The results of this study with nSSRs show no differentiation between regions in this area and are in concordance with previous studies on quantitative traits [3][4][5][6][7][8][9] and a recent study of overall Croatian populations with nSSRs [9].CpSSR analysis showed geographic structure of haplotypes, but it is connected to recolonization routes and according to another research [19] has no connection with adaptively important traits.Moreover, in this research, smaller differentiation for cpSSRs was observed between the regions belonging to other seed zones (CSO Kosovac and CSO Petkovac) than between those belonging to the same zone (CSO Plešćice and CSO Petkovac).Therefore, the change implemented in current Croatian bylaw on forest tree species provenances (NN 147/11, 96/12, 115/14 and 114/15), by which the transfer of forest reproductive material of pedunculate oak is permitted within the entire seed area, and not just within the seed zone, is justified.This is especially true in the light of climate change, when needs for changes in policy of forest reproductive material transfer are discussed on much larger scales, on the level of whole Europe [80].
We conclude that genetic diversity levels encompassed in these CSOs are comparable to those in natural populations [9] and provide a good basis for the production of quality and genetically diverse progeny, if other conditions for optimal reproduction are achieved.For now, the clones in these CSOs still have not reached their full size and the clones are not equally contributing to the crops [25].Genotyping with nSSRs also provides the foundation for molecular analysis of the progeny, which would provide a very precise estimation of genetic diversity produced in the CSOs in the given year, and level of contamination with pollen from outside sources.The results from such analysis could then be compared to simpler and much cheaper estimations, based on flowering and fruiting, to find the most appropriate method for assessing genetic diversity of crops.Along with flowering and fruiting observations, it is necessary to regularly observe climatic and other environmental conditions in the CSOs, so that the observed trends could be correlated to various factors and possibly explained.The clones in some of these orchards are grouped by phenology, which creates substructures and prevents random reproduction [69,70].According to our results, phenologically compatible clones from all three orchards could be combined in a separate orchard, in order to achieve maximum synchrony and enhance genetic diversity of the crops.

fiGUre 1 .
fiGUre 1. Position of Quercus robur clonal seed orchards in Croatia

2 .
fiGUre 2. Charts of spatial autocorrelation analysis, separately for three CSOs, with 10 distance classes per CSO and eight nSSR loci.Dotted lines represent 95% confidence intervals.

Figure 5 fiGUre 3 . 4 .
fiGUre 3. UPGMA dendrogram of cpSSR haplotypes (cpSSR1S) based on absolute distance (DAD).Boostrap support values of 1,000 replicates higher than 50% are given above the branches.Putative haplotype lineages (L1 to L4) are indicated on the right.

tAbLe 2 .
Allelic diversity of the microsatellite loci scored in three Quercus robur CSOs.
a , the total number of alleles; H O , observed heterozygosity; H E , expected heterozygosity; PIC, Polymorphic Information Content #*** Significant at P < 0.001; ** Significant at P < 0.01; * Significant at P < 0.05; not indicated -non-significant value

tAbLe 3 .
Inbreeding coefficients f across eight microsatellite loci in three CSOs of Quercus robur.
n, sample size; N avg , average number of alleles per locus; N ar , allelic richness; N pr , number of private alleles; H O , observed heterozygosity; H E , expected heterozygosity; p(KW), probabilities of the Kruskal-Wallis tests among CSOs

tAbLe 4 .
Genetic variation within Quercus robur CSOs at eight microsatellite loci

tAbLe 6 .
AMOVA analysis for the partitioning of microsatellite diversity among and within CSOs.

tAbLe 7 .
Definition of cpSSR haplotypes and their distribution in the CSOs, based on nine cpSSR markers.The numbers represent allele lengths in base pairs.

tAbLe 8 .
The distribution of individuals from CSOs per haplotypic lineages.
n -number of individuals; n h -number of haplotypes; n h /n -number of haplotypes per number of individuals; n ph -number of private haplotypes; n E -effective number of haplotypes; n hr -haplotype richness; H E -unbiased haplotype diversity; f h -frequency of the most common haplotype (H03 in Plešćice, H01 in Kosovac, H02 in Petkovac, H01 in the metapopulation)

tAbLe 11 .
AMOVA analysis for the partitioning of cpSSR diversity (A) based on the number of different alleles (FST), and (B) based on squared size differences (RST).
a Test of the hypothesis that v S , v T , and N ST differ from h S , h T , and G ST , respectively b Test of the hypothesis that v S , v T , and R ST differ from h S , h T , and G ST , respectively ns not significant (P>0.05),*significant at 0.01<P<0.05),**significant at P<0.01, *** significant at P<0.001 tAbLe 10.Levels of diversity and differentiation based on (A) unordered alleles, (B) ordered alleles, and (C) stepwise SSR differences.nsnot significant (P>0.05),* significant at 0.01<P<0.05),** significant at P<0.01, *** significant at P<0.001

tAbLe 12 .
AMOVA analysis for the partitioning of nuclear SSR diversity within and among putative cpSSR haplotype lineages.

tAbLe 13 .
Comparison of genetic diversity on nSSR loci between three CSOs from this study and natural populations of Quercus robur from other studies.almost equal.Marker ssrQrZAG96 had low values for both heterozygosities and PIC (Table