Ciencia e investigación agraria

On-line version ISSN 0718-1620

Cienc. Inv. Agr. vol.37 no.1 Santiago Apr. 2010 

Cien. Inv. Agr. 37(1):151-160. 2010



Analysis of genetic diversity in Argentinian heterotic maize populations using molecular markers

Análisis de diversidad genética de poblaciones heteróticas de maíz argentino utilizando marcadores moleculares


Marcelo Morales1, Viviana Decker1, and Leonardo Ornella1,2

1EEA Pergamino, Instituto Nacional de Tecnología Agropecuaria, CC31 (2700), Pergamino, Argentina.
2Centro Internacional Franco-Argentino de Ciencias de la Información y de Sistemas (CIFASIS), 27 de febrero 210 bis (S2000EZP) Rosario, Argentina.


Abstract

Over the past three decades, traditional Argentinean Orange Flint maize cultivars have been replaced by higher-yielding U.S. Yellow Dent germplasm. However, flint cultivars are a potential source of resistance to biotic and/or abiotic stress. Thus, knowledge of the genetic diversity and relationships among flint inbred lines would help reduce genetic vulnerability and broaden the genetic base of the crop in national improvement programs. In this study, we report the analysis of 25 Orange Flint inbred lines and one dent inbred line using 21 microsatellite markers, or Simple Sequence Repeats (SSR). The aim was to assess genetic diversity among these accessions and to evaluate the usefulness of SSR markers for defining heterotic groups in temperate germplasm. Genetic diversity values for the flint germplasm (25 inbreds) were relatively high: the mean number of alleles per locus was 5.14 and the expected heterozygosity (He) was 0.68. When testing for genic differentiation among the four heterotic populations established by topcross, 12 of the 21 loci displayed significant P-values. Although we did not observe significant agreement between the groupings based on topcross and the clustering based on molecular data, Bayesian grouping (STRUCTURE software) performed better than clustering based on genetic distance (UPGMA on the modified Rogers' distance).

Key words: Cluster analysis, microsatellite, Zea mays.


Resumen

Desde las tres últimas décadas, las variedades tradicionales argentinas de maíz Cristalino Colorado han sido reemplazadas por germoplasma más competitivo de origen norteamericano. Sin embargo, los cultivares flint son una fuente potencial de resistencia a estrés biótico y abiótico. En consecuencia, el conocimiento de la diversidad genética y relación entre las líneas ayudaría a reducir la vulnerabilidad genética y aumentar la base genética del cereal en los programas de mejoramiento nacionales. En este trabajo se reporta el análisis de 25 líneas de germoplasma Cristalino Colorado y 1 línea de maíz dentada utilizando 21 marcadores microsatélite o SSR (Simple Sequence Repeats). El objetivo fue evaluar la diversidad genética entre dichas entradas y la utilidad de los marcadores SSR para definir grupos heteróticos en germoplasma de clima templado. La población de 25 líneas de maíz Cristalino Colorado presentó valores relativamente altos de diversidad genética: Número de alelos/locus = 5,14 y He = 0,68. El test de diferenciación génica, aplicado sobre las cuatro poblaciones heteróticas establecidas por topcross, reveló 12 loci, de un total de 21, con valores de P significativos. Aunque no se observó un acuerdo importante entre los agrupamientos basados en información molecular y los grupos heteróticos establecidos por topcross, el agrupamiento bayesiano (programa STRUCTURE) presentó un mejor comportamiento respecto al agrupamiento basado en distancia genética (UPGMA-Modified Rogers' Distance).

Palabras clave: Análisis de conglomerados, microsatélite, Zea mays.



Introduction

The strategies used in maize (Zea mays L.) breeding programs are frequently characterized by a decrease in the genetic diversity of the germplasm pool and an increase in the genetic uniformity of cereal production (Lee, 1998). This might cause important problems, particularly sensitivity to new diseases and/or decreased tolerance to high temperatures or drought (Duvick, 1989).

Argentina ranks fifth among maize-producing countries and second among maize-exporting countries. A strategy frequently used in Argentinean improvement programs is to take advantage of the hybrid vigor of crosses between the national Cristalino Colorado material and the U.S. yellow dent material (Eyhérabide et al., 2006). Dent hybrids, developed in and/or introduced to Argentina, mainly follow the Reid Yellow Dent (RYD) vs. Lancaster Sure Crop (LSC) pattern; they perform better with respect to grain yield, especially in favorable environments, and are appreciated for their dry milling quality. On the other hand, although orange flint hybrids have lower yields than flint × dent and dent × dent crosses, they are appreciated for the hardness of their endosperm (Robutti et al., 2000), their biological value (Eyhérabide et al., 2006) and their resistance to local diseases such as Mal de Río Cuarto (Morata et al., 2003). Recent research in the United States has also shown that Argentinean germplasm presents resistance to Gibberella and Fusarium ear rots (Presello et al., 2004) and has lower aflatoxin concentrations than flint hybrids (Ochs, 2005). Consequently, knowing the constitution of the Cristalino Colorado germplasm and understanding the relationships between the lines would help to reduce genetic vulnerability and broaden the genetic base of national programs, and would allow new lines to be assigned to previously determined heterotic patterns (Hallauer and Miranda, 1988).

Variations in the DNA sequence have been used as molecular markers in plants and animals during the last two decades (Korzun, 2003). Moreover, they have been used as a tool to determine new heterotic groups and/or to assign new materials to pre-existing heterotic groups (Melchinger, 1999; Reif et al., 2003). It has been reported that microsatellites, or Simple Sequence Repeats (SSR), offer the advantages of reproducibility, discrimination and a low cost/benefit ratio with respect to other markers (Pejic et al., 1998; Smith et al., 1997). They have consequently been proposed for the characterization of genetic resources (Pejic et al., 1998; Smith et al., 1997). The objectives of the present study were to determine the levels of genetic diversity and the relationships between lines of the Cristalino Colorado germplasm, and to evaluate the usefulness of microsatellites for defining heterotic groups in temperate germplasm.

Materials and methods

This research involved 26 lines selected out of 48 that had previously been arranged in four heterotic groups by testcross with four synthetic populations (Nestares et al., 1999; Eyhérabide et al., 2006). The four synthetic populations used as testers were sB73 and sMo17, from the Reid × Lancaster pattern, and HP3 and P5L2, from the local flint pattern (Nestares et al., 1999). All lines evaluated in this work, except B73 (Iowa Stiff Stalk Synthetic), were developed by the Argentinean INTA (Instituto Nacional de Tecnología Agropecuaria) from different origins of Cristalino Colorado maize, mainly local races (Table 1). The selection of lines was based on seed availability and on the degree to which the four groups represent the entire population. For each line, DNA from young, fresh leaves was extracted in bulk from five plants by the CTAB method reported by Hoisington et al. (1994). Although the lines used in this study are homozygous, several plants were used in each extraction to guard against a possible contaminating seed. DNA quality and quantity were verified by electrophoresis in 0.8% agarose gels. Each extraction was quantified by comparison with uncut lambda phage DNA samples of known amount, using ethidium bromide fluorescence.

The primer sequences used for PCR amplification were selected from the MaizeGDB database. The microsatellite loci were chosen based on the size of the repeats and their location, to obtain a representative sampling of the whole genome (Table 2). The PCR reaction was carried out in a final volume of 11 µl containing 10-20 ng of template DNA, 0.1 mM dNTPs, 0.25 mM of primers (forward and reverse), 0.75 mM MgCl2, 0.025 U Taq DNA polymerase, and 1X reaction buffer (10 mM Tris-HCl pH 8.8, 50 mM KCl, and 1.5 mM MgCl2). Negative controls without DNA template in the reaction mixture were included in each PCR run. Amplifications were carried out in a PTC-100 MJ thermocycler (MJ Research, Watertown, MA) under the following conditions: an initial denaturation cycle at 94°C for 2 min; five touchdown cycles of 60 s at 94°C, 60 s at 65°C (decreasing 1°C per cycle) and 2 min at 72°C; 30 conventional cycles of 60 s at 94°C, 60 s at 60°C and 2 min at 72°C; and, finally, an elongation cycle at 72°C for 5 min. The amplification products were resolved by electrophoresis in denaturing gels (6 M urea) of 6% (w/v) acrylamide/bis-acrylamide solution (29:1) and detected by silver nitrate staining (Silver Sequence, Promega Biotech, Madison, WI). The bands obtained were evaluated by visual inspection; a 25 bp DNA ladder (Life Technologies-Gibco BRL) was used as a molecular weight marker. For a given primer pair, products of different sizes were considered different alleles. The information obtained was coded in a worksheet for further analyses.
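The cycling conditions described above can be laid out step by step. The following is a minimal Python sketch, assuming that "decreasing 1°C per cycle" means the first touchdown cycle anneals at 65°C and the fifth at 61°C, and ignoring ramp times between holds; the function name and its parameters are illustrative only and are not part of the original protocol.

```python
def touchdown_program(start_anneal=65, touchdown_cycles=5, step=1,
                      final_anneal=60, conventional_cycles=30):
    """Return the thermal program as a list of (label, temperature_C, seconds) holds."""
    steps = [("initial denaturation", 94, 120)]
    # Touchdown phase: annealing temperature drops by `step` degrees each cycle (assumption).
    for cycle in range(touchdown_cycles):
        anneal = start_anneal - step * cycle
        steps += [("denature", 94, 60), ("anneal", anneal, 60), ("extend", 72, 120)]
    # Conventional phase: fixed annealing temperature.
    for _ in range(conventional_cycles):
        steps += [("denature", 94, 60), ("anneal", final_anneal, 60), ("extend", 72, 120)]
    steps.append(("final extension", 72, 300))
    return steps


program = touchdown_program()
anneal_temps = [temp for label, temp, _ in program if label == "anneal"]
# Programmed hold time only; ramping between temperatures is not counted.
total_minutes = sum(seconds for _, _, seconds in program) / 60
```

Under these assumptions the program contains 35 annealing steps, and the programmed holds add up to a little under two and a half hours.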

The number of alleles and the genetic diversity (expected heterozygosity) were estimated at each locus for the set of 25 orange flint lines; line B73 was not included in this analysis. The expected heterozygosity (He), sometimes known as PIC or polymorphic information content (Smith et al., 1997), was estimated according to Nei (1978):

$$H_e = 1 - \sum_{i} p_i^2$$

where p_i is the frequency of the i-th allele. The He value is defined as the probability that two alleles chosen at random within the same sample are different, and it reflects the marker's discriminatory power because it takes into account not only the number of alleles but also their relative frequencies (Kostova et al., 2006). The analysis was implemented in PowerMarker v3.25 (Liu and Muse, 2005). The level of genic differentiation among the four heterotic populations previously determined by the topcross method (Eyhérabide et al., 2006) was estimated with the program GENEPOP v.3.4, using the default parameters and under the null hypothesis "the allelic distribution is identical across all populations" (Raymond and Rousset, 2004). This program provides an unbiased P-value for each locus using an exact test (Raymond and Rousset, 2004).
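As an illustration of the diversity measure, the sketch below computes the number of alleles and He for a single locus, taking He as one minus the sum of squared allele frequencies (the probability that two randomly drawn alleles differ). It is a plain Python illustration, not the PowerMarker implementation, and it omits any small-sample correction; the allele sizes are hypothetical and assume one allele call per homozygous inbred line.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Return 1 - sum(p_i^2) for one locus, given one allele call per inbred line."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical allele sizes (bp) scored at one SSR locus in eight inbred lines.
locus_calls = [120, 120, 124, 124, 124, 130, 130, 136]

n_alleles = len(set(locus_calls))
he = expected_heterozygosity(locus_calls)
```

A locus with many alleles at similar frequencies gives a value close to 1, whereas a locus dominated by a single allele gives a value close to 0, which is why He is more informative than the allele count alone.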

Cluster analysis was performed for the whole group of 26 characterized lines using the Unweighted Pair Group Method with Arithmetic Averages (UPGMA). The cluster analysis was based on the modified Rogers' distance (Reif et al., 2005):

$$MRD = \frac{1}{\sqrt{2m}} \sqrt{\sum_{j=1}^{m} \sum_{i=1}^{a_j} \left(p_{ij} - q_{ij}\right)^2}$$

where p_ij and q_ij are the frequencies of the i-th allele at the j-th locus in the two lines considered, a_j indicates the number of alleles for the j-th marker and m indicates the total number of analyzed loci. The calculations of genetic distance and the cluster analysis were made with the TFPGA software v1.3 (Miller, 1997). The cluster analysis was also carried out with the hclus and stats packages of the R environment (http://www.r–), to estimate the cophenetic correlation (the correlation between the distances implied by the tree and the initial distance values) and to identify the potential heterotic groups.
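The distance between two lines can be sketched as follows, assuming the usual form of the modified Rogers' distance: the squared differences in allele frequency are summed over all alleles and loci, divided by 2m, and the square root is taken. This is an illustrative Python version rather than the TFPGA or R code used in the study; the helper `inbred` and the genotypes are hypothetical.

```python
import math


def modified_rogers_distance(line_a, line_b):
    """Each line is a list with one dict per locus mapping allele -> frequency."""
    if len(line_a) != len(line_b):
        raise ValueError("both lines must be scored at the same loci")
    m = len(line_a)
    total = 0.0
    for freqs_a, freqs_b in zip(line_a, line_b):
        # An allele absent from one line has frequency 0 in that line.
        for allele in set(freqs_a) | set(freqs_b):
            diff = freqs_a.get(allele, 0.0) - freqs_b.get(allele, 0.0)
            total += diff * diff
    return math.sqrt(total / (2 * m))


def inbred(genotype):
    """Hypothetical helper: a homozygous line carries one allele per locus at frequency 1."""
    return [{allele: 1.0} for allele in genotype]


# Two hypothetical inbred lines scored at four SSR loci (allele sizes in bp).
line_1 = inbred([120, 88, 201, 150])
line_2 = inbred([120, 90, 205, 152])
mrd = modified_rogers_distance(line_1, line_2)
```

For fully homozygous lines each locus with different alleles contributes 2 to the sum, so the distance reduces to the square root of the proportion of loci at which the two lines differ, and it ranges from 0 to 1.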

The program STRUCTURE was used as a second approach to determine the possible heterotic groups from molecular data (Pritchard et al., 2000). STRUCTURE uses a Bayesian algorithm to infer the membership of individuals, maize lines in this case, in the different populations. The number of populations (K) was set to 4 in advance. The main parameters of the program (and number of replications) were both set to 1,000,000. A script in the R language was implemented to determine the best level of agreement between the clusters based on the molecular data obtained in this work and the clusters based on the topcrosses made by Nestares et al. (1999). The script allowed us to compare the number of individuals coinciding in the four groups determined from the molecular information and in the four groups determined by the topcross method. The degree of association (coincidence) was estimated by Cohen's Kappa coefficient, provided in the psy package (R project).
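Because cluster labels produced from molecular data are arbitrary, they have to be matched to the topcross groups before agreement can be scored. The sketch below shows one way to do this, assuming the match is chosen by trying every relabelling and keeping the one with the highest kappa. It is a Python stand-in, not the authors' R script or the psy package, and the group assignments are hypothetical.

```python
from itertools import permutations


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for agreement expected by chance."""
    n = len(labels_a)
    categories = sorted(set(labels_a) | set(labels_b))
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    expected = sum((labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both classifications use one shared category")
    return (observed - expected) / (1.0 - expected)


def best_kappa(reference, clusters):
    """Map cluster ids onto reference groups in every possible way; keep the best kappa.

    Assumes there are no more distinct cluster ids than reference groups.
    """
    ref_groups = sorted(set(reference))
    ids = sorted(set(clusters))
    best = None
    for perm in permutations(ref_groups, len(ids)):
        mapping = dict(zip(ids, perm))
        kappa = cohen_kappa(reference, [mapping[c] for c in clusters])
        if best is None or kappa > best[0]:
            best = (kappa, mapping)
    return best


# Hypothetical example: eight lines, four topcross groups, four molecular clusters.
topcross_groups = ["A", "A", "B", "B", "C", "C", "D", "D"]
molecular_clusters = [2, 2, 1, 1, 3, 4, 4, 4]
kappa, mapping = best_kappa(topcross_groups, molecular_clusters)
```

With K = 4 on both sides only 24 relabellings need to be checked, so an exhaustive search is cheap; for larger K an assignment algorithm would be a more practical choice.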

Results and discussion

The 21 polymorphic SSR markers used to estimate the genetic diversity of the population of 25 lines of Argentinean Cristalino Colorado maize detected a total of 108 alleles. The number of alleles per locus varied from 2 to 14, with a mean of 5.14; the 108 alleles were sufficient to completely discriminate the 25 lines (Figure 1). The values of genetic diversity for each locus varied from 0.36 to 0.90, with a mean of 0.68 (Table 2).
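The reported mean number of alleles per locus follows directly from the two totals given above, as this short check shows (the symbol for the mean is introduced here for illustration only):

```latex
\bar{A} = \frac{108 \text{ alleles}}{21 \text{ loci}} \approx 5.14 \text{ alleles per locus}
```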

These results are similar to those obtained in previous studies of maize. For example, Kostova et al. (2006) analyzed 41 Bulgarian lines with 18 microsatellites and obtained a mean of 9.1 allelic variants; Pejic et al. (1998) observed a mean of 6.8 alleles per locus in 33 American lines characterized with 27 SSR; and Bantte and Prasanna (2003), characterizing 23 tropical lines with 36 SSR, determined a mean of 3.25 alleles per locus.

The average value of He obtained in this work was also in agreement with the values obtained in the works mentioned: Kostova et al. (2006) found a mean He of 0.71, Pejic et al. (1998) reported a value of 0.72, and Bantte and Prasanna (2003) a value of 0.54. He gives an idea of the information available from the SSR loci and their potential to detect differences between lines based on their genetic relationship. The differences among these studies may be attributed mainly to differences in sample size and in the genetic base of the populations analyzed. We also considered the fact that microsatellites with dinucleotide repeats show a higher number of allelic variants; however, a heterozygosity value of 0.67 was obtained when the microsatellites with this repeat length (phi001, phi026, nc013, phi119 and phi068) were excluded from the analysis. The number of alleles as well as the diversity values confirm the wide genetic base of the population analyzed in this work (Table 1) (Eyhérabide et al., 2006).

Finally, the level of genic differentiation among the four heterotic populations previously determined by the topcross method was evaluated; 12 loci showed statistically significant values (p < 0.05) (Table 2).

The identification of heterotic groups is essential in modern maize breeding programs, as it allows selection of only those crosses expressing the maximum heterosis potential, which permits a more efficient use of germplasm (Hallauer and Miranda, 1988). The most widely used methods for establishing heterotic patterns are topcross tests (de Azevedo Duarte et al., 2003; Nestares et al., 1999) and diallel analysis, which is not implemented very often because of the high number of crosses required (Pinto et al., 2001). It has been stated that microsatellite markers might complement or replace topcross tests in establishing new heterotic patterns. According to Reif et al. (2003), if a program has generated a large number of lines and the heterotic patterns have not yet been determined, genetically divergent germplasm may be identified by molecular markers. Based on this information, field tests may be planned more efficiently and economically.

UPGMA clustering was applied to the modified Rogers' distance (MRD) based on the microsatellite data. MRD values between the lines varied between 0.52 and 0.96 (with a mean of 0.79), while the cophenetic correlation of the tree was 0.65. In general, the clustering coincided with the germplasm origin (i.e., related lines grouped together, Figure 1); the rect.hclust function (stats package of the R project) allowed us to determine four possible heterotic groups in the dendrogram (Figure 1).
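The cophenetic correlation quoted above measures how faithfully the tree reproduces the input distances. The following is a minimal Python sketch of UPGMA and of that correlation, written as an illustration and not as the R hclust or TFPGA code used in the study; the four-line distance matrix is hypothetical.

```python
import math
from itertools import combinations


def upgma_cophenetic(dist):
    """dist maps (i, j) with i < j to a distance. Returns cophenetic distances per pair."""
    items = sorted({k for pair in dist for k in pair})
    clusters = {i: (i,) for i in items}               # active cluster id -> members
    d = {frozenset(p): v for p, v in dist.items()}    # distances between clusters
    coph = {}
    next_id = max(items) + 1
    while len(clusters) > 1:
        # Merge the closest pair of active clusters.
        a, b = min(combinations(sorted(clusters), 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))]
        for x in clusters[a]:
            for y in clusters[b]:
                coph[(min(x, y), max(x, y))] = height
        na, nb = len(clusters[a]), len(clusters[b])
        for c in clusters:
            if c not in (a, b):
                # Size-weighted mean equals the arithmetic mean over all member pairs.
                d[frozenset((c, next_id))] = (
                    na * d[frozenset((c, a))] + nb * d[frozenset((c, b))]
                ) / (na + nb)
        merged = clusters[a] + clusters[b]
        del clusters[a], clusters[b]
        clusters[next_id] = merged
        next_id += 1
    return coph


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical distances among four lines (indices 0..3).
distances = {(0, 1): 0.2, (0, 2): 0.6, (0, 3): 0.8,
             (1, 2): 0.7, (1, 3): 0.9, (2, 3): 0.3}
coph = upgma_cophenetic(distances)
pairs = sorted(distances)
cophenetic_r = pearson([distances[p] for p in pairs], [coph[p] for p in pairs])
```

In this toy example lines 0 and 1 join first, then lines 2 and 3, and the two pairs merge last, so every between-pair distance is replaced by a single merge height. That flattening is what pulls the cophenetic correlation below 1.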

The program STRUCTURE was used as a second alternative to classify the lines according to the molecular data. As stated by Pritchard et al. (2000), this program has advantages over methods based on genetic distance, mainly because the parameters corresponding to each group are inferred jointly with the degree of membership of each individual in the groups. The groups determined from the molecular information (Figure 1 and Table 3) were compared with the four groups determined previously by topcross (Table 1) using a program implemented in the R language (http://www.r–) and Cohen's Kappa coefficient. Cohen's Kappa coefficient allows us to determine the degree of agreement between two methods or evaluators, taking into account the agreement expected by chance alone (Cohen, 1960). In general, most reports use clustering methods based on genetic distance (Reif et al., 2005). However, in this work the clustering obtained by STRUCTURE showed a better degree of agreement with the topcross-based grouping than the UPGMA-MRD clustering did (κ = 0.33 vs. κ = 0.16). We can attribute this outcome to: i) the low value of the cophenetic correlation (0.65), which indicates the degree of fit between the distances observed in the tree and the matrix of genetic distances, and/or ii) the better performance of STRUCTURE per se (Pritchard et al., 2000).

According to quantitative genetics, hybrid vigor is partly attributed to loci in the heterozygous condition (Falconer and Mackay, 1996). Consequently, the alleles whose frequencies differ significantly between two diverging heterotic groups are the best candidates for involvement in the heterotic response. Therefore, a second cluster analysis was performed based on genetic distance, this time using only those loci selected in the test of genic differentiation (Table 2). The 12 loci were sufficient to discriminate among the 26 genotypes, and although in this case the cophenetic correlation was 0.66, the clustering was less consistent with the lines' origins (Figure 2). The program STRUCTURE (Pritchard et al., 2000) was also used to infer the members of the four possible heterotic populations based on the information from the 12 loci selected by the genic differentiation test (Table 4). With the 12 loci, neither clustering method showed a significant improvement in the level of agreement with the groups determined by topcross.

The assumption used to establish heterotic groups based on molecular marker data is that the loci analyzed contribute in a similar fashion to heterosis, so that lines clustered together present a similar heterotic behavior independently of the cross evaluated (Reif et al., 2005). However, it has been reported that, in genetic mapping experiments with hybrid progeny across testers and generations, QTL detected with one tester were not necessarily detected with the other testers (Austin et al., 2000; Mihaljevic et al., 2005). We suspect that this could be the main cause of the low level of agreement between the clusters based on molecular data and the heterotic groups based on topcross tests. Consequently, not only must the markers associated with heterosis be selected for clustering, but more refined clustering algorithms that take this situation into account must also be designed.

In conclusion, the relatively high genetic diversity values (i.e., number of alleles per locus and expected heterozygosity) confirm the wide genetic base of the material of origin. Of the 21 loci analyzed, 12 showed significant P-values in the test of genic differentiation among the four heterotic populations previously determined by Nestares et al. (1999). Although the model-based Bayesian clustering (program STRUCTURE) performed better than traditional methods based on genetic distance (UPGMA on the modified Rogers' distance), in general no significant agreement was observed between the clusters based on molecular data and those based on the topcross method. The results obtained, along with the published reports, show the need to design more refined clustering algorithms, so that molecular marker information may replace field tests for determining heterotic groups.


Acknowledgements

To Esteban Serra (FCByF-UNR), for his permission to use his laboratory; to Graciela Nestares (FCA-UNR), for the field data; to E. Tapia (FCEIyA-UNR), for her technical counselling and for allowing us to use her facilities; to Adelaida Fernandez and the group of Maize Improvement - EEA Pergamino, for their technical support; to the Agencia Nacional de Promoción Científica y Tecnológica (Argentina), for their financial support.


References

Austin, D.F., M. Lee, L.R. Veldboom, and A.R. Hallauer. 2000. Genetic mapping in maize with hybrid progeny across testers and generations: grain yield and grain moisture. Crop Science 40:30-39.

Bantte, K., and B.M. Prasanna. 2003. Simple sequence repeat polymorphism in Quality Protein Maize (QPM) lines. Euphytica 129:337-344.

Cohen, J. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20:37-46.

De Azevedo Duarte, I., J.M. Ferreira, and C.N. Nuss. 2003. Screening potential of three maize top cross testers. Pesquisa Agropecuária Brasileira 38:365-372.

Duvick, D. 1989. Possible genetic causes of increased variability in U.S. maize yields. Pages 147-156. In: J.R. Anderson and P.B.R. Hazell (eds.). Variability in Grain Yields: Implications for Agricultural Research and Policy in Developing Countries. Johns Hopkins Univ. Press, Baltimore, MD.

Eyhérabide, G., G. Nestares, and M. Hourquescos. 2006. Development of a heterotic pattern in orange flint maize. Pages 352-379. In: K. Lamkey and M. Lee (eds.). Plant Breeding: The Arnel R. Hallauer International Symposium. Blackwell Publishing.

Falconer, D.S., and T.F.C. Mackay. 1996. Introduction to Quantitative Genetics. 4th ed. Longmans Green, Harlow, Essex, UK. 480 pp.

Hallauer, A., and J. Miranda. 1988. Quantitative Genetics in Maize Breeding. 2nd ed. Iowa State University Press, Ames, USA. 468 pp.

Hoisington, D.A., M. Khairallah, and D. Gonzalez de León. 1994. Laboratory Protocols: CIMMYT Applied Molecular Genetics Laboratory. 2nd ed. Mexico, D.F., Mexico.

Korzun, V. 2003. Molecular markers and their application in cereals breeding. In: Marker Assisted Selection: A fast track to increase genetic gain in plant and animal breeding? FAO Conference 10.

Kostova, A., E. Todorovska, N. Christov, V. Sevov, and A. Atanassov. 2006. Molecular characterization of Bulgarian maize germplasm collection via SSR markers. Biotechnology & Biotechnological Equipment 20:29-36.

Lee, M. 1998. Genome projects and gene pools: New germplasm for plant breeding? Proceedings of the National Academy of Sciences 95:2001-2004.

Liu, K., and S.V. Muse. 2005. PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics 21:2128-2129.

Melchinger, A. 1999. Genetic diversity and heterosis. Pages 99-118. In: J. Coors, S. Pandey, and J.T. Gerdes (eds.). The Genetics and Exploitation of Heterosis in Crops. American Society of Agronomy, Madison, WI.

Mihaljevic, R., C.C. Schon, H.F. Utz, and A.E. Melchinger. 2005. Correlations and QTL correspondence between line per se and testcross performance for agronomic traits in four populations of European maize. Crop Science 45:114-122.

Miller, M. 1997. Tools for population genetic analyses (TFPGA) 1.30: A Windows program for the analysis of allozyme and molecular population genetic data. Computer software distributed by the author. (Accessed: February, 2009).

Morata, M., D.A. Presello, M.D.P. Gonzalez, and E. Frutos. 2003. Aptitud combinatoria entre líneas de maíz resistentes a mal de río cuarto. Fitopatologia Brasileira 28:236-244.

Nei, M. 1978. Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics 89:583-590.

Nestares, G., E. Frutos, and G. Eyhérabide. 1999. Evaluación de líneas de maíz flint colorado por aptitud combinatoria. Pesquisa Agropecuária Brasileira 34:1399-1406.

Ochs, B. 2005. Evaluation of Argentine maize hybrids and exotic temperate testcrosses environments. Texas A and M., TAMU College Station. 174 pp.

Pejic, I., P. Ajmone-Marsan, M. Morgante, V. Kozumplick, P. Castiglioni, G. Taramino, and M. Motto. 1998. Comparative analysis of genetic similarity among maize inbred lines detected by RFLPs, RAPDs, SSRs, and AFLPs. Theoretical and Applied Genetics 97:1248-1255.

Pinto, R.M.C., A.A.F. Garcia, and C.L. Souza Jr. 2001. Alocação de linhagens de milho derivadas das populações BR-105 e BR-106 em grupos heteróticos. Scientia Agricola 58:541-548.

Presello, D., M. Reid, and D. Mather. 2004. Resistance of Argentine maize germplasm to Gibberella and Fusarium ear rots. Maydica 49:73-81.

Pritchard, J.K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945-959.

Raymond, M., and F. Rousset. 2004. Genepop on the web. (Accessed: March, 2009).

Reif, J.C., A.E. Melchinger, X.C. Xia, M.L. Warburton, D.A. Hoisington, S.K. Vasal, D. Beck, M. Bohn, and M. Frisch. 2003. Use of SSRs for establishing heterotic groups in subtropical maize. Theoretical and Applied Genetics 107:947-957.

Reif, J.C., A.E. Melchinger, and M. Frisch. 2005. Genetical and mathematical properties of similarity and dissimilarity coefficients applied in plant breeding and seed bank management. Crop Science 45:1-7.

Robutti, J., F. Borras, M. Ferrer, M. Percibaldi, and C.A. Knutson. 2000. Evaluation of quality factors in Argentine maize races. Cereal Chemistry 77:24-26.

Smith, J.S.C., E.C.L. Chin, H. Shu, O.S. Smith, S.J. Wall, M.L. Senior, S.E. Mitchell, S. Kresovich, and J. Ziegle. 1997. An evaluation of the utility of SSR loci as molecular markers in maize (Zea mays L.): comparisons with data from RFLPs and pedigree. Theoretical and Applied Genetics 95:163-173.

Received: 30 May 2009. Accepted 13 July 2009.


Creative Commons License All the contents of this journal, except where otherwise noted, is licensed under a Creative Commons Attribution License