Biological Research
Print version ISSN 0716-9760
Biol. Res. vol.46 no.2 Santiago 2013
http://dx.doi.org/10.4067/S0716-97602013000200001
REVIEW ARTICLE
Foundational errors in the Neutral and Nearly-Neutral theories of evolution in relation to the Synthetic Theory. Is a new evolutionary paradigm necessary?
Carlos Y Valenzuela
Programa de Genética Humana, ICBM, Facultad de Medicina, Universidad de Chile, Independencia 1027, Santiago, Chile.
ABSTRACT
The Neutral Theory of Evolution (NTE) proposes mutation and random genetic drift as the most important evolutionary factors. The most conspicuous feature of evolution is the genomic stability during paleontological eras and lack of variation among taxa; 98% or more of nucleotide sites are monomorphic within a species. NTE explains this homology by random fixation of neutral bases and negative selection (purifying selection) that does not contribute either to evolution or polymorphisms. Purifying selection is insufficient to account for this evolutionary feature and the Nearly-Neutral Theory of Evolution (N-NTE) included negative selection with coefficients as low as mutation rate. These NTE and N-NTE propositions are thermodynamically (tendency to random distributions, second law), biotically (recurrent mutation), logically and mathematically (resilient equilibria instead of fixation by drift) untenable. Recurrent forward and backward mutation and random fluctuations of base frequencies alone in a site make life organization and fixations impossible. Drift is not a directional evolutionary factor, but a directional tendency of matter-energy processes (second law) which threatens the biotic organization. Drift cannot drive evolution. In a site, the mutation rates among bases and selection coefficients determine the resilient equilibrium frequency of bases that genetic drift cannot change. The expected neutral random interaction among nucleotides is zero; however, huge interactions and periodicities were found between bases of dinucleotides separated by 1, 2... and more than 1,000 sites. Every base is co-adapted with the whole genome. Neutralists found that neutral evolution is independent of population size (N); thus neutral evolution should be independent of drift, because drift effect is dependent upon N. Also, chromosome size and shape as well as protein size are far from random.
Key words: Fixation, loss, neutral evolution, polymorphism, randomness, selectiveness.
INTRODUCTION
The aim of the article
This article critically reviews the foundations of the Neutral Theory of Evolution (NTE) and the Nearly Neutral Theory of Evolution (N-NTE) in relation to the foundations of the Synthetic Theory of Evolution (STE). It shows that randomness is mostly incompatible with life. It also puts the Wrightian evolutionary model, originally described in continuous mathematics, into discrete mathematics. The difficulties of the NTE and N-NTE are tested against the selective interactions and periodicities found in dinucleotides whose bases are separated by 1 to K sites. Some new evolutionary facts and ideas are examined to assess whether changes to the STE are needed.
The core of NTE and N-NTE
A summary of the foundations of the NTE is given here as the most important neutralists proposed them. From the beginning an emotional bias towards drift was present. Kimura (1991a, b) described genetic drift as he understood it from Wright (1931), together with the foundations of the NTE: "The late Professor Sewall Wright was my idol when I was young... I read Wright's 1931 classic... and his subsequent papers on random genetic drift... These papers impressed me deeply and, in fact, inspired me to become a theoretical population geneticist... According to the neutral theory, the great majority of evolutionary mutant substitutions at the molecular level are caused by random fixation, through sampling drift of selectively neutral (i.e., selectively equivalent) mutants under continued mutation pressure..." These ideas are reinforced in his last article (Kimura, 1993; he was born on 13th Nov 1924 and died on 13th Nov 1994; Crow, 1995): "At the time when I proposed the theory, the field of evolutionary genetics was dominated by the neo-Darwinian... synthetic theory of evolution... which represents a pan selectionistic view that evolutionary mutant substitutions... are almost exclusively caused by positive natural selection. In sharp contrast, the neutral theory claims... that the great majority of evolutionary changes, particularly at the molecular level, are caused not by Darwinian natural selection acting on advantageous mutants, but by random fixation of selectively neutral (i.e., selectively equivalent) mutants." In both articles he proposed the division of mutants into neutral and lethal: "...let us consider the cumulative process in which neutral mutants are substituted sequentially at a given locus or site through random genetic drift under continued input of new mutants. Then we have for the rate of evolution per generation the formula... and v_{0} is the rate of production of neutral mutants per locus (or site) per generation. This formula is based on the well known property... that, for neutral mutations, the long term rate of mutant substitution is equal to the mutation rate... if ƒ_{0} is the fraction of neutral mutations at the time of occurrence. Advantageous mutations may occur, but the neutral theory assumes that they are so rare that they may be neglected in our quantitative consideration. Thus (1-ƒ_{0}) represents the fraction of definitely deleterious mutants that are eliminated from the population without contributing to either evolution or polymorphism, even though the selective disadvantages involved may be very small in the ordinary sense... The above formulation... in that the evolutionary rate (on the long term basis) is independent of population size and environmental conditions."
Neutralists proposed a general theory for organic evolution ("the great majority of evolutionary changes"), but they found (or thought they found) support for it only in molecular traits ("particularly at the molecular level"). The dichotomy between the phenotypic and the molecular level was a problem for neutralists: "Darwinian selection acts mainly on phenotypes shaped by the activity of many genes. Environmental conditions surely play a decisive role in determining what phenotypes are selected for; Darwinian, or positive, selection cares little how those phenotypes are determined by genotypes. The laws governing molecular evolution are clearly different from those governing phenotypic evolution" (Kimura, 1979). This view ignored contemporaneous advances in developmental biology that could not be overlooked. The subject remained open. Ohta (2003) devoted a section of her article to it: "4. Reconciling morphological evolution with the nearly neutral theory... It has been said that under the neutral theory, molecular evolution and morphological evolution are dichotomous, i.e. the former occurs by random drift with almost uniform rate and the latter, by natural selection depending on environmental changes. However, genes should be responsible for morphological changes. How can we reconcile such a dichotomy?" This article intends to provide the bases for solving this problem.
Some necessary previous definitions and precisions
We assume that forward and backward recurrent mutations occur continuously. N is the generic population size. Substitution or replacement is a continuous process by which a mutant (allele or base) and its copies reach the frequency 1 (monomorphism, a replacement of previous alleles or bases in the locus or site). Elimination is a continuous process by which alleles or bases reach the frequency 0. Fixation is the process by which an allele or base remains with frequency 1 (at the locus or site) during an undefined (or infinite) number of generations. Loss is the process by which an allele or base remains at frequency 0 during an undefined (infinite) number of generations. The definition of the number of generations is given in the context of the study. With these definitions permanent fixations or eliminations are impossible under recurrent mutation; substitutions are antithetical to fixations. Genetic drift is an idea or conceptualization applied to the random fluctuations (physical processes) of genetic frequencies up or down with the same probability at any generation. The probability of a fluctuation increases as N increases (there is a widespread erroneous belief that assumes the converse statement to be true), but the magnitude of a random fluctuation increases as N decreases. If p is an allele frequency, its random sampling standard error is √{[p(1-p)]/N}; the amplitude of fluctuations is inversely proportional to the square root of N.
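The inverse dependence of fluctuation size on the square root of N can be checked numerically. The following is a minimal Python sketch that evaluates the standard error √{[p(1-p)]/N} for a few population sizes; the function name and the chosen values of p and N are illustrative assumptions, not values from the article.

```python
import math


def sampling_standard_error(p: float, n: int) -> float:
    """Standard error of an allele frequency p in a sample of size n,
    following the expression sqrt(p * (1 - p) / n) used in the text."""
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    p = 0.5  # illustrative allele frequency (assumption)
    for n in (100, 10_000, 1_000_000):  # illustrative population sizes (assumption)
        # Each 100-fold increase in n shrinks the standard error 10-fold.
        print(n, sampling_standard_error(p, n))
```

Multiplying N by 100 divides the amplitude of the fluctuations by 10, whatever the value of p.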
We denote the number of generations by G. Absolute fitness is the number of descendants an individual yields. A neutral fitness is then 1. A positive or advantageous fitness is greater than 1; a negative or disadvantageous fitness is less than 1. Absolute selection coefficients are defined similarly by the difference from 1. A neutral selection coefficient is 0, a positive coefficient of selection is >0 and a negative selection coefficient is <0. Positive selection coefficients may be any number above 0; negative selection coefficients cannot go below -1. Relative fitness is expressed dividing by the highest fitness; relative selection coefficients are obtained by the differences in relative fitness.
The most conspicuous traits of NTE
1) Fixations, eliminations or loss of alleles in loci or bases in nucleotide sites, and the maintenance of polymorphisms, are caused mainly by mutation-(genetic drift) processes and not by the mutation-selection mechanism proposed by the STE.
2) A small fraction (neutralists never proposed a quantity for "small") of fixated loci or sites is due to the permanence of a neutral allele or base and the loss of the other alleles or bases by purifying selection (lethal and sub-lethal). As this small fraction was insufficient, neutralists added alleles or bases with negative selection coefficients to increase the fraction (1-f_{0}). The initial proposal that most fixated alleles or bases were produced by the cumulative process of fixation (confused with substitution) of neutral alleles or bases by genetic drift was changed, as expressed by Ayala (2000): "a large proportion of all possible mutants are deleterious, but these are eliminated or kept at very low frequencies by natural selection with little or no consequence on the rates of molecular evolution".
3) Advantageous (positively selected) mutations may occur at a very low frequency (how low?), and this produces a small proportion of positively selected fixated alleles or bases.
4) As a corollary of 1), neutral fixations (they are substitutions) occur randomly in time; thus a molecular clock with a stochastically constant rate of evolution is expected for the acquisition of fixations in phylogeny.
5) Deleterious or highly negatively selected mutants do not contribute to evolution or polymorphisms.
6) Neutral evolution is independent of population size.
REFUTATIONS OF THE FOUNDATIONAL ELEMENTS OF NTE AND N-NTE
Errors in the foundational articles of the NTE
Two articles are recognized as the foundations of the NTE (Kimura, 1968; King and Jukes, 1969). King and Jukes (1969) stated "Fixation of Selectively Neutral Isoalleles. Drift is slow but effective in the fixation of neutral mutations." If we replace drift by the real underlying process ("random fluctuations of genetic frequencies are slow but effective in the fixation of neutral mutations"), the sentence becomes unintelligible. Random fluctuations occur up or down around a frequency; they cannot be fast or slow, only large or small (in inverse proportion to √N). Fluctuations do not lead only to fixation (really substitution), but to 1) "fixation" or "loss" if there is no recurrent mutation, or 2) a permanent expected resilient polymorphism if there is recurrent mutation. Choosing arbitrarily only fixation without recurrent mutation introduced the bias that neutral fixation was possible by "drift" and created a circular epistemic procedure from which subsequent analyses have not escaped to this day. They continued "... As pointed out by Kimura (4) (Kimura, 1968)... the rate of random fixations of neutral mutations in evolution (per species per generation) is equal to the rate of occurrence of neutral mutations (per gamete per generations)." Kimura (1968) did not deal with the rate of random fixations, but with the rate of nucleotide substitutions or replacements (which is the correct term); this is a misquotation by King and Jukes (1969). Unfortunately, Kimura (1968) introduced the term "probability of fixation" instead of "probability of substitution" and created a confusion between substitution and fixation that remains until now. King and Jukes (1969) continue "of the 2N copies of a gene in a population of N individuals... only one is destined to become the ancestor (through replications) of all copies of the gene that will be in existence in the species in the distant evolutionary future. The process by which one line becomes fixed has been called "genetic drift"..." 
The reader sees that this is not true; under drift any "destiny" is possible: fixation, loss or polymorphism. Defining drift (random fluctuations) as the process that inexorably leads to fixation (which is impossible) was a regrettable error; destiny is not a scientific word. The quotation continues: "... if a newly occurring mutation is selectively neutral, its probability of becoming fixed through random drift is 1/2N... if m_{i} is the rate of occurrence of selectively neutral mutations per functional gamete, the expected number of newly occurring neutral isoalleles in the species is 2Nm_{i} per generation... the rate of occurrence of neutral isoalleles destined to become so fixed is 1/2N × 2Nm_{i} or m_{i} per generation... the rate of non-Darwinian evolutionary change is a function only of the rate of occurrence of neutral mutations and is independent of population size". We confirm and extend the errors noted above and add others: 1) The confusion between substitution and fixation invalidates the analysis; 1/2N is the probability of substitution, whereas the expected probability of fixation is always 0. As defined here, substitution is a process by which the descendants of a mutant replace all the other bases in a site and reach the frequency 1 (100%). Since mutants are continuously being produced in the locus or site by forward and backward mutations, there is a continuous process of replacement of alleles or bases, respectively. The mutation rate is a turnover rate (fixation is physically impossible), so the substitution rate is also a turnover rate. Kimura (1968) and King and Jukes (1969) did not include in their analyses the source of the original mutant, so it appears in the population as if coming from nowhere. When a mutation occurs in a base at a site where that base was fixed (100%), the mutation destroys its fixation; thus m is always the rate of losing fixation. 
Fixation is the process of maintenance of a base at frequency 1 during several generations; it is a static process (in the NTE but not in the STE), in contrast to substitution, which is a dynamic turnover process. The probability for a neutral mutant (in the context of neutral alleles or bases) and its copies to reach frequency 1 (a substitution or replacement) is 1/2N (N is the number of diploid individuals); but this is not the probability of fixation, whose calculation requires the number of generations (G) during which the base remains fixated [P(fixation)_{G} = (1-m)^{G}, which tends to 0 as G increases]. 2) The lack of dimensional analysis. Kimura (1968) mentioned the dimension of m, but King and Jukes (1969) referred to m only as the rate of occurrence of neutral mutants per functional allele. Mutation is quite different from substitution. For a mutant and its copies to reach frequency 1 several population events must occur. The neutral mutation rate m is equal to the neutral substitution rate only in magnitude but not in dimensions. The dimension of m is mutant/site/generation. The dimension of the probability for a mutant reaching frequency 1 is substitution/mutant; thus the multiplication is [1/2N] sub/mut times m mutant/site/G, whose product is m sub/site/G (here m is the value without dimension). Fixations are different because they have other dimensions. Fixations cannot be expressed per generation because the number of generations during which they remain fixed is unknown; their dimension is fix/site/(set of data).
3) The restriction to the destiny of a mutant allele or base and not to the dynamic behavior of the site. The physical destiny of a mutant and its copies is, inexorably, extinction by recurrent mutation, as we showed (Valenzuela and Santos, 1996); thus the probability of fixation (forever) of a mutant is always zero. The number of generations for the decrease of a fixed base to a frequency of 2% is 4/m; thus the idea that a mutant increases its frequency only by replication and random drift is erroneous, because in each generation its frequency is decreased by m due to mutation. Neutralists never included the recurrent backward and forward mutation rates together and synchronously with drift in their models. The destiny of a mutant base in a site is threefold: (a) substitution or frequency 1; (b) elimination, loss or frequency 0; and (c) polymorphism with frequency 0<freq<1. Kimura's (1968) and King and Jukes' (1969) analyses are equally applicable to elimination and polymorphism. The probability for a mutant base (among the other three bases) in the case of neutrality to be lost or reach a polymorphic state in G_{i} is also 1/2N. The probability for a mutant to reach any frequency (including 1 or 0) at generation G is 1/2N. These authors forgot that they were working with the probability of an individual neutral base or allele reaching any frequency at the G^{th} generation; this probability is 1/2N because the other (2N-1) neutral bases or alleles in the population should have the same probability. 4) The logical inconsistency between drift and population size independence. Neutralists demonstrated that the rate of neutral substitution (evolution according to them) was equal (in magnitude) to the mutation rate (m), and independent of population size. The "effect" of "genetic drift" (a physical agent in the NTE and a conceptual element in the STE) is a function of N (as we showed). By logical transitivity neutral evolution is independent of genetic drift. 
It is difficult to understand why neutralists have not drawn this conclusion when the logic (transitivity laws) is completely consistent. If X (drift) is a function of Y (N), and Z (evolution) is not a function of Y, then Z cannot be a function of X. We have shown that evolution, whether neutral or selective, is independent of population size (Valenzuela and Santos, 1996; Valenzuela, 2000, 2002, 2007), because given the matrix of base mutation (Nei, 1987; Valenzuela et al., 2010) and the selection coefficients, the population moves towards a resilient equilibrium that random fluctuations (drift) cannot change.
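The arithmetic behind points 1), 3) and 4) can be reproduced in a few lines. The sketch below is a hedged illustration in Python, not the authors' own calculation: it multiplies the 2Nm new neutral mutants per generation by the probability 1/2N to show that N cancels, evaluates the expression (1-m)^{G} used in the text (which counts only mutation away from the fixed base), and solves (1-m)^{G} = 0.02 for G. Function names and the sample values of N are assumptions introduced for illustration.

```python
import math


def neutral_substitution_rate(n_diploid: int, m: float) -> float:
    """2N*m new neutral mutants per generation, each reaching frequency 1
    with probability 1/(2N); the population size cancels out."""
    new_mutants_per_generation = 2 * n_diploid * m
    prob_reach_frequency_one = 1.0 / (2 * n_diploid)
    return new_mutants_per_generation * prob_reach_frequency_one


def expected_remaining_fixed(m: float, generations: int) -> float:
    """(1 - m)^G, the expression in the text; it ignores back mutation
    from the other bases into the originally fixed base."""
    return (1.0 - m) ** generations


def generations_until(m: float, target_frequency: float) -> float:
    """Solve (1 - m)^G = target_frequency for G."""
    return math.log(target_frequency) / math.log1p(-m)


if __name__ == "__main__":
    m = 1e-6  # mutation rate per site per generation, as in the Definitions
    for n in (10**3, 10**6, 10**9):  # illustrative population sizes (assumption)
        print(n, neutral_substitution_rate(n, m))  # always about m
    # G * m for a decay to 2%: ln(50) is about 3.9, i.e. roughly 4/m generations
    print(generations_until(m, 0.02) * m)
```

The last line recovers the figure of about 4/m generations quoted for the decay of a fixed base to 2%, since ln(50) is close to 3.9.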
We have stated that neutralists never advanced numbers for lethal, sub-lethal, mildly negatively selected and positively selected genes, or for the magnitude of selection coefficients. In science values should be provided in advance and not after observing data and assuming some errors, such as the neutral fixation of genes by drift; otherwise an epistemic circularity is produced from which it is not possible to escape. King and Jukes (1969) and other authors have proposed some values by accepting neutral fixation and data from the literature: "it appears that very slightly" (how slightly?) "deleterious mutations are some ten times as frequent as recessive lethals;... thus it would appear... of the order of 80 or 90 percent of spontaneous mutations are mildly deleterious, 5 to 10% are lethal, and 5 to 10 percent are selectively neutral." Let us assume that mildly deleterious mutations imply a selection coefficient of 0.1 (10%) and examine the human genome with 3x10^{9} bases, 99.5% monomorphic sites and a mutation rate of 10^{-8} mutants/site/cell generation (Kimura, 1968). In any cell generation 30 mutants are produced; 1.5 to 3 are lethal, 1.5 to 3 are neutral and 24 to 27 are mildly deleterious (the total is a cumulative deleterious effect of 0.1x24 = 2.4); thus human life (even cell life) is impossible due to point mutations alone; the effect of chromosome mutations is even worse.
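The mutation budget in this paragraph is simple arithmetic and can be laid out explicitly. The Python sketch below uses only the figures quoted above (3x10^{9} sites, 10^{-8} mutants/site/cell generation, the percentage ranges of King and Jukes, and the assumed selection coefficient of 0.1); the function and dictionary names are illustrative assumptions.

```python
def mutants_per_generation(genome_sites: float, rate_per_site: float) -> float:
    """Expected number of new point mutants per genome per cell generation."""
    return genome_sites * rate_per_site


def split_by_class(total: float, class_fractions: dict) -> dict:
    """Convert (low, high) fractions of each mutant class into expected counts."""
    return {name: (low * total, high * total)
            for name, (low, high) in class_fractions.items()}


if __name__ == "__main__":
    total = mutants_per_generation(3e9, 1e-8)  # about 30 mutants
    fractions = {  # ranges quoted from King and Jukes (1969) in the text
        "lethal": (0.05, 0.10),
        "neutral": (0.05, 0.10),
        "mildly deleterious": (0.80, 0.90),
    }
    counts = split_by_class(total, fractions)
    print(total, counts)
    s = 0.1  # selection coefficient assumed in the text for mildly deleterious mutants
    # Cumulative deleterious effect at the low end of the range (0.1 x 24)
    print(s * counts["mildly deleterious"][0])
```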
King and Jukes (1969) accepted that amino acids (aa) with a critical function in a protein could be subject to purifying selection. By examining 5,492 aa from 53 vertebrate polypeptides they found that only arginine appeared at a significantly lower observed than expected frequency. They did not perform an exact statistical test for all the aa using the standard error, either in the regression line or in the comparison of observed and expected frequencies. Table 1 presents Table 6 of King and Jukes (1969) corrected by the standard error of estimate in a two-tailed z test of proportions (the 5% significance level is obtained with z = 1.96). The proportions of 10 of the 20 aa deviated significantly from the expected random distribution, not one (arginine) as the authors presented. The data of King and Jukes (1969) show selective evolution of aa, not neutral evolution as has always been believed.
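For readers unfamiliar with the test mentioned above, the following Python sketch shows one common form of a two-tailed z test for a single proportion. It is an illustration under stated assumptions: the article does not specify whether the standard error was computed from the expected or the observed proportion (the expected proportion is used here), and the proportions in the example are hypothetical, not values from Table 1. Only the sample size of 5,492 aa and the critical value 1.96 come from the text.

```python
import math


def z_for_proportion(observed: float, expected: float, n: int) -> float:
    """z statistic comparing an observed proportion with its expectation,
    using the standard error of the expected proportion (assumption)."""
    standard_error = math.sqrt(expected * (1.0 - expected) / n)
    return (observed - expected) / standard_error


def is_significant(z: float, critical: float = 1.96) -> bool:
    """Two-tailed decision at the 5% level."""
    return abs(z) > critical


if __name__ == "__main__":
    n = 5492  # amino acids examined by King and Jukes (1969)
    # Hypothetical proportions chosen only to illustrate the calculation.
    z = z_for_proportion(observed=0.06, expected=0.05, n=n)
    print(z, is_significant(z))
```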
Refutation of neutral (ƒ_{0}) and non-neutral (1-ƒ_{0}) fractions of mutants and direct testing of randomness
Definitions
In the following, we work only with sites as loci and bases as alleles in a site (A, T, G and C, with respective frequencies in the site fA, fT, fG and fC) to avoid an infinite number of alleles. The biotic model is a haploid prokaryote organism with asexual reproduction; thus diploidy and homo- or heterozygosity are absent; its genome has 5,000,000 bp. The generic forward or backward mutation rates are denoted by m (u = forward; v = backward; we begin with m = u = v = 10^{-6} mutants/site/generation); different values of u and v will be indicated in the context.
The neutral fraction f_{0} and lethal fraction (1-f_{0}) are inconsistent with NTE and N-NTE
Four bases A, T, G and C may be present in a site. Let us denote the absolute fitness by W_{A}, W_{T}, W_{G} and W_{C} (see definitions in Introduction) and the absolute selection coefficients by S_{A}, S_{T}, S_{G} and S_{C}. We see that the condition for the four bases to be neutral (S = 0) is a particular case of the general condition of being selectively equivalent. The NTE and N-NTE treat the two as synonymous, and this is an error from the perspective of evolution. If the four selection coefficients are negative (S<0, that is, W<1), life is impossible: at every generation the population is reduced, regardless of the genome composition. If the four Ss are neutral (S = 0), life is also impossible because any accidental population reduction cannot be compensated and the population cannot recover its N. Thus life is possible if and only if at least one base in the site is advantageous, and sufficiently so to counterbalance accidental reductions and the neutral or negative fitness of the other three bases. But this is the main proposition of the STE. At the genome level, bases (or, better, sites) behave with an integrated and coordinated fitness. The most important value is the average fitness of the genome. If the four bases have a positive fitness or are selectively advantageous, the population grows and the resilient polymorphism of the four bases cannot disappear from the population, because the four bases are transforming (by mutation) into one another. If the four bases have positive fitness and are selectively equivalent, an equal (with small differences due to the mutational matrix) and resilient polymorphism of the four bases occurs, and not the monomorphism predicted by the NTE and the N-NTE.
With this in mind we examine some features of known genomes. Within a species 98% of sites have one base (X) and lack the other three bases (denoted as Y); this is not exactly true, and it is viewed differently by the STE, NTE and N-NTE, but let us accept it for the moment. The NTE assumes that a fraction f_{0} of X corresponds to cumulative fixations by drift and a fraction (1-f_{0}) of X to bases fixated by purifying selection. Unfortunately, neutralists did not propose a value for f_{0}. What are the expected base frequencies at any site if the bases are selectively equivalent? The direct answer is a resilient polymorphism with ¼ A, ¼ T, ¼ G and ¼ C. This was demonstrated by Jukes and Cantor (1969) (cited by Li, 1997) with continuous mathematics and by Valenzuela and Santos (1996) with discrete mathematics; this last demonstration also shows that this equilibrium is resilient; fixation or loss of a base is impossible and any mutant and its copies are ephemeral. The demonstration assumes that mutation rates are equal (one parameter); a two-parameter model (transversions different from transitions) leads to the same conclusion. More realistic models with 6 parameters due to complementariness of bases lead to different equilibrium frequencies with fA = fT and fG = fC (Sueoka, 1995; Valenzuela, 1997). The equilibrium frequencies may be obtained from the mutation matrix of a base into the others (Nei, 1987; Valenzuela et al., 2010); however, these frequencies are probably not neutral frequencies, because mutation rates are subject to selection (Baer et al., 2007). Drift cannot change the equilibrium frequencies given by recurrent forward and backward mutation. Drift (as randomness) is not a physical factor but a concept. The process underlying drift is the fluctuation of genetic frequencies due to a great many unknown processes that can move frequencies up or down, but whose net expected effect is constitutionally zero (as in Brownian motion). 
For the STE random fluctuations (drift) were always a non-directional "factor" (they occur up or down with equal probability) compared to the directionality of mutation (from A to T, G or C) and selection (negative, positive or neutral); we saw that neutralists introduced a great confusion, and a subsequent bias in all studies until now, with the terminology of "mutants destined to fixation" (Kimura, 1968; King and Jukes, 1969). As we demonstrated in our previous articles (Valenzuela and Santos, 1996; Valenzuela, 2000) the only destiny of a mutant and its copies is their extinction. The demonstration is the same for the impossibility of fixation and of loss. Briefly, let us assume that A is fixed (fA = 1); in the next generation the expected fA is (1-m), in the second generation after fixation it is (1-m)^{2}, and in the n^{th} generation it is (1-m)^{n}; since 1>(1-m)>0, (1-m)^{n} tends to 0 as n increases. Random frequency fluctuations (drift) always operate, both on the way to the equilibrium and at the equilibrium, moving base frequencies up or down without changing their expected value. We see the big error made when neutralists assumed that f_{0} is the fraction of "fixed neutral bases by drift". The only expected situation for the fraction of neutral bases is a resilient polymorphism of the four with similar frequencies. Thus f_{0} is the expected fraction of the genome that should be found polymorphic (Wright, 1931 to 1969; Jacquard, 1970; Valenzuela and Santos, 1996; Valenzuela, 1997; Valenzuela et al., 2010).
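The convergence of four selectively equivalent bases to frequencies of ¼ can be illustrated with a short deterministic iteration. The Python sketch below assumes the one-parameter model: each base mutates away at total rate m per generation, with m/3 going to each of the other three bases. This parameterization, the function names, and the large illustrative value of m (chosen only so the loop finishes quickly) are assumptions, not the published discrete model. With back mutation included, the frequency of an initially fixed base is (1-m) after one generation, as in the text, and then approaches ¼ rather than 0, while any individual mutant lineage is still eroded at rate m.

```python
def mutate(freqs: list, m: float) -> list:
    """One generation of recurrent mutation among four equivalent bases:
    each base keeps a fraction (1 - m) and receives m/3 from each other base."""
    total = sum(freqs)
    return [f * (1.0 - m) + (total - f) * m / 3.0 for f in freqs]


def iterate(freqs: list, m: float, generations: int) -> list:
    """Apply the mutation step repeatedly and return the final frequencies."""
    for _ in range(generations):
        freqs = mutate(freqs, m)
    return freqs


if __name__ == "__main__":
    m = 0.01  # deliberately large illustrative rate (assumption) for a short run
    start = [1.0, 0.0, 0.0, 0.0]  # base A at frequency 1
    print(iterate(start, m, 1))     # fA drops to (1 - m) after one generation
    print(iterate(start, m, 2000))  # all four frequencies approach 0.25
```

Starting the iteration from any other set of frequencies leads to the same end point, which is the sense in which the equilibrium is resilient.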
Unaware of this error, neutralists wanted to explain the great proportion of monomorphic sites (98% or more within a species) by assuming lethal bases in a purifying selection process occurring with frequency (1-f_{0}). They forgot the dialectical reality that in a site it is not possible to define a lethal base without defining (to maintain life) one or more advantageous bases. Since we find only one of the four bases with frequency 1 (fixed according to neutralists) in most sites, this means that the other three are lethal or in some manner "impure", and they are eliminated from the site (purifying process). If this is so, the remaining base should be highly advantageous to keep the individual and the taxon or deme alive. Thus the fraction (1-f_{0}) is also the fraction of advantageous bases. The addition of one neutral base among the four bases does not change anything in the picture. Thus, the proposition of "purifying selection" was so effective as to disqualify (purify evolution from) the NTE and N-NTE. Neutralists and nearly-neutralists realized this inconsistency and thought of bases negatively selected but with small selection coefficients. In fact the N-NTE added selection coefficients similar to the mutation rate (m) or to the inverse of N (Ohta, 1992, 2002; Nei, 2005); this last proposition proved inconsistent with changes of N [models fluctuate between STE, NTE and N-NTE (Nei, 2005)]. Again, if low selection coefficients near 0.01 (this is equivalent to 1 lethal mutant in 100 neutral mutants) are proposed for three bases in the site and the remaining base is neutral, it is sufficient to collect 400 mutant sites to make life or reproduction impossible. As we mentioned, the human genome has nearly 3x10^{9} sites; with a mutation rate of 10^{-8} mutants/site/cell-cycle we have 30 mutants in any cell-cycle; 46 cell-cycles to reach reproductive maturation (Kimura, 1968) yield 1,380 mutants per gamete: life is impossible. 
Moreover, low coefficients of selection do not produce a monomorphism in the site; as their value approaches 0 the expected situation in a site is no longer a monomorphism but a polymorphism with the four bases (see below in the discrete Wright model). Thus the addition of very low coefficients of selection also destroys the NTE and N-NTE. In fact, when three bases have negative selection coefficients similar to the mutation rate (or are neutral) and the remaining base is neutral (or has a positive selection coefficient similar to m), the relatively positively selected base attains the resilient equilibrium when its frequency is near 0.43 and the other three bases each have a frequency near 0.19 (see Appendix 1, the corresponding table and the next sections).
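The figures of about 0.43 and 0.19 can be reproduced with a simple haploid mutation-selection recursion. The Python sketch below is one model that yields those numbers, offered as an assumption rather than as the calculation of Appendix 1: selection acts first (one base with relative fitness 1, three bases with fitness 1-s), then each base mutates away at total rate m, with m/3 going to each other base. The values m = s = 10^{-3} are chosen only so the loop converges quickly; to first order the equilibrium depends on the ratio s/m.

```python
def next_generation(freqs: list, fitness: list, m: float) -> list:
    """Haploid selection followed by recurrent mutation among four bases."""
    mean_fitness = sum(f * w for f, w in zip(freqs, fitness))
    after_selection = [f * w / mean_fitness for f, w in zip(freqs, fitness)]
    # Each base keeps (1 - m) of itself and receives m/3 from each other base.
    return [x * (1.0 - m) + (1.0 - x) * m / 3.0 for x in after_selection]


def equilibrium(fitness: list, m: float, generations: int) -> list:
    """Iterate from equal frequencies for a fixed number of generations."""
    freqs = [0.25, 0.25, 0.25, 0.25]
    for _ in range(generations):
        freqs = next_generation(freqs, fitness, m)
    return freqs


if __name__ == "__main__":
    m = s = 1e-3  # illustrative values with s equal to m (assumption)
    fitness = [1.0, 1.0 - s, 1.0 - s, 1.0 - s]
    # Favored base near 0.43, each of the other three near 0.19
    print(equilibrium(fitness, m, 50_000))
```

Under these assumptions a selection coefficient of the order of the mutation rate leaves all four bases at appreciable frequencies instead of producing a monomorphic site.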
The STE accounts for all these facts in a simple model. The four base frequencies are maintained in resilient equilibria in the site by the dynamic process of recurrent forward and backward mutation and selection. The resilient equilibrium given by selection and mutation (see Table of the Appendix 1) cannot be altered by drift, which is responsible only for differences due to sampling; if sampling produces a situation far from this equilibrium, the resilience of the system will return it to the equilibrium frequencies. For negative selection with S>>m, the frequency of the positively selected base at equilibrium is, in an approximate linear solution, (1-m/S), a result demonstrated from the early development of the STE, and the frequency of the other three bases is m/S. This accounts for the 98% monomorphic sites and for the 2% of polymorphic sites. As S approaches m the negatively selected bases increase their frequencies. If m = 10^{-6} and S is 10^{-5}, the least frequent bases are found with frequency near 0.1. Lower values of S need the quadratic solution (Appendix 1). Thus if f_{0} is large, the expected fraction of polymorphic sites is large; if (1-f_{0}) is large, life is impossible. The STE solves all these problems; it proposes that the genomes in a population or taxon are maintained in a coordinated resilient equilibrium of the four bases at all the sites in the particular environment where this population lives. The most frequent base is found with probability (1-m) and the least frequent with probability (m). For the STE the availability of the four bases at any site, maintained in a resilient mutation-selection dynamics, confers the maximum possibility of adaptation to genome or environment changes. No base is fixed or lost forever; all the bases are needed for adaptation; the loss of only one of them is the loss of 1/4 of the availability for adaptation. There is always at least one base positively selected at each site. 
This offers the maximum possible adaptive spectrum (4^{K} possibilities, K is the number of the genome nucleotide sites) of the population or taxon for genome or environmental changes. This co-adaptive collection of resilient equilibria of the four base frequencies in genome sites is the present molecular version of the adaptive peaks that Prof. Wright proposed (Wright, 1932, 1969, 1988) in the adaptive landscape of demes or taxa living in an environment. This picture is antithetical to that of mutants fixed by random fluctuations (drift) or maintained in the site by purifying selection that gets rid of "impure" bases and leaves the neutral "pure" base as the only base in the site. The 98% of monomorphic sites found in any species refutes conclusively the NTE and the N-NTE.
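The linear approximation quoted above for strong selection relative to mutation can be written out as a short worked equation. This is only a restatement of the figures given in the text; the symbols \hat{p} (equilibrium frequency of the favoured base) and \hat{q} (equilibrium frequency of the bases selected against) are notation assumed here for illustration and are not the author's.

```latex
\hat{p} \approx 1 - \frac{m}{s}, \qquad \hat{q} \approx \frac{m}{s}
\qquad (s \gg m);
\qquad
m = 10^{-6},\; s = 10^{-5}
\;\Longrightarrow\;
\hat{q} \approx \frac{10^{-6}}{10^{-5}} = 0.1, \quad \hat{p} \approx 0.9 .
```

As s decreases towards m the ratio m/s approaches 1 and the linear form stops being usable, which is where the text refers to the quadratic solution of Appendix 1.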
Refuting the random expectancy of the frequency of bases in a site.
As we showed, if the four bases are selectively equivalent their expected frequency at any site is ¼ A, ¼ T, ¼ G and ¼ C. If the mutation rates among bases are different, the expectancy of base frequencies is not ¼ but that given by the mutation matrix among bases (Nei, 1987; Sueoka, 1995; Valenzuela, 1997; Li, 1997). Neither the ¼ nor the mutation-matrix expected polymorphisms have ever been found; monomorphism of sites is the rule. Moreover, the observed polymorphisms of bases at polymorphic sites are neither ¼ nor those of the mutation matrix, and regardless of the actual number of bases in the sites, their frequencies are not randomly distributed (Valenzuela et al., 2010). A sample of 1000 individuals of a bacterium species ubiquitously distributed in the Pacific Ocean, taking one bacterium every 50 kilometers to avoid reproductive relatedness, should present, according to the NTE and N-NTE, a definite polymorphism of the four bases with frequencies over 0.1 for each of them. This has never been found.
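The neutral expectancy described above (¼ for each base under equal rates, and the mutation-matrix frequencies otherwise) can be illustrated by iterating a 4x4 mutation matrix until the base frequencies stop changing. This is a minimal sketch: the function name, the rate values and the number of generations are hypothetical choices made for illustration (the rates are far higher than realistic ones so that the iteration converges quickly), and the code is not taken from the cited works.

```python
def stationary_base_frequencies(rates, generations=20_000):
    """Iterate base frequencies in one site under mutation alone.

    rates[i][j] is an assumed per-generation probability that base i
    mutates to base j (i != j); diagonal entries are ignored.
    """
    bases = range(4)
    freq = [1.0, 0.0, 0.0, 0.0]              # start from a monomorphic site
    for _ in range(generations):
        new = [0.0] * 4
        for i in bases:
            out = sum(rates[i][j] for j in bases if j != i)
            new[i] += freq[i] * (1.0 - out)  # copies of base i that do not mutate
            for j in bases:
                if j != i:
                    new[j] += freq[i] * rates[i][j]
        freq = new
    return freq


# Hypothetical equal rates: every base mutates to each other base at 1e-3.
equal = [[0.0 if i == j else 1e-3 for j in range(4)] for i in range(4)]

# Hypothetical unequal rates: mutation towards base 0 is twice as frequent.
biased = [[0.0 if i == j else (2e-3 if j == 0 else 1e-3) for j in range(4)]
          for i in range(4)]

print(stationary_base_frequencies(equal))
print(stationary_base_frequencies(biased))
```

With equal rates the four frequencies converge to ¼ each; with unequal rates they converge to the frequencies fixed by the matrix, independently of the starting composition of the site.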
Neutralists changed the sense of randomness of genetic drift
As we mentioned, random fluctuations (drift) of genetic frequencies occur up or down with the same probability in one generation; thus their effect over millions of cell or individual generations on frequencies maintained by resilient mutation-selection equilibria is zero. Prof. Wright (died in 1988) disagreed with the neutralist use of drift. He "complained bitterly that his views on the evolutionary role of genetic drift had been consistently misinterpreted" (Gould, 2002). Prof. Kimura (1991) was conscious of this complaint: "Wright, in his later years, used to claim that he had never attributed any significance to random drift except as an agent to bring about shift of adaptive peaks. however, Wright in his papers of the early 1930s used to attach much more weight to random drift..." The article of Wright (1931) shows that Kimura (1991) was wrong. Wright (1931) stated firmly that under forward and backward mutation fixation or elimination (as its complement) are impossible: "If mutation is occurring, however low the rate, the decline in heterozygosis, following isolation of a relative small group from a large population, cannot go on indefinitely. There will come a time when the chance elimination of genes will be exactly balanced by new genes arising by mutation" and "It only requires a very moderate mutation rate in a large population for the number of unfixed loci to become enormous". The evolutionary effect of "drift", like that of Brownian motion, is zero. In the STE drift is not a directional evolutionary factor; it cannot qualitatively drive evolution, as Prof. Kimura and neutralists have claimed. Prof. Wright was indulgent; neutralists did not misinterpret his studies, they misunderstood them; they gave a directional sense to drift; they claimed that drift fixed neutral alleles or bases and maintained their fixation. The extreme classes (fixation and loss) for Wright (1931) are transient classes in a continuous turnover.
For neutralists and nearly-neutralists, "drift", a concept, was converted into an agent or factor driving evolution. Randomness is a model (idea) and not a content (fact) of behavior.
Epistemic and ethical problems arise.
The main factor of evolution in the NTE, randomness, was replaced by mild selection; scientific honesty required renaming the NTE as the Theory of Evolution by means of Mild Selection with a small contribution of randomness and lethality. Neutralists thought the problem could be solved by accepting very small negative coefficients of selection, near the magnitude of mutation rates; we showed that this leads to an expected polymorphism instead. The other solution was to assume that negative selection does not contribute to evolution or polymorphism, but this is an undemonstrated assumption that functions as another circular negative heuristic protective belt (negative selection does not alter the NTE because it is not evolution, and evolution is a process that cannot be altered by negative selection). We showed that in the STE all the bases are available for adaptation, regardless of their frequencies; the fitness of a base or allele is variable according to the residual genome and the environment; for the NTE and N-NTE fitness is a fixed value. The conservation of randomness or genetic drift plus very mild selection (similar to mutation rates) gave rise to the Nearly-Neutral Theory of Evolution (N-NTE) (Ohta, 1992, 2002; Nei, 2005).
We could finish here, because random fluctuations of genetic frequencies (drift) are not and cannot be the main "factor" of evolution (if they have any evolutionary role). We also saw that purifying selection or very mild negative selection destroys rather than saves the NTE and the N-NTE; however, it is necessary to refute randomness at other levels of life organization and find the historical origin of the errors.
REFUTING RANDOMNESS AT OTHER BIOTIC LEVELS
The non-random chromosome structure
Chromosomes evolve, conserving the karyotype size and content, by chromosome rearrangements (ChR), mostly by binary ChRs (translocations and inversions). They also evolve by changing the karyotype size and content in transversal genome fusions, symbiogenesis, sexual processes, karyotype endo-multiplication, and other mechanisms (this subject is outside of the scope of the article, OSA). Here we shall see the first mechanism. In studies of the distribution and number of chromosome breakpoints (ChBp) in binary ChRs a direct and linear relationship with the chromosome length was assumed. This assumption is erroneous because in translocations between a small and a large chromosome the same number of ChBps should be found in both chromosomes; the same occurs in pericentric inversions of non-metacentric chromosomes. The solution was a quadratic form of the karyotype in agreement with the condition of binary ChRs (Valenzuela, 1979, 1985a). The following development answered the question on the expected chromosome length and the position of the centromere in evolution by means of binary ChRs (Valenzuela, 1985a; Valenzuela and Lopez-Fenner, 1986; Gouet and Lopez-Fenner, 1985, 1986). If the total number of nucleotides of the genome is denoted by N, the invariant number of centromeres is K (chromosome number; 2K is the number of arms) and any internucleotide junction is affected in a ChR with the same probability, then the N nucleotides (indistinguishable balls) distribute in the K chromosomes (distinguishable boxes) according to a Bose-Einstein distribution (Feller, 1968; Valenzuela, 2009), with mean (expected chromosome length) N/K, and variance [N(K-1)(N+K)]/[K^{2}(K+1)] (Gouet and Lopez-Fenner, 1985; Valenzuela and Lopez-Fenner, 1986; Valenzuela, 2009). The solution for a continuous distribution considering the karyotype length as 1 is more complex and not presented here (Gouet and Lopez-Fenner, 1986). With these two parameters we can test any karyotype.
The human karyotype (http://en.wikipedia.org/wiki/Chromosome) has 24 chromosomes (X and Y considered apart) ranging between 46,944,323 bp (chromosome 21) and 247,199,719 bp (chromosome 1). According to the random-based Bose-Einstein statistics the expected mean and standard deviation for the human karyotype are N/K = 3,079,843,747 bp/24 = 128,326,822.8 bp and √{[N(K-1)(N+K)]/[K^{2}(K+1)]} = 123,086,764.9 bp, respectively; the standard error is 25,124,980.67 bp; the 95% confidence interval (CI) to test an individual chromosome is between 0 bp and 369,576,880.0 bp (due to the large variance of the Bose-Einstein distribution). Human chromosomes fall within the 95% CI. Perhaps they present a significantly different variance instead; testing the variance is OSA, as it requires obtaining the variance of the variance [Wright, 1969 (Vol. I, p 201); Spiegel 1980; Valenzuela, 2009]. The expected position of the centromere was also estimated. If every internucleotide junction participates equiprobably in a ChR, the centromere position in a chromosome follows a uniform distribution within the short arm [chromosomes are classified with the short arm up (arm p) and the long arm down (arm q)]. So the random position of the centromere in a short arm can be calculated [expected number of nucleotides = N/(2K)] if its position occurs with the same probability at any internucleotide junction of an expected short arm. According to the uniform distribution (Spiegel, 1980), this position is at half of the arm, that is ¼ from the top of the chromosome or 0.75 of the total chromosome, with variance (1/2)^{2}/12 = 1/48 [(1/2)^{2} is the square of the size of the short arm]; the standard deviation is 0.1443 and the 95% CI for a chromosome in the uniform distribution is from 0.525 to 0.975 of the total length of the chromosome.
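The Bose-Einstein figures quoted above for the human karyotype can be reproduced directly from the stated mean and variance formulas. The following is a minimal sketch; the function name is hypothetical, and the upper limit is taken here as mean + 1.96 standard deviations, which is an assumption about how the interval in the text was obtained.

```python
import math


def bose_einstein_length_stats(total_bp, n_chromosomes):
    """Mean, standard deviation and standard error of chromosome length
    when N nucleotides fall into K chromosomes (Bose-Einstein occupancy)."""
    n, k = total_bp, n_chromosomes
    mean = n / k
    variance = n * (k - 1) * (n + k) / (k ** 2 * (k + 1))
    sd = math.sqrt(variance)
    se = sd / math.sqrt(k)
    return mean, sd, se


# Genome size and chromosome number as given in the text.
mean, sd, se = bose_einstein_length_stats(3_079_843_747, 24)
upper = mean + 1.96 * sd          # assumed construction of the upper 95% limit
print(f"mean = {mean:,.1f} bp, sd = {sd:,.1f} bp, se = {se:,.2f} bp")
print(f"upper 95% limit = {upper:,.0f} bp")
```

Because the standard deviation is almost as large as the mean, the lower limit is truncated at 0 bp, which is why every human chromosome length falls inside the interval.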
Chromosomes are classified by the proportion of the long arm to the total chromosome as: metacentric-submetacentric from 0.5 to 0.749, acrocentric (or subtelocentric) from 0.75 to 0.919 and telocentric if they have only the long arm (0.92 to 1.0, the short arm is indistinguishable with current techniques) (http://homepages.uel.ac.uk/V.K.Sieber/human.htm). Human chromosomes should be mostly metacentric-submetacentric and acrocentric, as they are, but the variance of the centromere position is probably far from randomness because of their penta-modal distribution [0 telocentric, 7 acrocentric (13, 14, 15, 18, 21, 22, Y between 0.77-0.87), 3 intermediate (4, 5, 12, between 0.72-0.74), 10 submetacentric (2, 6, 7, 8, 9, 10, 11, 16, 17, X between 0.60 and 0.70) and 4 metacentric chromosomes (1, 3, 19, 20 between 0.52 and 0.58) (test OSA)]. If we divide the short arm into 4 equal segments (0.500-0.625; 0.626-0.750; 0.751-0.875 and 0.876-1.0) by the centromere position, we expect for these segments (6, 6, 6, 6) chromosomes and we have (9, 8, 7 and 0), respectively; P = 0.0396. The karyotype of the mouse (Mus musculus) (http://www.oxfordjournals.org/our_journals/jhered/freepdf/63-69.pdf) is significantly outside this random expectancy; it has 21 almost equal (size) telocentric chromosomes (including the X and Y chromosomes); the expected number of this chromosome class is the probability of a telocentric chromosome (0.09) times the number of chromosomes (0.09 x 21) = 1.89; the total probability is P = 0.09^{21} = 1.1x10^{-22}. The mouse karyotype is impossible under neutral or random evolution by random chromosome rearrangements (the different karyotypes of subspecies of the mouse go from 2n = 40 to 2n = 22 with 10 metacentric chromosomes obtained by Robertsonian translocations, Vasco C et al., 2012; testing these karyotypes is OSA).
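The two probabilities quoted above can be checked with a few lines of code. The sketch below assumes that P = 0.0396 comes from a chi-square goodness-of-fit test with 3 degrees of freedom on the four segments (the text does not name the test), and uses the closed-form upper tail of that distribution so that no external library is needed; the function name is hypothetical.

```python
import math


def chi_square_gof_df3(observed, expected):
    """Chi-square goodness-of-fit statistic and P value for four classes
    (3 degrees of freedom)."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 3 d.f. in closed form.
    p_value = (math.erfc(math.sqrt(stat / 2))
               + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2))
    return stat, p_value


# Human centromere positions in the four equal segments given in the text.
stat, p_human = chi_square_gof_df3([9, 8, 7, 0], [6, 6, 6, 6])
print(f"chi-square = {stat:.3f}, P = {p_human:.4f}")

# Mouse: 21 telocentric chromosomes, each with the probability 0.09 used in the text.
p_mouse = 0.09 ** 21
print(f"P(mouse karyotype) = {p_mouse:.2e}")
```

The human result is driven almost entirely by the empty fourth segment, while the mouse figure is a simple product of 21 independent probabilities.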
These tests of evolutionary randomness of karyotypes, which proved highly significant for chromosome shape in mice (and less significant in humans) and thus refute neutral evolution, are not the most important tests for this aim. The definitive and conclusive refutation of the NTE is the maintenance of these karyotypes during tens of millions of generations of cells or individuals. In the case of humans or anthropoidea with a very similar karyotype, the core of these karyotypes has been maintained for more than 15 million years (Mys). In the individual or in cultured cells the karyotype is unstable; cytogeneticists know that normal cells may produce some small percentage of ChRs in one cell cycle, and cancer cells with ChRs are produced in individuals during their whole life. The karyotype stability throughout the evolution of taxa, in spite of its individual instability, can only be produced by a pan-selective process.
The non-random structure and length of proteins
At first, purifying selection was assumed for aa in the functional site of a protein. Soon it was assumed for the allosteric site, signal peptides, sorting sequences, integration to membrane, interaction with the cytosol and the external cell environment. Besides that, any amino acid has critical contributions to the secondary, tertiary and quaternary structure of proteins, and to their hydrophilic, hydrophobic, lipophilic, lipophobic and acid or basic properties. Now every aa has several critical functions or structural actions. They are all adaptive to the whole organization of the living being, with higher or lower critical values. These critical functions are directly related to the fitness and selection coefficient an individual has. Moreover they are all integrated in the whole organism, making it difficult or impossible to dissociate the isolated critical contribution from the complete individual fitness.
Protein length is also selected. What are the random expected protein length and its variance? We answered this question by considering the random distribution of codons in a genome segment (Valenzuela and Santos, 1996). Assuming that the 64 codons are equally probable in a DNA segment, the 3 termination codons (tc) should occur randomly among the 64 (P = 3/64 = 0.0469 = p); the 61 aa codons (non-terminal codons = ntc) should occur with P = 61/64 = 0.9531 = q (p+q = 1). The geometric distribution describes the probability of a series P(ntc_{1}-ntc_{2}-ntc_{3}-...-tc) = q x q x q x ... x p. Its expected mean is 1/p = 21.3 (aa), with variance q/p^{2} = 61x64/9 = 433.78 and standard deviation = 20.83 (aa). The 95% confidence interval for testing the length of an isolated protein goes from 1 to 62.96 aa. Most proteins have lengths greater than this number. The neutral probability of finding a protein with 100 aa is P = q^{100}p = 0.000385; for 500 aa it is P = q^{500}p = 1.76x10^{-12}. There are a large number of proteins with more than 500 aa. The average size (length) of 104,394 eukarya proteins was 361 aa, a result far from randomness (z = 5,269.2; P < 10^{-100}); similar significant figures are found for pro- and eukaryotes (Brocchieri and Karlin, 2005). Most of the averages of aa for all proteins are over 200 (P = 3.2x10^{-6}) either for eu- or prokaryotes. It is evident that the length of proteins is a non-random biotic character and that termination codons were and are selectively avoided in evolution to construct long proteins. It is also evident that longer proteins had to be acquired and maintained by positive selection to replace shorter ones.
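The geometric-distribution figures used above for the random expectancy of protein length can be recomputed as follows. This is a small sketch under the same assumption as the text (64 equiprobable codons, 3 of them termination codons); the function and constant names are hypothetical.

```python
import math

P_STOP = 3 / 64   # probability of a termination codon among 64 equiprobable codons


def random_length_stats(p=P_STOP):
    """Mean, variance and standard deviation used in the text for the
    geometric distribution of random protein length (in codons)."""
    mean = 1 / p
    variance = (1 - p) / p ** 2
    return mean, variance, math.sqrt(variance)


def prob_of_length(n_codons, p=P_STOP):
    """Probability of n_codons amino acid codons followed by a termination codon."""
    return (1 - p) ** n_codons * p


mean, variance, sd = random_length_stats()
print(f"mean = {mean:.1f} aa, variance = {variance:.2f}, sd = {sd:.2f} aa")
print(f"P(100 aa) = {prob_of_length(100):.6f}")
print(f"P(500 aa) = {prob_of_length(500):.2e}")
```

The probability falls by a factor of q = 61/64 with every additional codon, so lengths of several hundred amino acids are many orders of magnitude less likely than the random mean of about 21.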
THE ORIGIN OF THE MOST IMPORTANT ERRORS IN NTE AND N-NTE
We saw that the most important conceptual error in the NTE and N-NTE was having forgotten that random genetic drift is a non-directional factor in evolution; consequently they attributed constructive properties to random fluctuations of genetic frequencies through cumulative fixation of neutral bases. Mutation and genetic drift could transitorily construct some DNA segments with hermeneutical (meaning sense, Valenzuela, 2009, 2011) value, but they cannot maintain them. Drift cannot drive evolution. Brownian motion cannot construct and maintain anything, except the resilient random Brownian equilibrium that is antithetical to life. A second error, which is the basis for the first, was forgetting (or worse, excluding ideologically, as we shall see) recurrent forward and backward mutation, which necessarily occurs synchronously with random genetic drift, making fixation impossible. A third error was not considering that (physical and not mathematical) equilibrium conditions generated by mutation and selection are resilient physical systems; drift (an idea) is powerless to change the equilibrium. A fourth was to confuse substitution and fixation. A fifth error was to work with the fate of an allele (it is ephemeral, its fate is its extinction) and not with locus dynamics. These errors are the subject of the next sections, from a more formal perspective. Table 2 presents a comparison between the STE and the NTE + N-NTE.
TABLE 2. The most important conceptual and foundational differences between the STE and the NTE plus N-NTE
K = number of sites in the species genome. J = number of polymorphic sites
Mutation, selection and drift together
An epistemological caution. We need a caution on the application of mathematics to biotic processes. Working with different human genetic pools where each one was in Hardy-Weinberg equilibrium (HWE), we tested the fusion of two of these groups to search for a deviation from HWE. The theory demonstrated that such a fusion should yield a sample far from the equilibrium. Our surprise was that this hybrid sample was as much in HWE as each of the two samples (Valenzuela and Harb, 1977). Our further study discovered an irremediable epistemic and algebraic error coming from the application of math to biotic realities (Valenzuela, 1985b). A more accurate treatment confirmed the error and indicated its causes (Valenzuela, 1994). In short (a detailed study is OSA), when we apply math to nature we assume that all the axioms of math are valid for nature. This assumption is often (if not always) false. Nature, in this case gene frequencies and population sizes, is not continuous; these quantities do not move on real numbers but on rational numbers, so discrete and not continuous math must be applied. Current calculus or differential equations cannot be applied without errors of unknown effects. Nature is not linear in a wide sense; its dimension is very often fractal and treating it with integer dimensions is an error. Associativity, commutativity, distributivity and symmetry are not necessary properties of biotic processes (efforts to create "genetic" algebras without these properties have been made; OSA); biotic interactions (treated either as scientific or biological ones) are not mathematical or statistical interactions. Nature is almost always non-Gaussian, heteroscedastic and has qualitative conditions that can seldom be dealt with by math without errors. The most important restrictions happen at the interfaces (biotic processes)-biology-mathematics.
Putting scientific faith in math models as a condition of rigorousness and scientificity may be useful as a mental requirement, but it is dangerous when they betray reality. Resilient systems do exist as physical realities (ontic-oriented science) regardless of the construction of mathematical models (gnosic-oriented science) (Valenzuela, 2007). In the following we will use discrete math and computer simulations.
The pure mutational model. The impossibility of fixation and the resilient equilibrium with forward and backward mutation. Let us assume that A is monomorphic (100%) in a site and forward mutation to the other three bases occurs at the rate m (or u); backward mutation occurs from the three bases at the same rate, but only m/3 (or v) from each base (T, G, C) goes to A (Valenzuela and Santos, 1996). Let us define an evolutionary cycle of mutations when the expectancy for a new base in the site is 1; this occurs when G = 1/m (Kimura, 1979; Valenzuela and Santos, 1996; Valenzuela 2000a). The sequence of the frequency of A after each complete cycle of mutations is 1/1, 0/1, 1/3, 2/9, 7/27, 20/81, ... If we denote the numerator by Nu and the denominator by De, it is easy to find that the recurrence formula is Nu_{(G)} = [De_{(G-1)} - Nu_{(G-1)}]. This occurs because fA_{G} contributes nothing to fA_{(G+1)} and, from (1-fA_{G}) or f(T+G+C)_{G} (the complement of fA_{G}), 1/3 is converted to fA_{(G+1)}. Thus fA_{(G+1)} = [0 x fA_{G} + (1-fA_{G})/3]; fA_{(G+1)} = (1/3 - fA_{G}/3). If we initiate the series when fA_{1} = 1/3 (the third term of the series) we should have (reading from right to left): S = (3^{(G-1)} - 3^{(G-2)} + 3^{(G-3)} - 3^{(G-4)} ... ± 3^{0})/3^{G}. Denoting the numerator as S_{Nu} and multiplying by 3: 3S_{Nu} + S_{Nu} = (3^{G} - 3^{(G-1)} + 3^{(G-2)} - 3^{(G-3)} ... ± 3^{1}) + (3^{(G-1)} - 3^{(G-2)} + 3^{(G-3)} - 3^{(G-4)} ... ± 3^{0}); then 4S_{Nu} = 3^{G} ± 3^{0} = 3^{G} + (-1)^{(G-1)}; then S_{Nu} = [3^{G} + (-1)^{(G-1)}]/4. The total is: S = [3^{G} + (-1)^{(G-1)}]/(4 x 3^{G}) = ¼ + [(-1)^{(G-1)}/3^{G}]/4 or ¼ + (±1)/(4 x 3^{G}), where (+) holds for odd and (-) for even Gs. Thus, S tends to ¼ as G increases. When G = 0 (an even number), S = ¼ - ¼ = 0; when G = (-1), S = ¼ + ¾ = 1.
We conclude that the equilibrium for any base with equal forward and backward recurrent mutation rates is ¼, as was found by Jukes and Cantor (1969) [cited by Li (1997)] with continuous math (differential equations). This equilibrium is resilient; random fluctuations towards higher or lower frequencies generate counterbalancing forces to the equilibrium frequency. This is a well known result of population genetics (Wright, 1931, 1969; Jacquard, 1970; Li 1976; Nei, 1987). The novelty is a demonstration by discrete mathematics without errors due to continuity, and an emphasis on the impossibility (resilience) for random frequency fluctuations to change the tendency of the physical resilient equilibrium determined by forward and backward mutations.
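The discrete recurrence and the closed form derived above can be verified with exact rational arithmetic. The following is a minimal sketch; the function names are hypothetical, and the index convention follows the text (G = 1 is the term 1/3, G = 0 the term 0 and G = -1 the initial monomorphism).

```python
from fractions import Fraction


def frequency_after_cycles(n_cycles):
    """Exact frequency of base A after each mutation cycle, starting from fA = 1."""
    f = Fraction(1)
    series = [f]
    for _ in range(n_cycles):
        f = (1 - f) / 3          # fA(G+1) = (1 - fA(G)) / 3
        series.append(f)
    return series


def closed_form(g):
    """S = [3^G + (-1)^(G-1)] / (4 x 3^G), valid for any integer G >= -1."""
    sign = 1 if g % 2 == 1 else -1       # (+) for odd G, (-) for even G
    power = Fraction(3) ** g
    return (power + sign) / (4 * power)


series = frequency_after_cycles(10)
print([str(f) for f in series[:6]])
print(float(series[-1]))
```

The alternating terms approach ¼ from above and below, and the distance to ¼ shrinks by a factor of 3 in each cycle, which is the discrete counterpart of the resilience described in the text.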
However, a single rate m is unrealistic and it is necessary to consider the mutation matrix from one base to the others. This matrix was developed in Nei (1987) and used by Valenzuela et al. (2010). Well known models derived from this matrix are the one-parameter model of Jukes and Cantor and the two-parameter model of Kimura (see Li, 1997); less known is the six-parameter model of Sueoka (1995) and Valenzuela (1997). From a pure mathematical treatment where u and v are the forward and backward mutation rates, respectively, and x is the gene frequency, the (mathematical) equilibrium is found when x_{e} = u/(u+v) (Nei, 1987, p. 365). Nei (1987) judged that "this two-allele model is now obsolete, since at the nucleotide or amino acid level most loci produce a large number of different alleles". This is an error because: 1) in a nucleotide site only 4 bases are possible and a two-allele model with A and non-A is a fully realistic model; 2) the equilibrium between forward and backward mutations of bases in a site is a real physical process that occurs regardless of the math analysis of it; 3) in the same textbook (Nei, 1987, p. 67-79) the analysis of the equilibrium of base frequencies, by the mutational matrix of bases in a site, is presented as the updated and proper math treatment. The following analyses are a reformulation of the Wrightian vision (Wright, 1931, 1969) in discrete math.
The destiny of one mutant. We work with a steady-state population of bacteria of size N. At any generation the population duplicates to 2N individuals and N individuals die at random. We follow in a site the appearance of a mutant A among the other individuals with phenotype B (non-A). We are interested in the equality by descent. Let us consider a population of 4 bacteria with five possible populations described by the number of A individuals (frequency of A = fA), that is 0, 1, 2, 3, 4A; fA in these populations is 0.0, 0.25, 0.50, 0.75, 1.0, respectively. In G_{0} the four bacteria are B; in G_{1} a mutant A appears. The model describes the dynamic random evolution of a mutant that in generation G_{n} reaches (by its copies or descendants) any of these frequencies. When fA reaches 1 we say there is an A monomorphism (a substitution). When fA = 0, there is a loss or monomorphism of non-A. When 1>fA>0 we say there is a polymorphism with different frequencies of A. Note that only in this special case of the dynamic evolution of a single mutant, without recurrent mutation, are fixation and loss possible, and they are the inexorable and irreversible final destiny of the mutant and its copies; once fixed or lost the population cannot return to its polymorphic form. The dynamic analysis was performed by Feller (1951) and followed by Kimura (1957) using stochastic matrices to describe this kind of random walk of frequencies. The analysis consists in obtaining the eigenvectors, eigenvalues and their coefficients to find the spectral analysis and obtain the principal parameters of the processes (Jacquard, 1970; Crow and Kimura, 1970). These parameters are the equilibrium frequencies and the means and variances of the number of generations to reach fixation, loss or maintenance of the polymorphism. The equilibrium frequencies with one mutant and genetic drift are obviously either loss or fixation. A historical event is interesting. Prof. Kimura began as a cytogeneticist (Crow, 1995) and he applied this study to the diffusion process of a mutant that appears in one filament of the chromosome, whose number of filaments was unknown but was assumed to be 2, 4, 8 or 16, and diffused to the others. The model (hereafter, Model) is perfect for bacteria. We developed the matrices and spectral analysis from 2 to 8 bacteria and gave up.
If I is the number of As in the population in G_{n}, the probability of having a population with J As in G_{(n+1)} is described by a quotient between the number of combinations of 2I over J times the number of combinations of (2N-2I) over (N-J) and the number of combinations of 2N over N (Feller, 1951; Kimura, 1957). This probability is P[J/I] = [C(2I, J) x C(2N-2I, N-J)]/C(2N, N), where C(a, b) denotes the number of combinations of a elements taken b at a time.
The matrix corresponding to 4 bacteria is presented in Appendix 2. It is important to remark that when there is no A (loss of A) the population remains with B fixed forever {P[(J=0)/(I=0)] = 1}; when fA is 1 (100%) the population remains with A fixed forever {P[(J=4)/(I=4)] = 1}. These marginal states (fixation and loss), from which there is no return to the central states (polymorphism), are called absorbing states (barriers, conditions) in the math of stochastic processes (Feller, 1969). Appendix 2 also shows the multiplication of a column vector (0, 1, 0, 0, 0), which represents a population at G_{0} with 1A (fA = 0.25), by the matrix, which yields a new vector (3/14, 8/14, 3/14, 0, 0). This means that the probability of finding, at G_{1}, a population with fA = 0 is 3/14, with fA = 0.25 is 8/14, with fA = 0.5 is 3/14, with fA = 0.75 is 0 and with fA = 1 is 0. We developed a computer program limited to a population with a maximum of 185 bacteria and compared the results of the Model with those obtained by neutralists for the average number of generations to reach fixation, loss and the maintenance of the polymorphic state. The program iterates the multiplication of the matrix by a vector until the difference between consecutive non-zero elements of the vector is less than 10^{-18}. If we adapt the formulas created for diploid organisms to bacteria and consider the effective number as the number of bacteria in the population; denote as x the initial frequency of the mutant, LG_{a} as the average (number of) Gs to loss, FG_{a} as the average G to fixation, and PG_{a} as the average G for a polymorphism to stay in the population, we have: LG_{a} = - [4N/(1-x)][xlog_{e}(x)] (Kimura and Ohta, 1969a, b); FG_{a} = - (4N/x)[(1-x)log_{e}(1-x)] (Kimura and Ohta, 1969a); PG_{a} = - 4N[xlog_{e}x + (1-x)log_{e}(1-x)] (Kimura and Ohta, 1973). Table 3 presents the comparison of values calculated with the two methods when the original x is 1/N (one mutant A) and when x = ½.
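The stochastic matrix of the Model (duplication to 2N bacteria followed by the random death of N of them) and the one-generation step quoted above for 4 bacteria can be sketched with exact fractions. This is an illustrative reconstruction from the verbal description of the transition probability, not the author's program; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def transition_row(i, n):
    """P(J | I = i) for J = 0..n: the i copies of A become 2i among 2n cells,
    and n of the 2n cells survive at random."""
    total = comb(2 * n, n)
    return [Fraction(comb(2 * i, j) * comb(2 * n - 2 * i, n - j), total)
            for j in range(n + 1)]


def transition_matrix(n):
    """Rows indexed by the current number of A (I), columns by the next (J)."""
    return [transition_row(i, n) for i in range(n + 1)]


def step(vector, matrix):
    """Advance the distribution of populations by one generation."""
    size = len(matrix)
    return [sum(vector[i] * matrix[i][j] for i in range(size))
            for j in range(size)]


matrix = transition_matrix(4)
start = [0, 1, 0, 0, 0]                  # one mutant A among 4 bacteria
print([str(p) for p in step(start, matrix)])
```

The first and last rows of the matrix place all the probability on their own state, which is what makes loss and fixation absorbing when there is no recurrent mutation.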
We worked with powers of 2 until 128, and named KaO the parameters calculated by Profs. M. Kimura and T. Ohta. We see in the case of beginning with one mutant that our Model gives slightly smaller LG_{a}s and FG_{a}s than KaO, and it seems they approach one another asymptotically (this is very probably due to the difference between continuous and discrete models). However, estimates of PG_{a} differ greatly in both models, being smaller in the Model until 4 bacteria and greater thereafter. When the initial allele frequency is 0.5 estimates of LG_{a} and FG_{a} were again very similar in both models and disagreed in PG_{a} that was systematically lower in the Model. We observe an intriguing situation, in contradiction to the expectancy, which is the equality of LG_{a}, FG_{a} and PG_{a} in KaO. In the Model LG_{a} and FG_{a} are equal and greater than PG_{a} as expected. Turning to formulas we see that the calculation of KaO PG_{a} is a composite of equations for loss and fixation and tends to loss for small values of x and to loss or fixation for values of x approaching ½. This last equality is erroneous because LG_{a} and FG_{a} should necessarily be greater than PG_{a}, given that the loss of A or the fixation of A implies the extinction of the polymorphism. Also, the assimilation of PG to LG when x is low is an error, because the probability of extinction of a mutant A at the first generation increases to the limit ¼ as N grows {limit of (N-1)/[2(2N-1)]}; thus the polymorphism must be maintained during more generations because several copies of A (A) begin their way to fixation in more generations.
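The behaviour of the KaO formulas discussed above can be inspected numerically. The sketch below simply evaluates the three expressions as they are written in the text, together with the first-generation loss probability (N-1)/[2(2N-1)] of the Model; the function names are hypothetical and the values printed are not those of Table 3.

```python
import math


def kimura_ohta_times(n, x):
    """LG_a, FG_a and PG_a as written in the text (N = n, initial frequency x)."""
    loss = -(4 * n / (1 - x)) * x * math.log(x)
    fixation = -(4 * n / x) * (1 - x) * math.log(1 - x)
    persistence = -4 * n * (x * math.log(x) + (1 - x) * math.log(1 - x))
    return loss, fixation, persistence


def first_generation_loss(n):
    """Probability that a single mutant is lost in one generation of the Model."""
    return (n - 1) / (2 * (2 * n - 1))


for n in (4, 32, 128):
    print(n, kimura_ohta_times(n, 0.5), kimura_ohta_times(n, 1 / n),
          first_generation_loss(n))
```

Written this way, PG_{a} is algebraically x times FG_{a} plus (1-x) times LG_{a}, so at x = ½ the three expressions coincide at 4N log_{e}2, which is the equality the text objects to.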
When we examine the origin of the formulas we discover that the authors used continuous math and integrated gene frequencies between 0 and 1; thus the classes 0 and 1 are included (as a limit); this disagrees with the Model, which uses discrete math and, for the permanence of the polymorphism, uses the reduced stochastic matrix without the classes 0 and 1 for I and J (see Appendix 2). The composition of formulas of loss and fixation to obtain a formula for polymorphism is invalid. It is necessary first to obtain the probability spaces (the sum of the proportion of the population that is polymorphic at power 0, 1, 2, ...; in technical terms the probability space for the moments of the distribution) for the polymorphism (the inner value of frequencies excluding the 0 and 1 classes), and then calculate the probabilities. These probability spaces were never obtained for KaO or by neutralists until now, as far as we could find (Hartl and Clark, 2007). Moreover, this space cannot be obtained by continuous math without excluding 0 and 1; in one generation this is not a perceptible error, but this does not hold over the infinite number of generations for which all G_{a}s should be calculated. Other differences between the models may be due to the fact that frequencies in the Model move on rational numbers and frequencies of KaO move on real numbers (this is physically impossible, a difference seen by Wright, 1931). Another qualitative difference arises from the fact that all the previous models use the binomial distribution (with the standard error of the estimate) to calculate the variance of a frequency from G_{n} to G_{(n+1)}. The binomial distribution allows any gene frequency to vary in the next generation from 0 to 1 (in a continuous interval); this is not possible in the stochastic matrices of this Model, in which only the frequency 0.5, and for even Ns, allows all the (discrete) values in the next generation (when fA = N/2).
The stochastic matrix based on the binomial distribution (without the standard error) is presented in Appendix 2, together with the reduced matrices (without extinction states). Kimura (1957) developed this matrix method, but left it and searched for continuous methods developed to study stochastic processes such as Brownian motion. They are called diffusion processes and are studied by diffusion equations, particularly by Kolmogorov, whom Prof. Kimura followed (Kimura 1957; Crow and Kimura, 1970). As we saw, the fate or destiny of one mutant was the paradigm he and neutralists used. They chose, with bias, random fixation, which is only valid for the evolution of a mutant without recurrent mutation, as the center of the evolutionary explanation. They consider the absorption states of the stochastic matrices synonymous with fixation and loss. They never included the simultaneous process of mutation by putting together the stochastic matrix of variation (just described) with the stochastic matrix of mutation. We searched unfruitfully for the inclusion of the mutation matrix in the model; it is absent in the literature. Thus, our next step was to develop this matrix and study the bacterial evolution with both matrices and with the selection vector to complete Wright's evolutionary discrete model.
TABLE 3. Average number of generations until loss (LG_{a}), fixation (FG_{a}) or polymorphism maintenance (PG_{a}), compared between the Model and Kimura and Ohta's (KaO) models
N = number of bacteria
Equilibrium between forward-backward mutation and drift. The stochastic mutation matrix for four bacteria is presented in Appendix 3; m is the forward and backward mutation rate. This matrix is sufficiently complicated that obtaining its spectral analysis is OSA; we analyzed it by computer. Mutation is a discrete event; it is the basis of evolution (including fusion or combination of genomes), thus evolution is always (at the micro level) a jumping or saltation process. With continuous equations, the predicted distribution of populations according to their base frequencies in a site, at equilibrium when there are only mutation and drift, depends on N. If N < [1/(4m)] a U-shaped distribution of populations (distribution frequencies) is expected. If N = [1/(4m)] a uniform distribution is expected. If N > [1/(4m)] a bell-shaped distribution is expected (Wright, 1931, 1969; Crow and Kimura, 1970; Jacquard, 1970; Nei, 1987). Table 4 presents the probabilities of finding different populations (different fA) at equilibrium between forward-backward recurrent mutation of equal rates and variation of fA by drift. Our program is restricted by N (less than 185), thus we vary m and work with N = 32. Tables are preferred to figures because very small differences (10^{-3} to 10^{-5}) show important properties of the system. These are unrealistic mutation rates (too high), but the population sizes are even less realistic; a test tube may have 10^{8} bacteria. In the following, NTE and N-NTE are tested by extrapolating proportionally the results obtained with 32 bacteria and a given variable m to populations of bacteria living in a lake or sea (N above 10^{10}).
TABLE 4. Expected probability distribution of populations according to the frequency of A (fA). N = 32 individuals, m = 16/32; 4/32; 1/32; 1/(4x32); 1/(16x32) and 1/(64x32). Only half (fA from 0.5 to 1.0) of the distribution is presented (the distribution is symmetrical)
Case when m = 16/N (0.5). Populations with loss or fixation of A (remember that they are transitory states) happen with probability (P) 2.3x10^{-10}, and the six most extreme classes happen with probabilities less than 10^{-4}. The class with fA = 0.5 happens with P = 0.13975; as expected, this is a bell-shaped curve. Thus NTE and N-NTE are refuted, because in a lake, sea or ocean we find populations of unicellular organisms of the order of 10^{12}-10^{18} individuals with mutation rates between 10^{-6} and 10^{-8} mutants/site/generation. With N = 128 and m = 16/128 the classes fA = 0 or 1 occur with probability 10^{-28} and the class fA = 0.5 happens with P = 0.05486; NTE and N-NTE are refuted because most populations of unicellular species have 98% of sites fixed with one base. Case N = 32, m = 4/N; populations where A is lost or fixed occur with P = 1.6x10^{-7}; populations with fA = 0.5 occur with P = 0.10876 and polymorphic populations occur with P = 0.9999968; this is a bell-shaped distribution, but a tendency to a plateau rather than a peaked shape is seen at the center and top; NTE and N-NTE are refuted. Case N = 32, m = 1/N (m is similar to the inverse of N; this is the proposal of the N-NTE); loss or fixation occur with P = 0.00032; for fA = 0.5, P = 0.0634; it is a bell-shaped curve with a plateau at the top; polymorphic states occur with P = 0.99936; the polymorphism of all the bases at all the sites is the rule; NTE and N-NTE are refuted. Case N = 32, m = 1/(4N); as expected there is a nearly uniform distribution, but the maximum values are found for loss and fixation (P = 0.03246) and a very flattened bell is seen in the internal classes with a maximum at fA = 0.5 (P = 0.03063); transitory fixation of A or B is seen with P = 0.06492 and polymorphic populations with P = 0.96937; NTE and N-NTE are refuted.
Case N = 32, m = 1/(16xN); fixed A or B occur with P = 0.4253 and polymorphic populations with P = 0.5747; most populations should be found polymorphic at all the sites, this refutes NTE and N-NTE. These cases are produced by mutation and drift alone, where predictions of NTE and N-NTE are expected to have maximal validity. Case N = 32, m = 1/(64x32); fixation or loss occur with P = 0.3961; polymorphisms occur with P = 0.2078; NTE and N-NTE are refuted with real populations.
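The mutation-drift equilibrium discussed in these cases can be explored with a small numerical sketch. It assumes that symmetric mutation at rate m acts deterministically on the frequency of A and is followed by binomial sampling of N bacteria; this is a simplification that need not coincide with the matrices of Appendices 2 and 3, so its output is not expected to reproduce Table 4 digit by digit. Function names are hypothetical.

```python
from math import comb


def mutation_drift_matrix(n, m):
    """Transition matrix over the number of A copies (0..n) per generation.

    Assumption: forward and backward mutation (rate m) first shift the
    frequency of A, then drift is modelled by binomial sampling of n bacteria.
    """
    rows = []
    for i in range(n + 1):
        p = i / n
        p_mut = p * (1.0 - m) + (1.0 - p) * m
        rows.append(
            [comb(n, j) * p_mut ** j * (1.0 - p_mut) ** (n - j) for j in range(n + 1)]
        )
    return rows


def square(mat):
    """Square a stochastic matrix and renormalise rows against rounding drift."""
    cols = list(zip(*mat))
    out = []
    for row in mat:
        new_row = [sum(a * b for a, b in zip(row, col)) for col in cols]
        total = sum(new_row)
        out.append([x / total for x in new_row])
    return out


def stationary_distribution(n, m, squarings=25):
    """Equilibrium distribution of populations by number of A copies,
    approximated by raising the matrix to the power 2**squarings."""
    mat = mutation_drift_matrix(n, m)
    for _ in range(squarings):
        mat = square(mat)
    return mat[0]


if __name__ == "__main__":
    n = 32
    for m in (16 / 32, 1 / 32, 1 / (64 * 32)):
        dist = stationary_distribution(n, m)
        print(f"m={m:.6f}  P(fA=0 or 1)={dist[0] + dist[n]:.6g}  P(fA=0.5)={dist[n // 2]:.6g}")
```

Under these assumptions the classes 0 and N are never absorbing when m > 0: the distribution is bell-shaped for large m and U-shaped for small m, with the extreme classes acting as transitory states.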
Mutation, Drift and Selection. Before adding selection, we should refer to errors due to the application of math to fitness (here a more conceptual and formal treatment than in the Introduction is performed). Most models work with relative fitness. Here, we are going to work with absolute fitness. We remark that for the STE fitness is variable and not fixed as in the NTE and N-NTE. Also alleles or bases are ephemeral for STE and permanent for NTE and N-NTE. Selection is a process of differential fitness due to genetic factors; in general, selection has been mathematically expressed as a coefficient that is the complement to 1 of the relative fitness. As we saw, with absolute fitness or selection NTE and N-NTE cannot be sustained. The fixation of an absolute negatively selected base leads to the extinction of the group, population, species or taxon; biotically (in the absolute sense), its probability of fixation is always zero. Only positively selected genes or bases are viable; because neutral bases produce unstable populations that decrease their size by irreversible contingent population reductions. To counterbalance adverse contingencies populations need individuals with absolute fitness over 1 (which implies exponential growth). The dynamics of populations is a continuous fluctuation of growth and decrease due to genomic and environmental interactions where genomic factors must be warranted by fitness over 1 to compensate either mild or severe environmental changes, whether continuous, intermittent or cataclysmic. The conviction of neutralists that positive selection is rare came from the use of relative fitness. With absolute fitness advantageous bases (at least one) are the necessary condition. We introduced selection (Table 5) to compare models of Table 4.
TABLE 5. Expected probability distribution of populations according to the frequency of A (fA). N = 32, m = 16/32; 4/32; 1/32; 1/(4x32) and 1/(16x32); coefficient of selection s = +0.1; fA from 0.5 to 1
* Expected value without selection of 0.1 in favour of A.
Case N = 32, m = 16/32, positive selection coefficient (s) for A = 0.1. There is a slight increase in the probability of finding populations with a higher frequency of A; it is evident that recurrent mutation is the more important factor. Case N = 32, m = 4/32, s = 0.1; the higher probability of populations with higher fA is more evident, but the action of mutation is still predominant; note the decrease in the probability of the class with 16 A. Case N = 32, m = 1/32, s = 0.1 is remarkable; the distribution without selection was uniform; selection clearly distorts this distribution by increasing the probabilities of populations with higher fA. Case N = 32, m = 1/(4x32), s = 0.1; the deviation towards higher fA is evident, the most frequent populations are those with 21 A or more; the bell is replaced by a J distribution. Case N = 32, m = 1/(16x32); most populations are found in the transitory fixed state (P = 0.5622, fA = 1). Case N = 32, m = 1/(64x32), s = 0.1; the maximum in class 32 (fA = 1) is remarkable (P = 0.8651).
Fixation and loss are impossible with recurrent forward and backward mutations. What we see is the resilient equilibrium of frequencies near 1 or 0. Neutralists read the articles of Wright (1931, 1969) and Feller (1951) stating that fixation and loss were impossible under recurrent mutation, and selectively took only the references to the case of the destiny of one mutant (Kimura, 1968, 1991; King and Jukes, 1969; and practically all the subsequent articles), with the effect of drift isolated from synchronous recurrent mutation. The confusion of fixation and loss with the respective absorption states of stochastic matrices is a regrettable misapplication of math to biotic processes.
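The notion of a resilient equilibrium near 1 can be illustrated with the deterministic recursion given in the appendix of this article for selection against T+G+C, p_{1} = [p_{0}(1-v-u)+v]/{1-s[1-p_{0}(1-u-v)-v]} with v = u/3. The sketch below iterates that recursion from very different starting frequencies and compares the result with the positive root of the quadratic obtained by setting p_{1} = p_{0}. The quadratic coefficients are our own rearrangement of the recursion, and the values of u and s are hypothetical, chosen only for illustration.

```python
from math import sqrt


def next_frequency(p, u, s):
    """One generation of the appendix recursion (selection against T+G+C)."""
    v = u / 3.0                      # back mutation rate to A, as in the appendix
    k = 1.0 - u - v
    return (p * k + v) / (1.0 - s * (1.0 - p * k - v))


def equilibrium_by_iteration(p0, u, s, tol=1e-13, max_generations=1_000_000):
    """Iterate the recursion until the frequency of A stops changing."""
    p = p0
    for _ in range(max_generations):
        p_next = next_frequency(p, u, s)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


def equilibrium_by_quadratic(u, s):
    """Positive root of s*k*p**2 + (u + v - s*(1 - v))*p - v = 0, with
    k = 1 - u - v (a rearrangement of p1 = p0; assumes s > 0)."""
    v = u / 3.0
    k = 1.0 - u - v
    a, b, c = s * k, u + v - s * (1.0 - v), -v
    return (-b + sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


if __name__ == "__main__":
    u, s = 1e-3, 0.1                 # hypothetical illustrative values
    for p0 in (0.001, 0.5, 0.999):
        print(p0, equilibrium_by_iteration(p0, u, s))
    print("quadratic root:", equilibrium_by_quadratic(u, s))
```

With these values every starting frequency converges to the same equilibrium, close to but below 1, which is the sense in which the equilibrium is resilient: A is never strictly fixed and T+G+C is never strictly lost.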
ORDER, DISORDER, RANDOMNESS, CHAOS, ENTROPY, CONTINGENCY, NOMOLOGY AND IDIOGRAPHY.
In our treatment there is an underlying epistemic disagreement with current studies. For us, "random" factual processes and entropy are perfectly ordered processes, and they constitute, very often, resilient states from which the processes cannot differ too much. Randomness is the basis for statistics and the calculation of significance; it generates exact stochastically predictable results. Entropy is the tendency of universal processes to go from "non-randomly" to "randomly" ordered conditions. Disorder or Ontic-Chaos does not exist (OSA); the occurrence of anything, at any place, in any circumstance and condition is not a feature of nature. The gnosic-chaos (the impossibility to know a trajectory, http://en.wikipedia.org/wiki/Chaos_theory, butterfly effect) is not chaos, the name for these processes is regrettable. Nature has regularities or nomology (laws), but it has also idiographic properties irreducible to nomology (initial historical conditions, relative positions, specific compositions and functions, etc.). Sudden or catastrophic events are often assumed to be random events; this is an error, they are contingent events that occur rarely at random. Kimura (1991) mentioned that Prof. Wright thought of randomness or drift shifting biotic systems among adaptive peaks; this may seldom occur, the majority of shifting is due to contingency. Prof. Wright was wrong at this point because randomness has its own resilient equilibrium and yields reversible processes (fluctuations between adaptive peaks). Thus, drift may be a physical (second law of thermodynamics) but not a biotic directional factor, because it is not in the direction of life; a solution in chemical equilibrium (Brownian motion) cannot generate or maintain life.
INTERACTIONS AND PERIODICITIES OF NUCLEOTIDES
To know the physical co-adaptation of a base with all the others (the residual genome) of a genome, we studied the interaction of bases in pairs of nucleotides separated by 1, 2, ..., K sites. We showed that any known nucleotide site is mostly maintained, in populations, in resilient mutation-selection equilibrium with random variations around it (the core of STE). Sites are neither alone nor independent of one another. The neutral expectancy is no interaction at all. However, we see strong interactions among them; these interactions occur in transcription and in the genetic code, showing co-adaptation. Moreover, we can search for direct interactions among sites without reference to protein coding or regulatory functions. Any new mutant base is expected to be tested by the whole residual genome for its acceptability. A change of base is a pleiotropic event affecting several phenotypes: velocity of replication, transcription, degradation of DNA and its products, structural changes of DNA and RNA, recognition macromolecules interacting with the changed site, changes in protein coding or regulatory functions, and so on. These changes are inherited in a Mendelian, Lamarckian or other manner; mutations are acquired and then transmitted by inheritance (Lamarck); the influence of a base on the velocity of replication or transcription is not necessarily Mendelian, nor Lamarckian. High non-random internucleotide interactions are then expected, as they were found in Bernardi's isochores (Bernardi, 1993, 2007) and Karlin's signatures (Karlin and Mrazek, 1997).
We found strong interactions among the nucleotides of a dinucleotide when they were separated by 0 (consecutive), 1, 2, ..., K nucleotide sites (Valenzuela, 2009, 2010, 2011, 2012; Valenzuela et al., 2010). We also found a periodicity every 3 sites, not related to protein coding functions, in prokaryote DNA and mtDNA but not in eukaryote DNA (we recently found periodicities in eukaryotes, Solar et al., 2012). Selective periodicities and interactions (periodicities and interactions measured by the distance to randomness) were found for nucleotides separated by more than 1,000 sites, indicating a generalized inter-nucleotide co-adaptation. Table 6 shows the analysis of a DNA segment of the archaeon Methanobrevibacter smithii (Valenzuela, 2012; AN = CP000678, REGION: 249362..255559, 6198 bp). The analysis is performed on dinucleotides whose bases are separated by 37 sites (Valenzuela, 2009, 2010, 2011, 2012). We measured evolutionary fitness and positive, negative and neutral selection by comparing the observed proportion of dinucleotides with the expected random proportion calculated from the frequency of bases of this DNA segment. Here, phenotypes are the classes of dinucleotide; the absolute fitness equal to 1 is given by the expected frequency of the pair (the random expected genome of the bacterium); below and above this value the pair is negatively and positively selected, respectively. We can assume that this genome has been like the present one for no less than 100,000 years, that is, during 10^{8} prokaryote generations (1,000 replications a year), or 100 mutational cycles. During this period, an average of 100 substitutions have occurred at any site, and all the nucleotide pairs have been produced; some of them have been positively selected (more observed than expected) and others negatively selected (fewer observed than expected).
From a chemical classification, 8 pairs (dinucleotides) are homologous (purine-purine or pyrimidine-pyrimidine) and 8 pairs are heterologous (purine-pyrimidine or pyrimidine-purine). The 8 homologous and the 8 heterologous pairs were positively and negatively selected, respectively; there are 12,870 ways of distributing 8 + signs and 8 - signs among 16 pairs, and only one coincides with the 8 homologous and 8 heterologous pairs (probability P = 1/12,870 = 0.000078), a highly significant nonrandom event. This completes the refutation of NTE and N-NTE, which proposed that positive selection is rarely seen and that few bases were acquired and maintained by either positive or negative selection of considerable force. We see that all the values for the coefficients of selection lie between -0.4 and +0.4; purifying selection was not seen (but this is not the most appropriate test to see it); nor were small selection coefficients found. This agrees with, and is a good example of, equilibrium between mutation and selection as the STE proposes. Moreover, this is a clear demonstration of nucleotide co-adaptation throughout all the sites of this prokaryote DNA segment. Thus the most important selector for a mutant base is the residual genome, which in the end "decides" its degree of acceptance or rejection. This result is in complete contradiction with the vision of Kimura (1979): "The laws governing molecular evolution are clearly different from those governing phenotypic evolution". Table 6 is not in complete agreement with the STE, but it is not in contradiction with it. It is in contradiction with the Darwinian view, because selection is mostly endogenous and endogenous processes of selection were not treated by Darwin (this required Mendelian genetics).
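The kind of comparison described above (observed dinucleotides at a given separation against the random expectation from base frequencies) can be sketched as follows. This is only an illustration under stated assumptions: the selection coefficient is taken as observed/expected minus 1, a separation of K means K sites between the two bases, and the function name and the toy sequence are hypothetical; the published analyses may differ in detail.

```python
from collections import Counter
from itertools import product
from math import comb


def dinucleotide_selection(seq, separation):
    """Observed/expected - 1 for every pair of bases separated by
    `separation` sites (0 means consecutive bases).

    Expected counts assume independence: f(first) * f(second) * number of pairs.
    """
    seq = seq.upper()
    step = separation + 1
    n_pairs = len(seq) - step
    if n_pairs <= 0:
        raise ValueError("sequence too short for this separation")
    base_freq = {base: count / len(seq) for base, count in Counter(seq).items()}
    observed = Counter(seq[i] + seq[i + step] for i in range(n_pairs))
    coefficients = {}
    for first, second in product(base_freq, repeat=2):
        expected = base_freq[first] * base_freq[second] * n_pairs
        coefficients[first + second] = observed[first + second] / expected - 1.0
    return coefficients


# Number of ways to place 8 "+" and 8 "-" signs among 16 dinucleotides.
sign_arrangements = comb(16, 8)
sign_pattern_probability = 1 / sign_arrangements

if __name__ == "__main__":
    toy = "ATGCGATATTAGCCGATAGCTAGGATCCGATATGC" * 20   # hypothetical sequence
    for pair, coeff in sorted(dinucleotide_selection(toy, 37).items()):
        print(pair, round(coeff, 3))
    print(sign_arrangements, sign_pattern_probability)
```

A coefficient above 0 corresponds to a pair observed more often than expected (read in the text as positive selection) and a coefficient below 0 to a pair observed less often than expected.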
The previous analyses and Table 6 show that at any site there is a resilient equilibrium of base frequencies co-adapted to their residual genome, where each site is in turn in a resilient equilibrium, in a tense dynamic process of turnover adapted to environmental conditions. The genome is also an integrated phenotype (Bernardi, 2007). This tense dynamic system is exquisitely sensitive to environmental changes. If malaria is introduced in a human population, one hundred generations should be sufficient to appreciate a significant level of the sickle cell anemia gene. Without this evolutionary tension, neutral evolution does not allow this fast population response. The neutralist vision of lethal bases as a source of purifying selection, and in general of selection as a negative force in evolution is contradictory with this present vision. From our perspective, any mutant is an invaluable trial element of variability, screened by the residual genome, and incorporated in the population through individuals whose fitness may be of any value, including zero. This vision introduces the fitness of groups, populations, species or taxa considered as a whole. On the other hand we know from clinical genetics that the lethality of a mutant base or allele is relative and depends on the residual genome; variable expressivity and incomplete penetrance are rather the rule than exception. Any pleiotropic phenotype effect of a mutant results from the complex interaction with the whole genome and its environment. Thus, recessiveness in diploid organisms and incomplete penetrance are also mechanisms to protect transitory lethal or semilethal mutants conserving variability in the population; they are, then, mechanisms of population adaptation and a milestone in evolution and polymorphism maintenance.
PHYLOGENIES AND THE MOLECULAR CLOCK
A stochastically constant velocity of substitutions (erroneously presented as measured by the fixations found in a set of phylogeny data) has been assumed to be the definitive proof of neutral evolution. This assumption is founded on several irreparable errors. We should keep in mind that phylogenies are constructed with molecular or more organized taxonomic traits that are "fixed" (monomorphic) for each particular taxon in the phylogeny. Thus, traits for phylogenetic analyses are selective fixations acquired and maintained by adaptive processes, integrated in the respective genomes for different and unknown numbers of generations. 1) The inclusion of fixations in a phylogeny without considering that they are adaptive traits is an error that creates circular epistemic thinking, because this procedure assumes that they are neutral traits (to be a plant is as adaptive or selective as to be a human, but this is not considered in the analyses). 2) Fitness, selection or adaptation cannot be studied among taxa that do not share a genomic pool; it is absurd to compare the fitness of ants, oaks, streptococci and humans, because they are all alive and adapted (and so endo-adapted) to their respective environments (or genomes). To pretend that phylogenies so constructed inform on the neutrality or selectiveness of being an ant, oak, streptococcus or human is ridiculous. 3) The different traits used to construct the phylogeny have remained selectively fixed during very different, unknown numbers of generations; to consider them all as qualitatively and equally fixed is a big error when estimating precisely the time during which they have been fixed; this is to mix ants, oaks, streptococci and humans in the same bag. This procedure throws a cloud of black India ink over the selective or neutral processes that led the different traits used to substitute for other ones and to remain fixed until their analysis.
If all the selective processes are deliberately ignored, the conclusion on their neutrality is a trivial consequence. 4) A molecular clock and the velocity of historical changes may be calculated or estimated regardless of the selective or neutral condition of the traits involved (historically), or their selective value should be estimated according to the number of descendants. An average can always be obtained. 5) Neutral evolution excludes the punctuated equilibria model (Gould and Eldredge, 1977), because in stasis, gradual or non-evolutionary events are produced, but during punctuation a process of fast evolution happens. 6) Neutrality of evolution cannot be tested by the average but by the variance of the time of acquisition of taxonomic traits; in neutral evolution the mean should be equal to the variance (a Poisson distribution), but studies have revealed systematically larger variances than means (Ayala et al., 1996; Ohta, 1992; Ohta and Gillespie, 1996; Ayala, 2000; Nei, 2005; Bedford and Hartl, 2008; Nei et al., 2010). 7) Neutral (random) evolution implies full reversibility, because in a site the four bases have an equal probability of being found; the conversion of prokaryotes into eukaryotes is as probable as the inverse process. Reversibility has never been found in the biotic world; we see the adaptive process of convergence instead. No phylogeny is possible under neutral evolution, except a completely random one without a root. 8) Phylogeny analyses often do not weigh similarities, they analyze differences; they should express differences weighed by genome similarities. As we saw (Introduction), phylogenies describe the shifts of adaptive peaks of taxa (Wright, 1969, 1988; Seaborg, 2010) rather than differences in fixed nucleotides.
Criticisms of Wright's adaptive peaks have considered them a wrong metaphor (Pigliucci, 2008); however, that article does not consider that adaptive peaks, as resilient equilibrium systems, are physical and biotic processes, not metaphors; it makes no mention of resilient equilibria. Phylogeny nodes are shifts of the adaptive landmarks in the adaptive landscape. If resilient equilibrium between selection and mutation of the four bases is kept in all the genome sites by hundreds or thousands of different contingent selective factors, this system can, by the central limit theorem, simulate a random molecular clock, be irreversible and give stable phylogenies.
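Point 6 above rests on a simple statistic: under a Poisson clock the variance of the number of substitutions among lineages should equal its mean, so the variance-to-mean ratio (index of dispersion) should be close to 1. A minimal sketch of that check follows; the substitution counts are hypothetical numbers invented for illustration, not data from the cited studies.

```python
from statistics import mean, variance


def dispersion_index(substitution_counts):
    """Sample variance divided by the mean of substitution counts per lineage.

    A value near 1 is what a Poisson process predicts; values well above 1
    indicate overdispersion of the clock.
    """
    return variance(substitution_counts) / mean(substitution_counts)


if __name__ == "__main__":
    # Hypothetical substitution counts in six lineages over the same time span.
    counts = [12, 31, 8, 27, 15, 40]
    print(round(dispersion_index(counts), 2))
```

The ratio alone does not say why a clock is overdispersed; it only shows whether the Poisson expectation of equal mean and variance holds for the lineages compared.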
DO WE NEED A NEW SYNTHESIS FOR EVOLUTION?
Most conceptualizations in this article are in the STE and some in the NTE and N-NTE. Some ideas are new or not fully incorporated in any theory. 1) Natural selection is a process independent of evolution; it may lead to changes, to the maintenance of structures, organizations or equilibriums or to other evolutionarily independent results. 2) The main role of natural selection in evolution is not the "creation of variability" that is always due to mutation and the environmental conditions, but to maintain the acquired organizations of living beings. Accepted variation is also a role of natural selection but if and only if it contributes to the maintenance of life, as for example, we are living beings, eukaryote, multi-cellular, animal, chordate, mammal, primate, Homo sapiens; any of our babies should maintain all these and several other stable characters. Evolution is by far more maintenance than variation. 3) The measure of selection or fitness by the number of descendants is not always true; the dynamic velocity of reproduction according to genotypes should be considered. 4) In living beings nothing is neutral; they are improbable beings that need to fight continuously against entropy; any base should collaborate in their life maintenance or induce the individual to cancer, aging or genome death. 5) Genomes are unstable during individual lives, but more stable in the phylogeny; this reality deserves to be studied in depth. 6) In a wide sense mutation includes genome fusion and fission, transversal genome integration (Valenzuela, 2002a, b; Frias-Lasserre, 2012); Darwin's tree of life is a part of the evolutionary process; we should add intermixing branching processes (the tree-network of life or biotic tree-network). 7) The STE assumed that environmental changes are the main selective factor; this is partially true; the most important selector for a locus or site is the residual genome. 
Mutations (including simultaneous mutations) interacting with the residual genome in the same environment may create new genomes with equal or greater fitness than the previous ones (optimizing mutations) and replace those previous ones. Some authors think that epigenetics, coding or non-coding (for protein synthesis and its regulation) segments, miRNA and other important processes recently discovered should be added to the list of new evolutionary processes. However, all these processes depend finally on the resilient equilibrium of DNA sites, because if bases or mutants are ephemeral in individuals, these factors are still more ephemeral during the cell cycle or individual development. For example, some epigenetic processes depend on the specific site of methylation and on methylases that are DNA coded. Whether these new evolutionary features deserve a new synthesis is a fascinating debate. According to the previous analyses the answer is no, but we need a lot of research on idiographic historical evolutionary events; the STE with this molecular approach is mostly sufficient.
REFERENCES
AYALA FJ (2000) Neutralism and selectionism: the molecular clock. Gene 261: 27-33.
AYALA FJ, BARRIO E, KWIATOWSKI J (1996) Molecular clock or erratic evolution? A tale of two genes. PNAS (USA) 93: 11729-11734.
BAER CF, MIYAMOTO MM, DENVER DR (2007) Mutation rate variation in multicellular eukaryotes: causes and consequences. Nature Reviews Genetics 8: 619-631.
BEDFORD T, HARTL DL (2008) Overdispersion of the molecular clock: temporal variation of gene-specific substitution rates in Drosophila. Mol Biol Evol 25: 1631-1638.
BERNARDI G (1993) The vertebrate genome: isochores and evolution. Mol Biol Evol 10: 186-204.
BERNARDI G (2007) The neoselectionist theory of genome evolution. PNAS 104: 8385-8390.
BROCCHIERI L, KARLIN S (2005) Protein length in eukaryotic and prokaryotic proteomes. Nuc Ac Res 33: 3390-3400.
CROW JF (1995) Motoo Kimura (1924-1994). Genetics 146: 1-5.
CROW JF, KIMURA M (1970) An Introduction to Population Genetics Theory. Harper and Row, New York, NY.
FELLER W (1951) Diffusion processes in Genetics. 2^{nd} Berkeley Symposium on Mathematics, Statistics and Probability, University of California Press, pp 227-246.
FELLER W (1968) An introduction to probability theory and its applications, 3^{rd} edition. John Wiley & Sons, New York, NY, pp 38-42.
FRIAS-LASSERRE D (2012) Noncoding RNAs and viruses in the framework of the phylogeny of genes, epigenesis and heredity. Int J Mol Sci 13: 477-490.
GOUET R, LÓPEZ-FENNER J (1985) Modelo markoviano de evolución cromosómica. Publicaciones Técnicas MA-85-B-328, Departamento de Ingeniería Matemática, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Santiago, Chile.
GOUET R, LÓPEZ-FENNER J (1986) Evolución markoviana de un cariotipo. Rev Soc Chil Estad 3: 1-25.
GOULD SJ (2002) The structure of evolutionary theory. Belknap Harvard, Cambridge, MA (USA).
GOULD SJ, ELDREDGE N (1977) Punctuated equilibria: The tempo and mode of evolution reconsidered. Paleobiology 3: 115-151.
HARTL DL, CLARK AG (2007) Principles of Population Genetics, 4th Ed. Sinauer, Sunderland, MA, USA.
JACQUARD A (1970) The genetic structure of populations. Biomathematics Volume 5. New York, Springer-Verlag, pp 388-419.
KARLIN S, MRAZEK J (1997) Compositional differences within and between eukaryotic genomes. Proc Natl Acad Sci USA 94: 10227-10232.
KIMURA M (1957) Some problems of stochastic processes in Genetics. Ann Math Stat 28: 882-901.
KIMURA M (1968) Evolutionary rate at the molecular level. Nature 217: 624-626.
KIMURA M (1979) The neutral theory of molecular evolution. Sci Am 241: 94-104.
KIMURA M (1991a) Recent development of the neutral theory viewed from the Wrightian tradition of theoretical population genetics. Proc Nat Acad Sci (USA) 88: 5969-5973.
KIMURA M (1991b) The neutral theory of molecular evolution: A review of recent evidence. Jpn J Genet 66: 367-386.
KIMURA M (1993) Retrospective of the last quarter century of the neutral theory. Jpn J Genet 68: 521-528.
KIMURA M, OHTA T (1969a) The average number of generations until fixation of a mutant gene in a finite population. Genetics 61: 763-771.
KIMURA M, OHTA T (1969b) The average number of generations until extinction of an individual mutant gene in a finite population. Genetics 63: 701-709.
KIMURA M, OHTA T (1973) The age of a neutral mutant persisting in a finite population. Genetics 75: 199-212.
KING JL, JUKES TH (1969) Non-Darwinian evolution. Science 164: 788-798.
LI WH (1997) Molecular Evolution. Sinauer Associates, Sunderland, pp 5978.
NEI M (1987) Molecular Evolutionary Genetics. New York, NY: Columbia University Press.
NEI M (2005) Selectionism and neutralism in molecular evolution. Mol Biol Evol 22: 2318-2342.
NEI M, SUZUKI Y, NOZAWA M (2010) The neutral theory of molecular evolution in the genomic era. Annu Rev Genomics Hum Genet. Jun 21 (on line) PMID: 20565254.
OHTA T (1992) The Nearly Neutral Theory of molecular evolution. Annu Rev Ecol Syst 23: 263-286.
OHTA T (2002) Near-neutrality in evolution of genes and gene regulation. PNAS 99: 16134-16137.
OHTA T (2003) Origin of the neutral and nearly neutral theories of evolution. J Biosci 28: 371-377.
OHTA T, GILLESPIE JH (1996) Development of Neutral and Nearly Neutral Theories. Theor Popul Biol 49: 128-142.
PIGLIUCCI M (2008) Sewall Wright's adaptive landscapes: 1932 vs. 1988. Biol Philos 23: 501-603.
SEABORG DM (2010) Was Wright right? The canonical genetic code is an empirical example of an adaptive peak in nature; deviant genetic codes evolved using adaptive bridges. J Mol Evol 71: 87-99.
SOLAR H, TOBAR S, TORRES R, VALENZUELA CY (2012) Evolución molecular: Periodicidades en Archaea, Bacterias y Cromosoma 21 humano. XX Congreso de las Unidades de Investigación, 2° Año de Medicina, Facultad de Medicina, Universidad de Chile, Santiago, Chile, 6 de Diciembre de 2012.
SPIEGEL MR (1980) Probability and Statistics. Schaum's Outline Series. McGraw-Hill Book Company, Singapore, pp 114, 133.
SUEOKA N (1995) Intrastrand parity rules of DNA base composition and usage biases of synonymous codons. J Mol Evol 40: 318-325.
VALENZUELA CY (1979) Un modèle pour la détermination du nombre attendu de remaniements et de points de cassure chromosomiques. C R Acad Sc (Paris) t. 288: 709-712.
VALENZUELA CY (1985a) Análisis y ensayos conducentes a la definición de una Citogenética Poblacional. In: Fernández-Donoso R (ed) El Núcleo los Cromosomas y la Evolución. UNESCO, Santiago Chile, pp 123-134.
VALENZUELA CY (1985b) Algebraic and epistemological restrictions in studies on Hardy-Weinberg equilibrium. Am Nat 125: 744-746.
VALENZUELA CY (1994) Epistemic restrictions in population biology. Biol Res 27: 85-90.
VALENZUELA CY (1997) Non random DNA evolution. Biol Res 30: 117-123.
VALENZUELA CY (2000) Misconceptions and false expectations in neutral evolution. Biol Res 33: 187-195.
VALENZUELA CY (2002a) A biotic Big Bang. In: Palyi G, Zucchi C and Caglioti L (eds) Fundamentals of Life. Elsevier, Paris, pp 197-202.
VALENZUELA CY (2002b) Does biotic life exist? In: Palyi G, Zucchi C and Caglioti L (eds) Fundamentals of Life. Elsevier, Paris, pp 332-334.
VALENZUELA CY (2007) Within selection. Rev Chil Hist Nat 80: 109-116.
VALENZUELA CY (2009) Non-random pre-transcriptional evolution in HIV-1. A refutation of the foundational conditions for neutral evolution. Genet Mol Biol 32: 159-169.
VALENZUELA CY (2010) Internucleotide correlation and nucleotide periodicity in Drosophila mtDNA: New evidence for panselective evolution. Biol Res 43: 497-502.
VALENZUELA CY (2011) Heterogeneous periodicity of drosophila mtDNA: new refutations of neutral and nearly neutral evolution. Biol Res 44: 283-293.
VALENZUELA CY (2012) Periodicidades e interacciones del DNA. El fin del neutralismo y del casi neutralismo. In: Veloso A and Spotorno A (eds) Darwin y la evolución. Editorial Universitaria, Santiago, Chile, pp 189-295.
VALENZUELA CY, HARB Z (1977) Socioeconomic assortative mating in Santiago, Chile: a demonstration using stochastic matrices of mother-child relationships applied to ABO blood groups. Soc Biol 24: 225-233.
VALENZUELA CY, LÓPEZ-FENNER J (1986) Parámetros Cromosómicos y estadística de Bose-Einstein. Arch Biol Med Exper 19: 135.
VALENZUELA CY, SANTOS JL (1996) A model of complete random molecular evolution by recurrent mutation. Biol Res 29: 203-212.
VALENZUELA CY, FLORES SV, CISTERNAS J (2010) Fixations of the HIV-1 env gene refute neutralism: new evidence for pan-selective evolution. Biol Res 43: 149-163.
VASCO C, MANTEROLA M, PAGE J, ZUCCOTTI M, DE LA FUENTE R, REDI CA, FERNANDEZ-DONOSO R, GARAGNA S (2012) The frequency of heterologous synapsis increases with aging in Robertsonian heterozygous male mice. Chrom Res DOI 10.1007/s10577-011-9272-x.
WRIGHT S (1931) Evolution in Mendelian populations. Genetics 16: 97-159.
WRIGHT S (1932) The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proc 6th Int Congr Genet 1: 356-366.
WRIGHT S (1968) Evolution and the genetic of populations. Volumes 1, 2, 3 and 4. The University of Chicago Press, Chicago.
WRIGHT S (1988) Surfaces of selective value revisited. Am Nat 131: 115-123.
Formula for the resilient equilibrium between selection and mutation. Consider a population of bacteria where the frequency of A is p, the frequency of T+G+C is q = 1-p, the mutation rate from A to T+G+C is u, and the mutation rate from T, G or C to A is v = u/3. The selection coefficient is s, and we examine the negative and the positive cases, both in favor of A. In generation G_{0} we have f(A) = p_{0} and f(T+G+C) = 1-p_{0} = q_{0}.
Case of selection against T+G+C
In G_{1}, before normalization, p_{1} = p_{0}(1-u) + vq_{0}; q_{1} = [q_{0}(1-v) + up_{0}](1-s);
expanding, p_{1} = p_{0} - up_{0} + vq_{0}; q_{1} = q_{0} - vq_{0} - sq_{0} + svq_{0} + up_{0} - sup_{0}.
The new total is p_{1} + q_{1} = p_{0} - up_{0} + vq_{0} + q_{0} - vq_{0} - sq_{0} + svq_{0} + up_{0} - sup_{0}. Since p_{0} + q_{0} = 1, replacing q_{0} by 1-p_{0} gives p_{1} + q_{1} = 1 - s[1 - p_{0}(1-u-v) - v]. The normalized frequency is then p_{1} = [p_{0}(1-u-v) + v]/{1 - s[1 - p_{0}(1-u-v) - v]}. The equilibrium frequency p_{e} is found when p_{1} = p_{0}. Solving this quadratic equation we have
(1) p_{e} = {(s-u-v-sv) + \sqrt{4sv(1-u-v) + (s-u-v-sv)^{2}}} / [2s(1-u-v)]
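The intermediate step behind equation (1) can be written out as follows. This is a sketch of the algebra, assuming the normalized recursion stated above; the shorthand k = 1-u-v is introduced here only for compactness and does not appear in the original derivation. Because the product of the two roots, -v/(sk), is negative, only the root with the positive square root is a valid frequency.

```latex
\begin{aligned}
p_e\,\bigl\{1 - s\,[\,1 - v - k\,p_e\,]\bigr\} &= k\,p_e + v, \qquad k = 1-u-v \\
s\,k\,p_e^{2} - (s - u - v - sv)\,p_e - v &= 0 \\
p_e &= \frac{(s-u-v-sv) + \sqrt{(s-u-v-sv)^{2} + 4sv(1-u-v)}}{2s(1-u-v)}
\end{aligned}
```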
For positive selection in favor of A we have, before normalization,
p_{1} = (p_{0} - up_{0} + vq_{0})(1+s); q_{1} = q_{0} - vq_{0} + up_{0}; solving as previously
(2) p_{e} = {(s-2sv-u-v-su) + \sqrt{4sv(1+s)(1-u-v) + (u+v+su+2sv-s)^{2}}} / [2s(1-u-v)]
Equations (1) and (2) give very similar results, but in (1), where selection acts negatively (-s) against T+G+C, the value of s cannot exceed 1.
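Both closed forms can be checked numerically by iterating the one-generation recursion until it settles. The following is a minimal sketch, not the author's own computation. It assumes that selection against T+G+C multiplies that class by (1-s) after mutation, and that positive selection multiplies A by (1+s) after mutation, which is the reading under which the recursion reproduces equation (2). The function names, the value s = 0.01 and the starting frequency 0.25 are illustrative choices.

```python
import math


def equilibrium_against_others(s, u, v):
    """Closed form (1): selection with coefficient s against T+G+C."""
    k = 1.0 - u - v
    b = s - u - v - s * v
    return (b + math.sqrt(4.0 * s * v * k + b * b)) / (2.0 * s * k)


def equilibrium_favoring_a(s, u, v):
    """Closed form (2): positive selection with coefficient s in favor of A."""
    k = 1.0 - u - v
    b = s - 2.0 * s * v - u - v - s * u
    return (b + math.sqrt(4.0 * s * v * (1.0 + s) * k + b * b)) / (2.0 * s * k)


def step_against_others(p, s, u, v):
    """One generation: mutation, then T+G+C weighted by (1-s), then normalization."""
    q = 1.0 - p
    new_p = p * (1.0 - u) + v * q
    new_q = (q * (1.0 - v) + u * p) * (1.0 - s)
    return new_p / (new_p + new_q)


def step_favoring_a(p, s, u, v):
    """One generation: mutation, then A weighted by (1+s) (assumed), then normalization."""
    q = 1.0 - p
    new_p = (p * (1.0 - u) + v * q) * (1.0 + s)
    new_q = q * (1.0 - v) + u * p
    return new_p / (new_p + new_q)


def iterate(step, p0, s, u, v, generations):
    """Apply the recursion repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = step(p, s, u, v)
    return p


if __name__ == "__main__":
    u, v, s = 3e-6, 1e-6, 0.01  # u and v as in the appendix table; s is illustrative
    print(equilibrium_against_others(s, u, v),
          iterate(step_against_others, 0.25, s, u, v, 5000))
    print(equilibrium_favoring_a(s, u, v),
          iterate(step_favoring_a, 0.25, s, u, v, 5000))
```

In this deterministic recursion the iterated frequency approaches the closed-form value from the chosen starting point, which illustrates what the text means by a resilient equilibrium.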
TABLE OF APPENDIX 1. Resilient mutation-selection equilibria of base frequencies according to the synthetic theory of evolution (positive selection of A)
* Approximation to multiple of 3. μ = 0.000003; υ = 0.000001. The tendency to these resilient equilibria cannot be changed by drift.
Stochastic Variation Matrix for the Model
Corresponding author: Carlos Y Valenzuela. Programa de Genética Humana, Instituto de Ciencias Biomédicas (ICBM), Facultad de Medicina, Universidad de Chile, Independencia 1027, Casilla 70061, Independencia, Santiago, Chile. Fax (56-2) 7373158; Phone (56-2) 9786302. E-mail: cvalenzu@med.uchile.cl
Received: April 18, 2012. In revised form: September 16, 2008. Accepted: March 11, 2013.