## Biological Research

*Print version* ISSN 0716-9760

### Biol. Res. vol.33 n.3-4 Santiago 2000

#### http://dx.doi.org/10.4067/S0716-97602000000300004

## Misconceptions and false expectations in neutral evolution

CARLOS Y. VALENZUELA

Programa de Genética Humana, Instituto de Ciencias Biomédicas (ICBM), Facultad de Medicina,

Universidad de Chile

**ABSTRACT**

Neutral evolution results from random recurrent mutation and genetic drift. A small part of random evolution, that which is related to protein or DNA polymorphisms, is the subject of the Neutral Theory of Evolution. One of the foundations of this theory is the demonstration that the mutation rate (m) is equal to the substitution rate. Since both rates are independent of population size, they are independent of drift, which is dependent upon population size. Neutralists have erroneously equated the substitution rate with the fixation rate, despite the fact that they are antithetical conceptions. The neutralists then applied the random walk stochastic model to justify alleles or bases that were fixated or eliminated. In this model, once the allele or base frequencies reach the monomorphic states (values of 1.0 or 0.0), the absorbing barriers, they can no longer return to the polymorphic state. This operates in a pure mathematical model. If recurrent mutation occurs (as in real biotic systems), fixation and elimination are impossible. A population of bacteria in which m=10^{-8} base mutations (or substitutions)/site/generation and the reproduction rate is 1000 cell cycles/year should replace all its genome bases in approximately 100,000 years. The expected situation for all sites is polymorphism for the four bases rather than monomorphism at 1.0 or 0.0 frequencies. If fixation and elimination of a base for more than 500,000 years are impossible, then most of the neutral theory is untenable. A new complete neutral model, which allows for recurrent substitutions, is proposed here based on recurrent mutation or substitution and drift alone. The model fits a binomial or Poisson distribution, not the geometric distribution implied by the neutral theory.

**Key words**: neutral evolution, neutral theory, misconceptions, base fixation and elimination

INTRODUCTION

A theory of molecular evolution should at least explain the origin, history, organization, maintenance, distribution, and variation of genomes and nucleotides. Evolution is a process that includes subprocesses. Neutral or random evolution is a subprocess or component of evolution produced by random mutation and drift, which is different from the theory of neutral evolution. The neutral theory (hypothesis or model) of evolution was proposed (Kimura, 1968; 1983; 1991; King and Jukes, 1969) primarily to explain the maintenance of protein polymorphisms (a very restricted subprocess of evolution) and has since been discussed and debated in a number of textbooks on evolution (Nei, 1987; Strickberger, 1996; Li, 1997; Futuyma, 1998). A complete model for neutral evolution as a process should deal not only with polymorphic loci or sites, but also with the evolution of all the nucleotide sites of genomes, whether or not they are related to proteins.

In this article, we will consider only nucleotide sites. One of the foundations of the neutral theory model is the finding that the rate of base substitution in a nucleotide site is equal to the rate of mutation (m) in that site. Thus, polymorphisms within populations could be explained as a transient state of bases going to fixation (Crow and Kimura, 1970; Nei, 1987; Li, 1997). Differences in taxa were explained by the fixation of different bases among them (Kimura 1968; King and Jukes, 1969; Li, 1997; Nei, 1987). We will analyze this finding and show some important misconceptions and erroneous expectations that are widely accepted today.

Before presenting the analysis, we must first define randomness. It cannot be done by examining present proteins, DNA segments, amino acid or nucleotide compositions, because this method includes the history of selection or neutrality of these proteins, DNA, amino acid or bases (epistemic circularity). Means, variances and other parameters as well as the evolutionary features should be logically and biologically deduced from the previously-defined properties of the model (assumptions, axioms), without regard to data. Otherwise, we are prone to constructing an "ad hoc" model to fit those data, with the obvious result that data will fit the model very well. Unfortunately, this has been the most frequent method of dealing with the subject (Kimura, 1968; King and Jukes, 1969). We define randomness stochastically by assuming isotropy in the occurrence of events, in other words, a uniform probability distribution for alternatives of the same class, for example, the rate of adenine (A) mutation is equal for all A in that genome (or genome segment).

FEATURES AND PROPERTIES OF A NEUTRAL EVOLUTIONARY SYSTEM

Let N=10^{10} individuals be a population of bacteria whose genome has 10^{6} nucleotide sites, growing in a steady state with a reproduction rate of 1000 duplications per year. The mutation rate is m=10^{-8} mutations per site per cell duplication (DNA replication or generation). Let us now examine a generation with its cohort of mutations. In any population duplication, 2×10^{10} cells appear, among which there is an average of 2mN=200 mutant bases in each of the 10^{6} nucleotide sites. Half of these cells suffer random death; thus the number of bacteria is reduced to N and the mutants to an average of 100 (mN) per site in all the genome sites. This simple fact has not yet been taken into account. Nucleotide sites must co-evolve (in this case randomly, or with zero biotic covariance). Each of these 100 mutants per site begins a path toward either substituting (not fixating) the original base or disappearing (not being eliminated) from the population. There are mN mutants/site/generation, and the probability for one neutral mutant to reach the monomorphic state at frequency 1 (M_{M}1 or a substitution) is (1/N) substitution/mutant, so the substitution (not fixation) rate of the original base by one of these mutants is mN × 1/N = m substitutions/site/generation, which is the well-known finding mentioned above (Kimura, 1957; 1968; King and Jukes, 1969; Crow and Kimura, 1970). The dimension of the product is often forgotten, causing an a-dimensional misconception. As the mutant base went from the monomorphic state with frequency of 0 (M_{M}0) to M_{M}1, the original base simultaneously went from the monomorphic state M_{M}1 to M_{M}0 via a transient polymorphic state (P_{M}). Several implications can be deduced from this analysis:

1) Due to isotropy, m is the rate of substitution of every original base of the species' genome (or for all DNAs whose mutation rate per site is m). In our population, 100 mutants are produced in any generation at all of the one million genome sites. Thus, each of the genome bases is equally substituted at the same rate (or at its own m). As they change randomly, the bases of a DNA or RNA neighborhood or a complete genome should be randomly distributed longitudinally. Testing longitudinal random distribution of bases has refuted neutral evolution (Valenzuela, 1997; Bernardi et al., 1997; Karlin et al., 1997; Karlin and Mrazek, 1997; Carels et al., 1998).

2) The rate of substitution is independent of the population size (N); this is a direct and well-known result.

3) Directly and as a corollary, the rate and probability of substitutions are independent of genetic drift. If drift depends on N, and the substitution rate does not depend on N, then the substitution rate does not depend on drift. That is the main feature of evolution; the base turnover does not depend on drift at all, a result that has been avoided in most, if not all, studies. Drift cannot drive evolution; mutation or selection can. Drift may go up or down, left or right, or even back and forth, but its evolutionary movement is zero. This is a widely-held biological notion. We teach that "drift is a non-directional evolutionary factor," yet it seems we do not believe it. We tend to give more priority to mathematical models, often at the expense of biology.

4) The rate of substitution does not depend on the taxon where nucleotide sites are located. This is true for taxa with a similar rate of mutation and reproduction. In this case, the evolutionary fate of a genetic site can be considered from its origin through the present, regardless of the taxa it has belonged to. If reproduction or mutation rates differ among taxa, then estimating an average rate of substitutions from these taxa and then testing whether or not they are distributed randomly is an epistemic error that invalidates the result. Moreover, we do not know if the different reproduction or mutation rates or allele fates were randomly or selectively acquired. History cannot be unambiguously deduced from current results. Parameter estimates from these data will include all the evolutionary factors, and their variance should be increased, thereby indicating that neutralist models should be favored when testing them (for a demonstration of this proposition see Valenzuela, 1994).

5) The rate of substitution is equal to the rate of mutation, although it is quite different from the rate of fixation. A great error was introduced into comparative studies of taxa in estimating the rate at which alleles or bases were and remain fixated and assuming that rate to be the substitution rate. An allele or base can reach M_{M}1 by selection or drift and be maintained at that frequency only by selection (not by drift, as we will see). Therefore the estimation of the fixation rate includes not only mixed mechanisms of selection and drift, but also the selective history of the maintenance from its arrival at M_{M}1 to the present as well.

6) Fixation is impossible. Using similar probabilities, numbers of generations, and the same process, when one base replaces another, this base will also be replaced. If an original base A is replaced by C, with rate m_{A-C}, C will be replaced by A with rate m_{C-A}. Thus, m (as a general rate) is the mean turnover (replacement) rate of any base in a site; it is a coefficient of transition, the antithesis of a coefficient of fixation. If a base has reached the proportion 1.0 in a population at the n^{th} generation, it will have the proportion (1-m) in the next [(n+1)^{th}] generation. Naturally, if the population is small, it should have no more than one allele or base in a site. This does not mean it is fixated, but that it has a transitory frequency of 1.0 (transient M_{M}1).

It has been argued that the probability for a new mutation and a subsequent substitution in the same site is so low that it can be ignored (Li et al., 1984). This argument is incorrect because the probabilities of a mutation and a subsequent substitution have already been incorporated into the formula. Others have claimed that there is not sufficient time for a second substitution in the site. The neutralists, however, argue that the average time required for one substitution (replacement) of a base (or any base, I propose) is (1/m) generations per substitution per site (conserving dimensions, this is a replacement cycle or RC), in our example, 10^{8} generations (the variance is 1/m^{2}, Nei 1987). Given that our bacteria have 10^{3} generations per year, all of the bases at all of the sites will be inexorably changed every 100,000 years, regardless of the population size. Several populations and taxa of bacteria have lived for more than 2×10^{9} years, and thus have had more than 10^{4} RC.

It is not quite correct, however, to take 1/m as the mean number of generations for a substitution to occur. I use it here to illustrate that even from the same neutralist paradigm, it is possible to find logical inconsistencies or refutations with observed distributions. A correction will be given with a new model. On the other hand, the mean number of generations for one neutral mutant base to reach the monomorphic state in diploids by random drift alone is nearly 4N generations (Kimura and Ohta, 1969a; Crow and Kimura, 1970). In our example, that is 4×10^{10} generations (40 million years) (I found a similar result by numerical simulation in bacteria). Thus, the time required for random "fixation" (which depends on N but not on m) is, in this case, 400 times the time for a substitution (which depends on m but not on N). This contradiction was not resolved in Crow and Kimura's 1970 textbook (repeated in chapter 11 of Futuyma 1998), where figures with a limit of 2N are presented. Nor was it resolved in chapter 13 of Nei (1987), which presents figures with a limit of 1.0 (see also Li, chapter 2 (1997)).

Others argue that the rate of mutation is lower. If m=10^{-10}, then an RC occurs in 10^{7} years (more than 100 RC in the history of bacteria). All the bases of all the bacterial genomes would then be inexorably changed approximately every ten million years. Moreover, the rate of mutation has evolved from 'hotspotness' toward 'coldspotness' (Valenzuela and Santos, 1996), and we should add other mutations occurring through gene conversion, sexual processes, chromosome rearrangements (change of neighborhood), mutagens, and so on.

A fourth group believes that most new mutations are eliminated. While this is true, the formula also includes the probability of elimination as the complement of the fraction of mutations going to "fixation." Either a mutant base replaces the original, or it is "eliminated" (Kimura and Ohta, 1969a; 1969b).

7) Elimination is impossible. If the frequency of A (f_{A}) in the site is 0 in the n^{th} generation, it should be m_{T-A}+m_{G-A}+m_{C-A} in the (n+1)^{th} generation. Naturally, in small populations a transient f_{A} equal to 0.0 (M_{M}0) is probably observed.

If neither fixation nor elimination is possible for more than 500,000 years for all the sites or loci, and drift cannot drive evolution, then little if any of the neutral theory of evolution can be supported.

8) When examining several populations, the expected situation is a polymorphism for all sites. Imagine a site with A as the original base. In the first (or any) generation we find 100 mutant bases. Considering that transversions are less frequent than transitions, we could have 20 T, 20 C and 60 G among these 100 mutants. Let us further imagine that in our population, one T reaches 100% in 100,000 years. Other similar populations should have similar substitutions, but T, C and G should be randomly distributed among them, so that the most probable result is to find 20 T, 20 C and 60 G in 100 similar populations that have evolved for more than 100,000 years. It should therefore be easy to assess all of the nucleotide sites in 100 populations of bacteria isolated for more than 100,000 years to determine the original nucleotide that was present at any site 100,000 years ago. This expected situation is so far from reality that a new study is not necessary to refute the model that supports it.

Neutralists argue that they accept that some sites, particularly those found to be non-randomly distributed, cannot be polymorphic because they have functional restrictions or structural constraints on base variability. This argument destroys their theory in favor of a 'functional or constraint theory of evolution with an undetermined fraction left for neutralist evolution.' When all the bases are found to be functional or constrained (or non-randomly distributed), there will be no room left for neutralism. Moreover, this argument assumes that functions or constraints cannot be built by random polymorphic base distributions, and this assumption is unfounded. The universe of possibilities of base combinations is so great that this is not a problem. The problem is the time and history required to do so.

We have seen that a significant error arose from the implicit and unfounded assumption that a mutation is fixated once and forever (permanently) with a low probability. A second error was the result of equating the rate of substitution (replacement) with the rate of fixation. This error was established in early articles (Kimura, 1968; 1970; King and Jukes, 1969; Crow and Kimura, 1970) and has continued into the present (Nei, 1987; Kimura, 1991; Li, 1997; Futuyma, 1998). A third misconception occurred because the fate of only one mutant base or allele (for which the analyses are correct) was assumed to be the entire evolution of the site. A fourth came about from failing to consider the co-evolution of nucleotide sites. These errors were demonstrated with only one cohort of mutations, within the same neutralist framework; they are even more evident when analyzing a series of mutation cohorts.
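The arithmetic of the bacterial example used throughout this section can be retraced with a short sketch. It only restates the figures given in the text (N, m, 1000 generations per year, the 1/N probability per neutral mutant, and the 4N drift time quoted from Kimura and Ohta); the variable names are illustrative and not part of the original.

```python
# Parameters of the worked example in the text.
N = 10**10                   # population size (bacteria)
m = 1e-8                     # mutations per site per generation
generations_per_year = 1000  # duplications per year

# One cohort of mutants at a single site.
mutants_after_duplication = 2 * m * N   # among the 2N cells just produced
mutants_after_death = m * N             # after random death halves the population
p_substitution_per_mutant = 1 / N       # chance that one neutral mutant reaches frequency 1
substitution_rate = mutants_after_death * p_substitution_per_mutant  # equals m

# Time scales compared in the text.
rc_generations = 1 / m                  # one replacement cycle (RC)
rc_years = rc_generations / generations_per_year
drift_generations = 4 * N               # drift time to monomorphism quoted in the text
drift_years = drift_generations / generations_per_year
ratio = drift_generations / rc_generations

print(f"mutants per site after duplication: {mutants_after_duplication:.0f}")
print(f"mutants per site after random death: {mutants_after_death:.0f}")
print(f"substitution rate per site per generation: {substitution_rate:.1e}")
print(f"one RC: {rc_generations:.0e} generations = {rc_years:,.0f} years")
print(f"drift time: {drift_generations:.0e} generations = {drift_years:,.0f} years")
print(f"drift time / RC: {ratio:.0f}")
```

The population size cancels in `substitution_rate` but not in `drift_generations`, which is the contrast the section relies on.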

A NEW MODEL BASED ON RECURRENT MUTATION

The following analysis is a further development of the Valenzuela and Santos (1996) model. The neutralist finding that m was the substitution rate was applied to one cohort of mutations (100 from the first generation, in our example). The neutralist studies stopped there, assuming that these mutants were permanently fixated. However, in the second generation, an average of 100 new mutations appear at any site, as is true for each successive generation. Every cohort of 100 mutants begins the process toward "fixation" or "elimination." Moreover, any new mutant can, in turn, mutate.

A new model is therefore necessary to describe random evolution with recurrent mutation, allowing for the inclusion of successive cohorts of mutants and for more than one substitution per site. In this new and complete neutral model, it is clear and definitive that fixation and elimination are impossible. As a base increases its frequency in the population, it also increases its fraction of mutants. If a base could reach the frequency 1.0, it would produce 100 mutants in the next generation; only N(1-m) copies of this base would remain. At the n^{th} generation from the "fixation," a fraction (1-m)^{n} of the original copies should remain in the population, and the limit of (1-m)^{n} is 0 as n increases. For the first generation we can denote the state of the population as N[(1-m) + (m)]^{1}, where (1-m) and (m) are the non-mutated and the mutated fractions respectively; the exponent denotes the first generation (n=1). N can be any number (including 1), so we can study the condition of the site by analyzing the notation within brackets alone. This is equivalent to the evolution of one bacterium. It is implicit in this model that substitution, mutation and replacement are equivalent. This shows even more clearly that both fixation and elimination are impossible. The probability distribution is:

P = [(1-m)+(m)]^{n} = (1-m)^{n}(m)^{0} + n(1-m)^{n-1}(m)^{1} + ... + n(1-m)(m)^{n-1} + (m)^{n}

according to binomial expansion, with mean = nm (number of expected substitutions or mutations) and variance = nm(1-m). The exponents of (1-m) and (m) indicate the number of times the site did not mutate and mutated respectively among the n generations. The binomial coefficients indicate the combinatorial for any site having (k) substitutions and (n-k) non-substitutions. The neutralist model holds that only one substitution is possible, assuming that the mutant base is then permanently fixated. Thus, its distribution is not a binomial but rather a geometric distribution where m is the probability for the first and unique mutation that closes the series due to its fixation, and (1-m) is the probability of no mutation in the series. The probability distribution is therefore given by:

P = (1-m)^{0}m + (1-m)^{1}m + (1-m)^{2}m + (1-m)^{3}m + ... + (1-m)^{n}m

with mean 1/m and variance (1-m)/m^{2} (approximated to 1/m^{2} in Nei, 1987). Both models are similar when n is small because the probability of more than one substitution is very small. As n increases they differ substantively. The neutralist model needed the fixation or elimination to be maintained forever to explain the differences found in protein or DNA comparative studies among taxa. The simple rate of substitution taken (erroneously) to be the rate of fixation was not sufficient because it was simply the rate of mutation. The stochastic model known as 'random walk' (Cox and Miller, 1965; Feller, 1968) completed the picture. The extreme states of that model, which in our example refer to monomorphisms with frequencies of 1 and 0 (M_{M}1 and M_{M}0), were assumed to be the absorbing barriers from which the process could not return to the intermediate states, in our case, the polymorphic states (P_{M}). Crow and Kimura (1970) wrote that "equation 8.3.1 is valid only for gene frequencies in the interval 0<x<1 (unfixed classes). Therefore, separate treatments are required to obtain probabilities for x=0 and x=1 (terminal classes)... In terminology of the mathematical theory of probability, the boundaries (x=0 and x=1) act as absorbing barriers." (1970: 379). Moreover, Kimura (1970) emphasized the impossibility of mutations at boundary conditions (M_{M}1 and M_{M}0) to contribute to heterozygous sites: "Furthermore, since 'mutations' at p=0 and p=1 do not contribute to the heterozygous sites..." (1970: 196).

These are correct analyses for a mathematical stochastic model (and only one mutant), but they fall short of biotic reality. Recurrent mutation changes any gene frequency, regardless of its value; it replaces frequencies of 1 and reestablishes lost bases or alleles (Valenzuela and Santos, 1996; Valenzuela, 1997). The neutralist model is founded on base fixation and elimination of one mutant and the assumption that M_{M}1 and M_{M}0 are absorbing barriers of a random walk model, thus giving a great value to the mathematical model. The complete neutral model is founded on base replacement and recurrent mutation occurring over the entire history of a site; it uses mathematics as a simple descriptive tool, giving more value to biotic conditions. Both positions come from different epistemic frameworks (Valenzuela, 1994).
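The contrast between the binomial and the geometric descriptions can be made concrete with a minimal sketch. It assumes the example rate m=10^{-8} and evaluates both models at 1, 2 and 6 multiples of 1/m generations; the function names `binomial_pmf` and `geometric_cdf` are illustrative and not from the original.

```python
import math

def binomial_pmf(k, n, m):
    """P(exactly k substitutions in n generations) under the binomial model."""
    # log of the binomial coefficient C(n, k), built term by term for small k
    log_comb = sum(math.log(n - i) - math.log(i + 1) for i in range(k))
    return math.exp(log_comb + k * math.log(m) + (n - k) * math.log1p(-m))

def geometric_cdf(n, m):
    """P(the first mutation has occurred within n generations).

    In the geometric description this first mutation is also the last one,
    because the site is then treated as permanently fixated.
    """
    return 1.0 - math.exp(n * math.log1p(-m))

m = 1e-8
for cycles in (1, 2, 6):
    n = cycles * 10**8          # 10**8 generations = 1/m
    p0 = binomial_pmf(0, n, m)
    p1 = binomial_pmf(1, n, m)
    print(f"n = {cycles}/m: P(0) = {p0:.5f}, P(1) = {p1:.5f}, "
          f"P(2 or more) = {1.0 - p0 - p1:.5f}, "
          f"geometric P(substituted) = {geometric_cdf(n, m):.5f}")
```

Both descriptions agree on the probability that the site has changed at least once; they differ in that the binomial model spreads this probability over one, two or more substitutions, while the geometric description assigns all of it to a single, final substitution.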

Turning to the complete neutral model, since m is very small and n is large, the binomial distribution can be treated as a Poisson distribution. The generic term for this distribution is e^{-r}r^{k}/k!, where r = nm (the mean or the expected number of substitutions, mutations or replacements in the site) and k is the actual number of substitutions. The variance is also r. Let us examine the probability of the occurrence of 0, 1, 2, ... k substitutions in the site when an integer multiple of 1/m generations (100,000 years, one RC) occurs, that is, when an integer number of substitutions (r=1, 2, 3, 4...) is expected. This is presented in Table I.

Table I. Poisson proportions of k substitutions for r expected base substitutions in the nucleotide sites of theoretical bacteria.

| k | r = 1 | r = 2 | r = 3 | r = 4 | r = 5 | r = 6 |
|---|---|---|---|---|---|---|
| 0 | 0.36788 | 0.13534 | 0.04979 | 0.01832 | 0.00674 | 0.00248 |
| 1 | 0.36788 | 0.27067 | 0.14936 | 0.07326 | 0.03369 | 0.01487 |
| 2 | 0.18394 | 0.27067 | 0.22404 | 0.14653 | 0.08422 | 0.04462 |
| 3 | 0.06131 | 0.18045 | 0.22404 | 0.19537 | 0.14037 | 0.08924 |
| 4 | 0.01533 | 0.09022 | 0.16803 | 0.19537 | 0.17547 | 0.13385 |
| 5 | 0.00307 | 0.03609 | 0.10082 | 0.15629 | 0.17547 | 0.16062 |
| 6 | 0.00051 | 0.01203 | 0.05041 | 0.10420 | 0.14622 | 0.16062 |
| 7 | 0.00007 | 0.00344 | 0.02160 | 0.05954 | 0.10445 | 0.13768 |
| 8 | 0.00001 | 0.00086 | 0.00810 | 0.02977 | 0.06528 | 0.10326 |
| 9 | 0.00000 | 0.00019 | 0.00270 | 0.01323 | 0.03627 | 0.06884 |

r = nm; n = number of generations; m = 10^{-8} substitutions or mutations/site/generation; n was chosen as an integer multiple of 1/m. The rate of reproduction is 10^{3} generations/year, thus 1/m is equivalent to 10^{5} years, and r is equivalent to r×10^{5} years.
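The entries of Table I follow directly from the Poisson term e^{-r}r^{k}/k!. A minimal sketch that recomputes them (the function name `poisson_pmf` and the print layout are illustrative choices, not part of the original):

```python
import math

def poisson_pmf(k, r):
    """Probability of exactly k substitutions when r are expected."""
    return math.exp(-r) * r**k / math.factorial(k)

expected = range(1, 7)   # r = 1 .. 6 replacement cycles
print("k  " + "  ".join(f"r={r:<5d}" for r in expected))
for k in range(10):      # k = 0 .. 9 observed substitutions
    print(f"{k:<3d}" + "  ".join(f"{poisson_pmf(k, r):.5f}" for r in expected))
```

Each column sums to slightly less than 1 because the table stops at k=9.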

When n=1/m and r=1, the probabilities for 0, 1, 2, and 3 substitutions are 0.368, 0.368, 0.184 and 0.061 respectively. Then 1/m generations is the number of generations for which the expected number of substitutions is 1.0 and for which one substitution is the most probable result, and not the mean number of generations for a substitution, as it is for the geometric distribution. In the geometric model a substitution at the (1/m)^{th} generation is not the most probable result, but it is at the first generation (remember we are dealing with an abstract site or with only one bacterium). They coincide numerically but differ conceptually. This is the only number of generations for which the standard deviation is equal for both the Poisson and the geometric distribution. It is worth noting that in 5,000 years, or 5×10^{6} generations (r=0.05), the probability for one or more substitutions is 0.048. The probability that the site remains without mutations at the generation number 2×10^{8}, 3×10^{8}, etc., is 0.1353, 0.0498, etc., respectively. In the entire genome these are the proportions of non-substituted sites. Since we are dealing with only one bacterium (maximal possible drift), it is both clear and definitive that drift cannot change the result (a *reductio ad absurdum* argument).

Figure 1. Evolution of a base frequency in a population of bacteria with N individuals. The base mutation rate is m=10^{-8} mutations/site/generation and the reproductive rate is 10^{3} generations/year. In the panels 1<N<1/m and N≥1/m, N increases from left to right.

What is the proportion of bases at equilibrium? If the original base is A, it can mutate to G, C or T in the first generation. In the second generation four bases are possible. It has been shown that at the 6^{th} RC (600,000 years) values close to equilibrium are attained (Valenzuela and Santos, 1996). Equilibrium frequencies have been found by mathematical induction or by matrix methods, finding the invariant eigenvector of base frequencies. Methods with one, two, or more parameters have been reported (Li 1997; Valenzuela 1997). In short, in most models the equilibrium is near (1/4)A, (1/4)T, (1/4)G, and (1/4)C. Thus, after the 6^{th} RC, the equilibrium should be found for every site in all genomes. Polymorphism for the four bases, not fixation or elimination, is therefore the expected situation for neutral evolution. Naturally, in our example with only one bacterium there is one base in each site. Any state in a site behaves as a transient monomorphism. These expectations contradict those of the neutral theory.
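The approach to equilibrium can be illustrated with a matrix-power sketch. It assumes the simplest one-parameter case, in which a base mutates to each of the other three with probability m/3 per generation; this equal-rate matrix is an assumption made here for illustration, not the specific model of Valenzuela and Santos (1996), and the helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_pow(p, n):
    """n-th power of a 4x4 matrix by repeated squaring."""
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    while n > 0:
        if n & 1:
            result = mat_mul(result, p)
        p = mat_mul(p, p)
        n >>= 1
    return result

m = 1e-8
bases = "ATGC"
# Assumption: equal-rate mutation, m/3 from any base to each other base.
P = [[1.0 - m if i == j else m / 3.0 for j in range(4)] for i in range(4)]

for cycles in (1, 3, 6):
    n = cycles * 10**8                 # generations in `cycles` RC
    from_a = mat_pow(P, n)[0]          # distribution for a site that started as A
    summary = ", ".join(f"{b}={f:.4f}" for b, f in zip(bases, from_a))
    print(f"after {cycles} RC: {summary}")
```

Under this assumption the probability that a lineage carries the original base falls toward 1/4 as the number of replacement cycles grows, and is within about 0.001 of 1/4 by the sixth cycle.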

If we have two bacteria, a transient P_{M} appears. The evolution of these two bacteria should be a series of short-period transient P_{M} states and long-period transient M_{M}1 and M_{M}0 states. As N increases, the periods of transient P_{M} lengthen and the periods of M_{M}1 and M_{M}0 shorten. The duration of monomorphic states is inversely proportional to N. If N=1/m (10^{8}) bacteria, this expected duration is one generation. In our population of 10^{10} cells, the expected number of generations for monomorphic states is 0.01. This polymorphism is clearly resilient around its equilibrium frequencies. As a base frequency increases over its equilibrium, the number of mutations from this base to the other bases also increases, while the number of mutations from other bases to this one decreases. The reverse is true when a base has a lower frequency than it does at equilibrium. In other words, this equilibrium system generates its own corrective forces to reduce deviations from equilibrium. Thus, drift cannot eliminate P_{M}; it can only change its variance. Non-directionality of drift is evident. Large populations should be polymorphic for the four bases at all nucleotide sites. Small populations should fluctuate between transient M_{M}1, P_{M} and M_{M}0 at all sites. Figure 1 presents the neutral evolution of a base in a site for different ranges of N. To simplify, m is the generic rate of mutation for the base. The figures then represent the historical presence of a base throughout evolution. This presence is depicted by the area beneath the curve of its frequency. It should be close to 1/4 for each base. M_{M}1 lasts 1/3 of M_{M}0, because the figures are blind to the specific evolution of the other 3 bases. Thus, figure N=1 presents the alternation of M_{M}1, whose average duration is 1/m, and M_{M}0, whose duration is 3/m generations. In figure 1<N<1/m, a very wide range of population sizes is presented. N increases from left to right where monomorphisms are not possible. Figure N>1/m shows that the variance of fluctuations around the expected equilibrium frequency (0.25) decreases as N increases. The variance of fluctuations is arbitrary. A precise study is needed to determine the exact variance of fluctuations according to N and mutation rates.
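Two quantities in this paragraph lend themselves to a quick numerical check: the expected duration of a monomorphic state and the restoring tendency around the equilibrium frequency. The sketch below is a reading of the text under stated assumptions: the duration of M_{M}1 is taken as the waiting time 1/(Nm) for the next mutant, which reproduces the figures quoted above, and the expected change in frequency uses an equal-rate (m/3 to each other base) assumption. Both function names are hypothetical.

```python
m = 1e-8  # mutations per site per generation

def mm1_duration(N, m):
    """Expected generations a site stays at frequency 1 (assumed to be 1/(N*m))."""
    return 1.0 / (N * m)

def expected_change(f, m):
    """Expected per-generation change of a base frequency f under mutation alone.

    Assumption: the base is lost at rate m and regained from each of the
    other three bases at rate m/3, so the change vanishes at f = 0.25.
    """
    return -m * f + (m / 3.0) * (1.0 - f)

for N in (1, 10**4, 10**8, 10**10):
    print(f"N = {N:.0e}: expected M_M1 duration = {mm1_duration(N, m):.2e} generations")

for f in (0.10, 0.25, 0.50, 1.00):
    print(f"f = {f:.2f}: expected change per generation = {expected_change(f, m):+.2e}")
```

The sign of `expected_change` is positive below 0.25 and negative above it, which is the corrective tendency described in the paragraph; drift is not represented here at all.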

DISCUSSION

The most important feature of actual genomes is not their polymorphic state, but rather their monomorphic state. DNA segments, as well as bacteria, have been maintained mostly invariant for more than one hundred million or even one billion years (Woese, 1987). A theory of evolution should also explain the cause of the maintenance of the invariant zones without assuming, but rather proving, their specific functions or constraints. Because this article has dealt with neutral evolution, only random mutation and drift have been considered here. However, it is evident from the figures, and because invariant monomorphism is the rule, that 'positive' selection has had a major role in producing and maintaining those monomorphisms. In a few hundred generations a site can go from the 0.25 P_{M} equilibrium frequency to M_{M}1. If a base has a fitness of 1.01 (a 'nearly neutral and positively selected' base) and the other bases have a fitness of 0.99 ('nearly neutral and negatively selected' bases), 500 generations (6 months) are needed to increase the base frequency from 0.25 to 0.99986 [0.25×1.01^{500}/(0.25×1.01^{500}+0.75×0.99^{500})]. A study of the equilibrium points for mutation, drift and selection is beyond the scope of this article. A small ecological change can produce a catastrophic base change in a population by positive selection in a few generations and maintain the base nearly fixated (close to M_{M}1) until the present. To assume that genomes have functional or constrained and neutral zones does not help us know whether constraints, functions or neutral zones were acquired by selection or drift. Nor does this inform us of the proportion of different zones in genomes. It is evident from this analysis that with regard to bacteria, any time we find a fixated base or allele that has been maintained for more than a million years, it should be attributed to selection and not to drift.
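The selection figure quoted in the discussion can be checked with a few lines. The sketch applies the bracketed expression from the text directly (constant fitnesses, no mutation, no drift); the function name and argument names are illustrative.

```python
def frequency_after_selection(f0, w_favoured, w_others, generations):
    """Frequency of a favoured base after a number of generations of selection.

    f0 is the starting frequency; every other base shares the fitness w_others.
    """
    favoured = f0 * w_favoured**generations
    others = (1.0 - f0) * w_others**generations
    return favoured / (favoured + others)

generations_per_year = 1000
generations = 500
f = frequency_after_selection(0.25, 1.01, 0.99, generations)
print(f"after {generations} generations "
      f"({12 * generations / generations_per_year:.0f} months): {f:.5f}")
```

Varying `generations` shows how quickly a 1% fitness difference moves a base from its neutral equilibrium frequency toward near-monomorphism.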

Correspondence: Carlos Y. Valenzuela. Programa de Genética Humana, Instituto de Ciencias Biomédicas (ICBM), Facultad de Medicina, Universidad de Chile. Independencia 1027, Casilla 70061. Santiago, Chile. Phone: (56-2) 6786456. FAX: (56-2) 7373158; e-mail: cvalenzu@machi.med.uchile.cl

Received: July 28, 2000. Accepted: October 16, 2000

REFERENCES

BERNARDI G, HUGHES S, MOUCHIROUD D (1997) The major compositional transitions in the vertebrate genome. J Mol Evol 44, Suppl 1: 44-51

CARELS N, HATEY P, JABBARI K, BERNARDI G (1998) Compositional properties of homologous coding sequences from plants. J Mol Evol 46: 45-53

COX DR, MILLER HD (1965) The theory of stochastic processes. London: Methuen

CROW JF, KIMURA M (1970) An introduction to population genetics theory. New York: Harper & Row Publishers

FELLER W (1968) An introduction to probability theory and its applications. New York: John Wiley & Sons

FUTUYMA DJ (1998) Evolutionary Biology. 3rd ed. Sunderland, MA: Sinauer Associates, Inc.

KARLIN S, MRAZEK J, CAMPBELL AM (1997) Compositional biases of bacterial genomes and evolutionary implications. J Bacteriol 179: 3899-3913

KARLIN S, MRAZEK J (1997) Compositional differences within and between eukaryotic genomes. Proc Natl Acad Sci 94: 10227-10232

KIMURA M (1957) Some problems of stochastic processes in genetics. Ann Math Stat 28: 882-901

KIMURA M (1968) Evolutionary rate at the molecular level. Nature 217: 624-626

KIMURA M (1970) Stochastic processes in population genetics, with special reference to distribution of gene frequencies and probability of gene fixation. In: Mathematical Topics in Population Genetics, KEN-ICHI KOJIMA (ed), Biomathematics Vol 1. Berlin: Springer Verlag

KIMURA M (1983) The neutral theory of molecular evolution. Cambridge, UK: Cambridge University Press

KIMURA M (1991) Recent development of the neutral theory viewed from the Wrightian tradition of theoretical population genetics. Proc Natl Acad Sci 88: 5969-5973

KIMURA M, OHTA T (1969a) The average number of generations until fixation of a mutant gene in a finite population. Genetics 61: 763-771

KIMURA M, OHTA T (1969b) The average number of generations until extinction of an individual mutant gene in a finite population. Genetics 63: 701-709

KING JL, JUKES TH (1969) Non-Darwinian evolution. Science 164: 788-798

LI WH (1997) Molecular evolution. Sunderland, MA: Sinauer Associates, Inc.

LI WH, CHUNG CI, LUO CC (1984) Non-randomness of point mutation as reflected in nucleotide substitutions in pseudogenes and its evolutionary implications. J Mol Evol 21: 58-71

NEI M (1987) Molecular evolutionary genetics. New York: Columbia University Press

STRICKBERGER MW (1996) Evolution. 2nd ed. Boston, MA: Jones and Bartlett Publishers

VALENZUELA CY (1994) Epistemic restrictions in population biology. Biol Res 27: 85-90

VALENZUELA CY (1997) Non-random DNA evolution. Biol Res 30: 117-123

VALENZUELA CY, SANTOS JL (1996) A model of complete random molecular evolution by recurrent mutation. Biol Res 29: 203-212

WOESE CR (1987) Bacterial evolution. Microbiol Rev 51: 221-271