---+ Mutation Rates from Genome Resequencing

*Motivation:* You have re-sequenced several genomes after a mutation accumulation or adaptive evolution experiment. How do you infer the rates of different types of mutations from these data? What are the 95% confidence intervals on these values?

---++ Case 1: Mutations with many identical sites

*Assumptions:*
   1 The number of mutations is small compared to the number of sites.
   1 There are no back mutations (reversions).
   1 Mutation rates are constant over time and across sites.

*Example:* Single-base substitutions

*Calculation:*
   1 If you restrict your data to one genome per experimental population, then you can calculate the maximum likelihood value and 95% confidence limits from a Poisson distribution. Count the total number of mutations (_m_) and the total elapsed generations or time of independent evolution (_T_), summed over all genomes. Example: 22 point mutations found in 6 genomes that each evolved for 10,000 generations. %BR%<verbatim>
> m = 22
> T = 10000 * 6
> rate = poisson.test(m)
> rate$estimate/T
  event rate 
0.0003666667 
> rate$conf.int/T
[1] 0.0002297880 0.0005551377
attr(,"conf.level")
[1] 0.95
</verbatim>
   1 If you know the number of sites at risk for the mutation (_s_), then you can calculate a per-site mutation rate. Example: Assume these 22 point mutations are A to G substitutions and there are 1,342,726 A bases in the original genome. %BR%<verbatim>
> s = 1342726
> rate$estimate/(T*s)
  event rate 
2.730763e-10 
> rate$conf.int/(T*s)
[1] 1.711355e-10 4.134408e-10
attr(,"conf.level")
[1] 0.95
</verbatim>

---++ Case 2: One-time mutations

*Assumptions:*
   1 The mutation can only happen once per genome.
   1 The mutation rate is constant per unit time or generation.

*Example:* Deletion of an unstable chromosomal region. Once deleted, it can never be deleted again.

*Calculation:*
   1 Count the number of independent genomes that have the mutation (_m_) and the total number of genomes analyzed (_n_) at a given time (_T_).
     Example: 5 of 12 independently evolved genomes have the mutation after 10,000 generations. %BR%<verbatim>
> m = 5
> n = 12
> T = 10000
</verbatim>
   1 Calculate a maximum likelihood value and 95% exact (Clopper-Pearson) confidence limits for the fraction of independently evolved lineages that __do not have__ the mutation from your observations. %BR%<verbatim>
> p = binom.test(n - m, n)
> p

        Exact binomial test

data:  n - m and n
number of successes = 7, number of trials = 12, p-value = 0.7744
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.2766697 0.8483478
sample estimates:
probability of success 
             0.5833333 
</verbatim>
   1 If the mutations happen at a constant rate per unit time, then you can calculate the rate that gives this fraction of independent lineages without a mutation up to the given time point using the zero event term from a [[http://en.wikipedia.org/wiki/Poisson_process][Poisson process]]. If _p_ is the fraction of lineages without the mutation, then _p_ = exp(-rate * _T_), so rate = -log(_p_) / _T_: %BR%<verbatim>
> -log(p$estimate) / T
probability of success 
          5.389965e-05 
> -log(p$conf.int) / T
[1] 1.284931e-04 1.644646e-05
attr(,"conf.level")
[1] 0.95
</verbatim> R carries over the "probability of success" label from binom.test, but the values are now rates per generation. Because -log is a decreasing function, the confidence limits print in reverse order: the 95% confidence interval on the rate runs from 1.64e-05 to 1.28e-04.

This is a particularly simple type of [[http://en.wikipedia.org/wiki/Survival_analysis][survival analysis]].

---++ Issues: Pseudo-replication

---++ Issues: Different mutation rates in different lineages
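
---++ Appendix: Helper functions for Cases 1 and 2 (sketch)

The Case 1 and Case 2 calculations can be wrapped in two small R functions so that they are easy to repeat for other counts. This is a minimal sketch that assumes only base R (the stats package). The function names =poisson_rate= and =one_time_rate= and their argument names are hypothetical and are not part of the original protocol. The Case 2 helper returns the confidence limits in ascending order, since -log reverses them.

```r
# Hypothetical helpers (names are illustrative, not from the protocol).

# Case 1: rate from m mutations observed over total_time generations,
# optionally divided by the number of sites at risk.
poisson_rate <- function(m, total_time, sites = 1, conf.level = 0.95) {
  fit <- poisson.test(m, conf.level = conf.level)
  denom <- total_time * sites
  c(estimate = unname(fit$estimate) / denom,
    lower    = fit$conf.int[1] / denom,
    upper    = fit$conf.int[2] / denom)
}

# Case 2: rate of a one-time mutation found in m of n independent
# genomes after `generations` generations.
one_time_rate <- function(m, n, generations, conf.level = 0.95) {
  # Exact (Clopper-Pearson) interval on the fraction WITHOUT the mutation.
  fit <- binom.test(n - m, n, conf.level = conf.level)
  # P(no mutation) = exp(-rate * generations), so rate = -log(P) / generations.
  # -log() is decreasing: the upper limit on P gives the lower limit on the rate.
  c(estimate = -log(unname(fit$estimate)) / generations,
    lower    = -log(fit$conf.int[2]) / generations,
    upper    = -log(fit$conf.int[1]) / generations)
}

# The worked examples from this page.
case1      <- poisson_rate(22, 10000 * 6)
case1_site <- poisson_rate(22, 10000 * 6, sites = 1342726)
case2      <- one_time_rate(5, 12, 10000)

print(case1)
print(case1_site)
print(case2)
```

With the example inputs, the estimates should match the console sessions on this page (about 3.67e-04 per generation for Case 1 and 5.39e-05 per generation for Case 2). If every genome carries the one-time mutation (_m_ = _n_), the observed fraction without it is zero and the Case 2 estimate and upper limit are infinite.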
Contributors to this topic
JeffreyBarrick
Topic revision: r3 - 2012-07-25 - 15:09:39 - Main.JeffreyBarrick
Copyright ©2025 Barrick Lab contributing authors.