Mutation Rates from Genome Resequencing

Motivation: You have re-sequenced several genomes after a mutation accumulation or adaptive evolution experiment. How do you infer the rates of different types of mutations from these data? What are the 95% confidence intervals on these estimates?

Case 1: Single-base substitutions

Assumptions: The number of mutations is small compared to the number of sites.

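If you restrict your data to one genome per experimental population, then each genome is an independent sample, and you can calculate the 95% confidence limits on the number of mutations by assuming that mutations accumulate as a Poisson process (poisson.test in R). Dividing the observed count and its confidence limits by the number of sites and the number of generations gives the per-site, per-generation rate.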
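As a rough illustration of the Poisson calculation, here is a minimal Python sketch that uses only the standard library. It computes the exact (Garwood) two-sided interval by bisection, which should correspond to the interval that poisson.test reports for a single count. The function names, the sites and generations parameters, and the numbers in the usage lines are all hypothetical and chosen only for illustration.

```python
import math

def poisson_cdf(k, lam):
    """Return P(X <= k) for X ~ Poisson(lam), summing the terms directly."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i    # P(X = i) computed from P(X = i - 1)
        total += term
    return min(total, 1.0)

def bisect_root(f, lo, hi, iters=200):
    """Find a root of f on [lo, hi], assuming f(lo) > 0 > f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def poisson_exact_ci(count, conf_level=0.95):
    """Exact two-sided confidence limits on the Poisson mean for one observed count."""
    alpha = 1.0 - conf_level
    # Generous upper bound for the search; adequate for the small counts assumed here.
    search_max = count + 10.0 * math.sqrt(count + 1.0) + 10.0
    if count == 0:
        lower = 0.0
    else:
        # Lower limit: the mean at which P(X >= count) equals alpha / 2.
        lower = bisect_root(
            lambda lam: alpha / 2 - (1.0 - poisson_cdf(count - 1, lam)),
            0.0, search_max)
    # Upper limit: the mean at which P(X <= count) equals alpha / 2.
    upper = bisect_root(
        lambda lam: poisson_cdf(count, lam) - alpha / 2,
        0.0, search_max)
    return lower, upper

def per_site_rate_ci(count, sites, generations, conf_level=0.95):
    """Scale the count and its limits to a per-site, per-generation rate."""
    exposure = sites * generations  # total site-generations observed
    lower, upper = poisson_exact_ci(count, conf_level)
    return count / exposure, lower / exposure, upper / exposure

if __name__ == "__main__":
    # Hypothetical numbers: 10 substitutions, 4.6e6 sites, 20000 generations.
    rate, low, high = per_site_rate_ci(10, 4.6e6, 20000)
    print(rate, low, high)
```

With one genome from each of several populations, one option is to pass the total number of mutations as count and the summed generations across populations as generations. This assumes the same rate applies in every population.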

If you take multiple genomes from one experimental population, this is a type of pseudo-replication: the genomes may share part of their evolutionary history, so they are not independent samples. This makes calculating the 95% confidence intervals more complicated.

Case 2: One-time mutations

Assumptions: A mutation can only happen once per genome.

Example: Deletion of a chromosomal region. Once deleted, it can never be deleted again.

This is a type of "survival analysis". Calculate the fraction of genomes that have the mutation and the fraction that do not. Then treat this as a binomial process to calculate a 95% confidence interval on the fraction. Finally, convert the fraction and its confidence limits to a per-generation rate by dividing by the number of generations.
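The binomial calculation can be sketched in Python as follows, again using only the standard library. The source does not say which binomial interval to use; this sketch assumes the Clopper-Pearson exact interval, which is one common choice. The function names, parameters, and the numbers in the usage lines are hypothetical.

```python
import math

def binom_cdf(x, n, p):
    """Return P(X <= x) for X ~ Binomial(n, p)."""
    if x < 0:
        return 0.0
    total = sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i)
                for i in range(x + 1))
    return min(total, 1.0)

def bisect_root(f, lo, hi, iters=200):
    """Find a root of f on [lo, hi], assuming f(lo) > 0 > f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def clopper_pearson_ci(mutated, total, conf_level=0.95):
    """Exact two-sided limits on the fraction of genomes carrying the mutation."""
    alpha = 1.0 - conf_level
    if mutated == 0:
        lower = 0.0
    else:
        # Lower limit: the p at which P(X >= mutated) equals alpha / 2.
        lower = bisect_root(
            lambda p: alpha / 2 - (1.0 - binom_cdf(mutated - 1, total, p)),
            0.0, 1.0)
    if mutated == total:
        upper = 1.0
    else:
        # Upper limit: the p at which P(X <= mutated) equals alpha / 2.
        upper = bisect_root(
            lambda p: binom_cdf(mutated, total, p) - alpha / 2,
            0.0, 1.0)
    return lower, upper

def one_time_rate_ci(mutated, total, generations, conf_level=0.95):
    """Divide the fraction and its limits by the number of generations."""
    lower, upper = clopper_pearson_ci(mutated, total, conf_level)
    fraction = mutated / total
    return fraction / generations, lower / generations, upper / generations

if __name__ == "__main__":
    # Hypothetical numbers: 3 of 12 genomes carry the deletion after 2000 generations.
    rate, low, high = one_time_rate_ci(3, 12, 2000)
    print(rate, low, high)
```

Simple division is a good approximation when only a small fraction of genomes carry the mutation. If the fraction is large and you assume a constant per-generation rate, the fraction of genomes still lacking the mutation decays as exp(-rate * generations), so -ln(1 - fraction) / generations is the corresponding conversion. It approaches fraction / generations as the fraction becomes small.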

Contributors to this topic

JeffreyBarrick

Topic revision: r1 - 2012-03-12 - 22:13:23 - Main.JeffreyBarrick