Mutation Rates from Genome Resequencing
Motivation: You have re-sequenced several genomes after a mutation accumulation or adaptive evolution experiment. How do you infer the rates of different types of mutations from these data? What are the 95% confidence intervals on these values?
Case 1: Single-base substitutions
Assumptions: The number of mutations is small compared to the number of sites.
If you restrict your data to one genome per experimental population, then you can calculate the 95% confidence limits by assuming this is a Poisson process (poisson.test in R).
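As a rough illustration of the one-genome-per-population case, the Python sketch below computes an exact 95% Poisson interval on a mutation count and scales it to a per-site, per-generation rate. The function names (poisson_cdf, poisson_exact_ci, substitution_rate) and the example numbers are hypothetical and not taken from any experiment. The sketch assumes every genome has the same number of sites and the same number of generations. The interval is the standard exact Poisson interval, so it should agree with the limits that poisson.test reports for the same count.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed in log space for stability."""
    if k < 0:
        return 0.0
    if lam <= 0:
        return 1.0
    log_lam = math.log(lam)
    terms = (math.exp(i * log_lam - lam - math.lgamma(i + 1)) for i in range(k + 1))
    return min(1.0, math.fsum(terms))


def _solve_decreasing(f, target, lo, hi, iters=200):
    """Bisection for f(x) = target on [lo, hi], where f decreases as x grows."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def poisson_exact_ci(count, conf=0.95):
    """Exact two-sided confidence interval on the Poisson mean for an observed count."""
    alpha = 1.0 - conf
    bracket = 2.0 * count + 50.0  # generous upper bound for the search
    if count == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= count | lam) = alpha/2
        lower = _solve_decreasing(lambda lam: poisson_cdf(count - 1, lam),
                                  1.0 - alpha / 2.0, 0.0, bracket)
    # Upper limit: P(X <= count | lam) = alpha/2
    upper = _solve_decreasing(lambda lam: poisson_cdf(count, lam),
                              alpha / 2.0, 0.0, bracket)
    return lower, upper


def substitution_rate(count, sites, generations, n_genomes=1, conf=0.95):
    """Per-site, per-generation rate with confidence limits.

    Assumes one genome per independent population, all with the same
    number of sites and generations (an assumption of this sketch).
    """
    exposure = sites * generations * n_genomes  # total site-generations observed
    lower, upper = poisson_exact_ci(count, conf)
    return count / exposure, lower / exposure, upper / exposure


# Hypothetical example: 10 substitutions summed over 5 genomes,
# each with 4.6e6 sites and 20,000 generations.
rate, rate_lo, rate_hi = substitution_rate(10, 4.6e6, 20000, n_genomes=5)
print(f"rate = {rate:.3g} (95% CI {rate_lo:.3g} to {rate_hi:.3g}) per site per generation")
```

Pooling counts across genomes like this is only appropriate when each genome comes from a different population, for the reason given in the next paragraph.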
If you take multiple genomes from one experimental population, this is a type of pseudo-replication (they may have a shared evolutionary history). This makes calculating the 95% confidence intervals more complicated.
Case 2: One-time mutations
Assumptions: A mutation can only happen once per genome.
Example: Deletion of a chromosomal region. Once deleted, it can never be deleted again.
This is a type of "survival analysis". Calculate the fraction of genomes that have the mutation, then treat this as a binomial process to calculate a 95% confidence interval on that fraction. Finally, convert the fraction and its confidence limits to a per-generation rate by dividing by the number of generations (a reasonable approximation when the fraction of mutated genomes is small).
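The binomial calculation can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names (binom_cdf, binom_exact_ci, fraction_to_rate) and the example numbers are hypothetical. It uses the exact (Clopper-Pearson) binomial interval as one reasonable choice, and assumes every genome evolved for the same number of generations with a constant mutation rate. Under that assumption the probability of a genome still lacking the mutation after T generations is about exp(-rate * T), so the conversion uses -ln(1 - fraction) / T, which is close to fraction / T when the fraction is small.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    return min(1.0, math.fsum(math.comb(n, i) * p**i * (1.0 - p)**(n - i)
                              for i in range(k + 1)))


def _solve_decreasing(f, target, lo, hi, iters=200):
    """Bisection for f(x) = target on [lo, hi], where f decreases as x grows."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def binom_exact_ci(k, n, conf=0.95):
    """Exact (Clopper-Pearson) interval on the fraction, given k of n genomes mutated."""
    alpha = 1.0 - conf
    if k == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= k | p) = alpha/2
        lower = _solve_decreasing(lambda p: binom_cdf(k - 1, n, p),
                                  1.0 - alpha / 2.0, 0.0, 1.0)
    if k == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= k | p) = alpha/2
        upper = _solve_decreasing(lambda p: binom_cdf(k, n, p),
                                  alpha / 2.0, 0.0, 1.0)
    return lower, upper


def fraction_to_rate(fraction, generations):
    """Per-generation rate from the mutated fraction, assuming a constant rate.

    For small fractions this is close to fraction / generations.
    """
    if fraction >= 1.0:
        return math.inf  # every genome mutated: the rate has no finite upper bound
    return -math.log1p(-fraction) / generations


# Hypothetical example: the deletion is found in 3 of 12 genomes,
# one genome per population, after 2,000 generations each.
k, n, generations = 3, 12, 2000
frac_lo, frac_hi = binom_exact_ci(k, n)
rate = fraction_to_rate(k / n, generations)
rate_lo = fraction_to_rate(frac_lo, generations)
rate_hi = fraction_to_rate(frac_hi, generations)
print(f"rate = {rate:.3g} (95% CI {rate_lo:.3g} to {rate_hi:.3g}) per generation")
```

Because the conversion from fraction to rate is monotonic, applying it to the two ends of the binomial interval gives the corresponding limits on the rate.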
Topic revision: r1 - 2012-03-12 - 22:13:23 - Main.JeffreyBarrick