Introduction to Experimental Design


Whether for just a summer or for the duration of an entire Ph.D. project, working on a scientific problem is a process of trying to explain and predict the way that the natural world operates. You will typically need to use or develop new ways of measuring something about the world, or perform quantitative experiments to test a hypothesis. Because the amount of time that you have to reach an answer is finite, the whole process is a bit like a game of twenty questions. There are only so many experiments that you can do before the end of summer or before it's time to graduate. Use these attempts as wisely as if they were a limited number of wishes you have been granted. The more effectively you learn to pose questions (and debug protocols), the more progress you will make toward scientific discovery. This page is meant to provide vocabulary, ideas, and questions related to experimental design.

The essential starting point in all this is a very clear idea about the question to be answered. Without this initial investment of time and thought, it is easy to carry out an experiment which cannot answer the question either because the practical work itself has a flaw or because the results cannot be statistically analysed. This is perhaps the hardest lesson to learn.

–David Heath, An Introduction to Experimental Design and Statistics for Biology, 1995

Types of Experiments

  1. Observational Experiments
    • Description – "What is found?"
    • Appropriate when you have no prior information about the realm of possible outcomes.
      • Example: microbial evolution experiments, in vitro selection.
    • Want to record as much descriptive information as possible, because important factors are unknown.
    • Limitation: Cannot prove that a factor correlated with a given outcome causes that outcome!
  2. Manipulative Experiments
    • Explanation – "Why and how is it found?"
    • Appropriate when you are trying to determine the factors which cause a certain outcome.
      • Example: reconstructing strains with combinations of known evolved alleles to determine which ones cause a phenotype
    • Based on prior knowledge, you can enumerate or mathematically model the possible outcomes.
    • Research hypothesis generation: brainstorm a list of as many explanations as you can.
    • Consist of different manipulative treatments that contrast the factor being tested for causality.
      • Treatments that correspond to baseline, known, natural states of the system are labelled controls.
    • Limitations: May oversimplify the problem when factors interact, and often must be conducted under unnatural, altered, or laboratory conditions that make extrapolating to nature difficult.

Statistical Design of Experiments

Before conducting an experiment, you should already know what you are going to measure to be sure that you will be able to detect the salient differences. A sample is the subset of a population that is examined. Generally one uses measurements of the sample to infer the properties of the population in an experiment. For example, the red/white colonies plated on agar to measure the fraction of a population that are Ara– and Ara+ are a sample of a few hundred cells from a population of millions of cells.

There will always be variation in measurements due to:

  • Experimental error - unavoidable imprecision in measurements of the same sample.
  • Biological variation - differences between individuals (cultures, strains) making up the sample which are nominally the same.
  • Variation in space and time - variation in measurements due to unplanned changes in environmental conditions (Ex: performing measurements on different days). Can be reduced by making measurements to be compared at the same time and with a randomized design in physical space.
  • Sampling error - uncertainty in making inferences about the population because of chance variation in the makeup of the sample that was actually tested. Can be reduced by increasing sample size.
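
The effect of sample size on sampling error can be illustrated with a rough sketch. The example below estimates a population fraction from colony counts and attaches a Wilson score interval, which is one common choice of confidence interval for a proportion and not necessarily the one used in this lab. The function name and the colony counts are hypothetical.

```python
import math

def proportion_with_ci(successes, total, z=1.96):
    """Estimate a population fraction from a sample, with a Wilson score interval.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, center - half_width, center + half_width

# Hypothetical counts: red colonies out of all colonies counted
for red, total in [(50, 100), (200, 400), (800, 1600)]:
    p, low, high = proportion_with_ci(red, total)
    print(f"n={total:5d}  fraction={p:.2f}  95% CI=({low:.3f}, {high:.3f})")
```

The estimated fraction is the same in all three cases, but the interval narrows as more colonies are counted.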

Calculate the observations that you expect under the null hypothesis (H0) that the factor is not important. Then compute the probability of obtaining results at least as extreme as the ones you actually observed if H0 were true. If this probability falls below a significance threshold chosen in advance (commonly 0.05), reject the null hypothesis in favor of the alternative hypothesis (H1) that the factor is explanatory.
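
As one illustration of testing against a null hypothesis, here is a minimal sketch of a permutation test on the difference in means between a control and a treatment. This is a swapped-in technique chosen because it needs only the Python standard library; the page does not prescribe a particular test. The function name and the measurements are hypothetical.

```python
import random
from statistics import mean

def permutation_test(control, treatment, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Under H0 the group labels are interchangeable, so we shuffle the pooled
    values and count how often a shuffled difference is at least as large
    as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(treatment) - mean(control))
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical measurements (for example, relative fitness values)
control = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03]
treatment = [1.10, 1.12, 1.08, 1.11, 1.09, 1.13]
p_value = permutation_test(control, treatment)
print(f"p = {p_value:.4f}")
```

The replicates passed in must be statistically independent for the shuffling argument to hold.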

Before starting an experiment you should perform a statistical power analysis to be sure that you will be able to detect significant differences with your experimental error, number of replicates, etc.
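
A rough power calculation can be sketched as follows, assuming a two-sided comparison of two group means with equal replicate numbers and a known standard deviation. It uses the standard normal approximation, which slightly underestimates the replicates needed when numbers are small. The function name and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def replicates_per_group(effect_size, sd, alpha=0.05, power=0.8):
    """Approximate replicates per group needed to detect a difference in means.

    effect_size: smallest difference between group means worth detecting
    sd: expected standard deviation of a single measurement
    alpha: two-sided significance threshold
    power: desired probability of detecting the difference if it is real
    """
    if effect_size <= 0 or sd <= 0:
        raise ValueError("effect_size and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / effect_size) ** 2
    return math.ceil(n)

# Hypothetical: detect a difference equal to one, or one half, standard deviation
print(replicates_per_group(effect_size=1.0, sd=1.0))
print(replicates_per_group(effect_size=0.5, sd=1.0))
```

Halving the difference you want to detect roughly quadruples the number of replicates required, which is worth knowing before the experiment starts.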

To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.

–Ronald Fisher, Presidential Address to the First Indian Statistical Congress, 1938.

Other important considerations:

  • Keep the statistical analysis as simple as possible. The more complex your analysis, the more difficult it will be to explain to others exactly what you did, the more likely you might make mistakes in calculations, and the more likely you are to misinterpret the results.
  • Test one factor at a time. If possible, change only one variable between treatments.
  • Beware of pseudo-replication. Be sure that your replicates are entirely statistically independent. Three measurements of the same sample have a very different type and amount of error than three measurements of different samples.
  • Beware of unintended correlations in time. Measure everything that you want to compare at the same time (in one experimental block). If this is not possible (for example, if some assays fail and you want to redo those), you should include internal controls that were measured both times in order to detect and account for systematic biases between the two experimental blocks of measurements.
  • Fully randomize your design. To account for variation in time and space, use a random number generator to make the placement and order of samples random with respect to your treatments. If this is too cumbersome, alternate systematically between samples in the different treatments. Examples of problems to avoid include putting all flasks for one treatment on one side of the incubator or processing all of them before the other samples.
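
Randomizing placement and processing order can be done in a few lines. The sketch below assigns every replicate of every treatment to a random position; the function name, treatment labels, and replicate count are hypothetical.

```python
import random

def randomize_layout(treatments, replicates, seed=None):
    """Return a list of (position, treatment, replicate) in random order.

    Positions are numbered from 1 and can stand for incubator slots
    or for the order in which samples are processed.
    """
    rng = random.Random(seed)
    samples = [(t, r) for t in treatments for r in range(1, replicates + 1)]
    rng.shuffle(samples)
    return [(pos, t, r) for pos, (t, r) in enumerate(samples, start=1)]

# Hypothetical design: two treatments, four replicate flasks each
layout = randomize_layout(["control", "treated"], replicates=4, seed=42)
for position, treatment, replicate in layout:
    print(f"position {position}: {treatment} replicate {replicate}")
```

Recording the seed in your notebook lets you regenerate the same layout later.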

Making Measurements

  • Is my assay working?
    • Sanity check: Are the values I'm getting realistic? Do they agree with what is known from the literature?
    • Positive control: Do you have a strain or sample that should give an expected result in the assay? Can you achieve the expected result?
  • How reliably is my assay working?
    • Does it give the same values from day to day?
  • Am I measuring what I think I'm measuring?
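
One way to put a number on day-to-day reliability is the coefficient of variation (standard deviation divided by the mean) of a control sample measured repeatedly. The sketch below compares variation within a day to variation between days. The function name and all readings are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return stdev(values) / m

# Hypothetical readings of one control sample, three technical repeats per day
readings_by_day = {
    "day 1": [0.50, 0.52, 0.51],
    "day 2": [0.49, 0.50, 0.51],
    "day 3": [0.58, 0.60, 0.59],
}

daily_means = [mean(values) for values in readings_by_day.values()]
for day, values in readings_by_day.items():
    print(f"{day}: within-day CV = {coefficient_of_variation(values):.1%}")
print(f"between-day CV of daily means = {coefficient_of_variation(daily_means):.1%}")
```

A between-day value much larger than the within-day values suggests a systematic shift between days, which is the situation where internal controls measured in every block become important.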

Common Tricks

  • If there is no way to manage a pure +/- treatment for the presence of some compound or factor, try spiking in a known, additional amount of what you are testing instead.

Further Information


  1. Heath, D. (1995) An Introduction to Experimental Design and Statistics for Biology. UCL Press, London.
  2. Montgomery, D.C. (2001) Design and Analysis of Experiments. John Wiley & Sons, New York.
  3. Hurlbert, S.H. (1984) Pseudoreplication and the design of ecological field experiments. Ecol. Monogr. 54:187–211.

Topic revision: r8 - 2018-10-26 - 01:12:31 - Main.JeffreyBarrick