As part of a series of NIH-wide initiatives to enhance rigor and reproducibility in research, we recently launched a Web page that will serve as a clearinghouse for NIH and NIH-funded training modules to enhance data reproducibility. Among other things, the site will house the products of grants we’ll be making over the next few months for training module development, piloting and dissemination.
Currently, the page hosts a series of four training modules developed by the NIH Office of the Director. These modules, which are being incorporated into NIH intramural program training activities, cover some of the important factors that contribute to rigor and reproducibility in the research endeavor, including blinding, selection of exclusion criteria and awareness of bias. The videos and accompanying discussion materials are not meant to provide specific instructions on how to conduct reproducible research, but rather to stimulate conversations among trainees as well as between trainees and their mentors. Graduate students, postdoctoral fellows and early stage investigators are the primary audiences for the training modules.
Also included on the page are links to previously recorded reproducibility workshops held here at NIH that detail the potential and pitfalls of cutting-edge technologies in cell and structural biology.
Training is an important element of the NIGMS mission and a major focus of NIH’s overall efforts to enhance data reproducibility. In addition to the training modules we’ll be funding, we recently announced the availability of administrative supplements to our T32 training grants to support the development and implementation of curricular activities in this arena.
I hope you find the resources on this site useful, both now and as we add more in the future.
Einstein may not have actually said, “The definition of insanity is doing the same thing over and over again, but expecting different results,” but he would have made a good point. Reproducibility means nothing more than that repetition gives the expected result. Maybe it is because of a background in engineering and physical science, which means being trained to make good measurements more than to spew hypotheses, but it seems obvious that any carefully performed, well-designed experiment should be able to reproduce even an artifact with high statistical significance. Thus, much of what is published may be both statistically significant and completely false.
The increasing quest for the magical P value of 0.05, as if it were some kind of proxy for validity, is one of the saddest trends in experimental science. I would remind our colleagues that the observational and clinical sciences, where experimentation is essentially impractical, are where one’s only compelling argument may come from the statistics. In that world, you may have no good way to actually check whether your inference is correct, so you get your P value and report that you might be right, maybe, hopefully. The rest of us can do a pilot, see the answer, skip the statistics and go right to changing experimental conditions to directly test whether our previous observation predicts the outcome of the new, improved and independent (!) experiment. Then, try to nullify again, and again. No P value from a previous experiment will overcome a solid, direct test that kills your hypothesis. If you are lucky, you can report that the hypothesis survived a series of independent challenges and was not nullified. Whether it is actually true, who knows. On the other hand, the faster you discover you are wrong, the sooner you can go on to observing something that might be right!
For a very interesting and illuminating discussion of P values and some of the other issues Steve raises, see: R. Nuzzo (2014) Scientific Method: Statistical Errors, Nature 506: 150-152 (http://www.nature.com/news/scientific-method-statistical-errors-1.14700)
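To make that point concrete, here is a rough, back-of-the-envelope calculation in the spirit of the Nuzzo article. The prior probability, power and significance threshold below are illustrative assumptions, not measured values:

```python
# Of the hypotheses that reach p < 0.05, how many are actually false?
# All three inputs are assumptions chosen for illustration.

prior_true = 0.10   # assume 1 in 10 tested hypotheses reflects a real effect
alpha      = 0.05   # significance threshold
power      = 0.80   # chance of detecting a real effect when one exists

true_positives  = prior_true * power         # real effects that reach significance
false_positives = (1 - prior_true) * alpha   # null effects that reach significance by chance

false_share = false_positives / (true_positives + false_positives)
print(f"Fraction of 'significant' results that are false: {false_share:.0%}")
# With these assumptions, roughly a third of p < 0.05 findings are false positives.
```

Changing the assumed prior or power shifts the number, but the qualitative point stands: a small P value by itself says little about whether the underlying hypothesis is true.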
It is worth noting that the general term “reproducibility” conflates many different concerns, including the reproducibility of the data themselves and the strength and generalizability of the conclusions drawn from them. We anticipate that the training modules and other material posted on the clearinghouse site will address a range of problems from the technical (e.g., reagent validation) to the operational (e.g., experimental design) to the sociological (e.g., how the scientific rewards and evaluation systems influence research). We hope that they will be useful for scientists at all career stages.
I would love to see some concrete recommendations from NIH when it comes to Key Biological Resources. Unlike in physics or chemistry, work with antibodies and cell lines is more of an art. The manufacturers do not give the concentration of the product in their material data sheets, and even if they did, it might not matter, because antibody-protein binding varies by so many orders of magnitude that the reagent concentration may be relatively irrelevant. The problem here is huge: an estimated $1B per year is wasted on reagents alone (see Bradbury)! http://www.nature.com/news/reproducibility-standardize-antibodies-used-in-research-1.16827
Cell lines are alive, and HeLa cells have been shown to eventually take over other cell lines, so all it takes is that undergrad stomping through the lab once and suddenly you no longer know what you are working on!
http://www.nature.com/news/announcement-time-to-tackle-cells-mistaken-identity-1.17316
The third problem, and one that is easily addressable, is how we report these resources, which most often looks like this: NeuN monoclonal antibody from Sigma (in some city). But a look at the Sigma online catalog shows that there are many antibody products matching this description. Answering the question of which antibody / mouse / cell line was actually bought should not require the author of the paper to pull out his or her notes! http://www.nature.com/news/researchers-argue-for-standard-format-to-cite-lab-resources-1.17652
All published papers should contain RRIDs, research resource identifiers, which track research resources and can be used to aggregate papers based on the resource used.
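As a minimal sketch of what machine-checkable resource citations enable, the snippet below looks for a recognizable RRID in a methods-section string. The catalog number and RRID in the examples are made up for illustration, and the regular expression covers only the most common namespaces (AB_ for antibodies, CVCL_ for cell lines, SCR_ for software tools):

```python
import re

# Match the common RRID namespaces; other namespaces exist as well.
RRID_PATTERN = re.compile(r"RRID:\s*(AB|CVCL|SCR)_[A-Za-z0-9]+")

def has_rrid(citation: str) -> bool:
    """Return True if the citation text contains a recognizable RRID."""
    return bool(RRID_PATTERN.search(citation))

# Hypothetical methods-section entries (catalog number and RRID are invented):
print(has_rrid("anti-NeuN antibody (Sigma-Aldrich Cat# ABC123, RRID:AB_1234567)"))  # True
print(has_rrid("NeuN monoclonal antibody from Sigma"))                              # False
```

Because the identifier is a simple, unambiguous string, papers that use the same antibody, mouse line or cell line can be aggregated automatically.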
We’re very aware of these challenges, and we and others in the scientific community are continuing to work on them in multiple ways. Since I wrote this post, there have been a number of developments, including:
* An NIH-sponsored workshop on reproducibility in cell culture studies (Videocast Day 1 | Day 2).
* A new funding opportunity announcement to support the development of novel and cost-effective tools for cell line authentication.
* A series of blog posts from NIH Deputy Director for Extramural Research Mike Lauer: Updates on Addressing Rigor in Your NIH Applications; Scientific Premise in NIH Grant Applications; Scientific Rigor in NIH Grant Applications; Consideration of Relevant Biological Variables in NIH Grant Applications; and Authentication of Key Biological and/or Chemical Resources in NIH Grant Applications (see also the related FAQs).
* Awards to graduate training programs to support the development of experimental design and other curricula to enhance reproducibility.