F. Health Policy Methods


Topic Outline


Hierarchy of Evidence

  • Public Health Law Research, Hierarchy of Evidence. In general, resources higher up the pyramid are less susceptible to bias and therefore provide more robust evidence about the effects of public health laws. Experimental designs, for example, use randomization and double-blinding to reduce selection and measurement biases, making them more powerful tools for understanding causal relationships than quasi-experimental and observational designs. At the top of our pyramid are studies that use systematic processes such as meta-analysis to assess a question in light of a body of primary studies that have examined it. At the bottom of our pyramid are foundational resources like legal datasets and papers setting out research agendas. The bulk of our resources are primary studies in the middle two levels. While this hierarchy reflects judgments about the authority of various designs, it does not suggest that research employing a design from a higher level is always more scientifically authoritative than research conducted in a design from a lower level.

Reproducibility of Evidence

Dr. Marcia Angell, a former editor in chief of The New England Journal of Medicine, wrote in 2009: “It is simply no longer possible to believe much of the clinical research that is published, or to rely on the judgment of trusted physicians or authoritative medical guidelines. I take no pleasure in this conclusion, which I reached slowly and reluctantly over my two decades as an editor of The New England Journal of Medicine.”

Similarly, the editor in chief of The Lancet, Richard Horton, wrote in 2015: “Much of the scientific literature, perhaps half, may simply be untrue. Afflicted by studies with small sample sizes, tiny effects, invalid exploratory analyses, and flagrant conflicts of interest, together with an obsession for pursuing fashionable trends of dubious importance, science has taken a turn towards darkness.”

  • Ioannidis JPA (2005) Why Most Published Research Findings Are False. PLoS Med 2(8): e124, August 30, 2005. There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias. In this essay, I discuss the implications of these problems for the conduct and interpretation of research.
  • Open Science Framework, Reproducibility Project: Psychology (2015). The Reproducibility Project: Psychology was a collaborative, crowdsourced effort of 270 authors and 86 additional volunteers. Across multiple criteria, we successfully reproduced fewer than half of the 100 original findings investigated. Multiple analysis methods were used, examining p-values, effect sizes, a meta-analysis combining original and replication effects, and subjective assessments to evaluate the success of each replication. Stronger original effects were correlated with stronger replication effects. Other correlates of reproducibility were investigated, examining possible influences on replication success. This study was reported in Science on August 28, 2015. According to a report in Nature, “John Ioannidis, an epidemiologist at Stanford University in California, says that the true replication-failure rate could exceed 80%, even higher than Nosek’s study suggests. This is because the Reproducibility Project targeted work in highly respected journals, the original scientists worked closely with the replicators, and replicating teams generally opted for papers employing relatively easy methods — all things that should have made replication easier.”
  • Ziliak, Steven and Deirdre McCloskey. The Cult of Statistical Significance. The authors argue that a sizable body of economic and medical literature privileges the narrow question of statistical significance over the far more important question of effect size. As a consequence, important findings are ignored in favor of findings that happen to be precise (even if the latter are inconsequential in terms of impact). A book-length treatment of this topic is The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives.
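
The framework in the Ioannidis essay above can be made concrete with a little arithmetic. The following is a minimal sketch, not the author's own code: it computes the post-study probability that a claimed finding is true (the positive predictive value, PPV) from the pre-study odds R of a true relationship, the significance level alpha, the study power (1 − beta), and a bias term u, using the formulas as they appear in the essay. The function name `ppv` and the example inputs are illustrative assumptions.

```python
def ppv(R, alpha=0.05, power=0.8, bias=0.0):
    """Post-study probability that a claimed (significant) finding is true.

    R     -- pre-study odds of true to no relationships in the field
    alpha -- type I error rate (significance threshold)
    power -- 1 - beta, the probability of detecting a true relationship
    bias  -- u, the share of would-be negative analyses reported as positive

    Formula (as given in Ioannidis 2005):
        PPV = ((1 - beta) R + u beta R)
              / (R + alpha - beta R + u - u alpha + u beta R)
    """
    beta = 1.0 - power
    # Claimed findings that reflect a true relationship
    true_claims = (1.0 - beta) * R + bias * beta * R
    # Claimed findings where no relationship exists
    false_claims = alpha + bias * (1.0 - alpha)
    return true_claims / (true_claims + false_claims)


if __name__ == "__main__":
    # Hypothetical scenarios chosen only to illustrate the trend.
    scenarios = [
        ("well powered, 1:1 odds, little bias", 1.0, 0.80, 0.10),
        ("underpowered, 1:5 odds, some bias", 0.2, 0.20, 0.20),
        ("exploratory, 1:1000 odds, heavy bias", 0.001, 0.20, 0.80),
    ]
    for label, R, power, bias in scenarios:
        print(f"{label}: PPV = {ppv(R, power=power, bias=bias):.4f}")
```

With no bias, a claimed finding is more likely true than false only when (1 − beta)·R exceeds alpha, which is why small studies in fields with low pre-study odds fare so poorly. In the first scenario the PPV is about 0.85; it falls sharply as power and pre-study odds drop and bias rises.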
