UPDATE 12th August 2013: The paper underpinning this blog has now been properly published. Here is the PDF if you are interested:
TRANSFER FROM POSTEROUS: ORIGINALLY POSTED ON POSTEROUS MAY 10, 2012
IMPORTANT NOTICE: This blog relates to two academic papers published today. One is a paper on which I am the lead author. These comments are entirely my own and do not necessarily reflect the thoughts and opinions of my wonderful co-authors or the University of Nottingham.
There is a fascinating emerging phenomenon in the field of science: that science might be prone to systematic error. At least, there seems to be more attention to this of late. Scientists have always been aware of the potential for error in their methods. However, today is special in relation to this subject (for me anyway) because, first, there is an enlightening World View article published in Nature on this matter by Daniel Sarewitz (D. Sarewitz Nature 485, 149; 2012), and second, a small team from the University of Nottingham has had a research paper published on this very subject (R. Kerry et al. J Eval Clin Pract) (pdf above).
Sarewitz nudges towards a dimension of this phenomenon which is of utmost interest to me and my field of science, health science: that the systematic error observed in scientific method seems to be revealed only, or at least best, when the science is placed in the context of the real world. In health science we work within a framework known as evidence-based practice (EBP), and this is a living example of what Sarewitz is referring to. EBP is solely concerned with the integration of scientific findings from rigorous research processes into the shop-floor decision-making of health care professionals. So is scientific error witnessed in EBP? If so, how does that affect patient care? These are big questions, but here are some thoughts on how their answers might be informed.
First, what does the state of health science look like with regard to this error? John Ioannidis’s mid-noughties high-profile reports on the phenomenon, e.g. ‘Why Most Published Research Findings Are False’ (J. P. A. Ioannidis PLoS Med. 2, e124; 2005), caused turbulence in the field of health science, mostly medicine. He provided evidence of systematic bias in ‘gold-standard’ scientific practices. Our paper published today supports these findings: gold-standard research methods are not reliable truthmakers. But this is only the case when ‘truth’ is defined as something outside of the methodological frameworks. We tried to find a definition which was as real-world as possible, yet as tightly related to the scientific methods as possible, i.e. conclusions from systematic reviews or clinical guidelines. Right up to this point, the science looks good: tight control for bias, well powered, apparently externally valid. However, the moment you step out of the science itself, things look very different. We found that in fact there was no reliably meaningful indication of the truthfulness of a single controlled trial by its internal markers of bias control. So although a trial looks great for internal validity, this quality does not translate to the outside world. We are not the first to question the value of markers of bias control for real-world applicability.
Sarewitz states: “Researchers seek to reduce bias through tightly controlled experimental investigations. In doing so, however, they are moving farther away from the real world complexity in which scientific results must be applied to solve problems”. Voilà. The paradox is clear: the tighter trials control for bias, the less relevance they have to real-world decision making. Sarewitz also suggested that if biases were random, multiple studies ought to converge on truth. Our findings showed that in the trials examined, over time (and given that more recent trials tended to be the higher-quality ones), study outcomes tended to diverge from the truth. So the most recent and highest-quality trials were the worst predictors of truth.
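Sarewitz’s random-versus-systematic-bias point can be made concrete with a toy simulation (my own illustrative sketch, not the method of our paper; the effect sizes and noise levels are invented for illustration): if each trial’s error is independent random noise, a naive pooled average of accumulating trials homes in on the true effect, but if trials share a systematic bias, the pooled estimate converges confidently on the wrong value.

```python
import random

random.seed(1)
TRUE_EFFECT = 0.5  # hypothetical "true" treatment effect

def run_trials(n_trials, systematic_bias=0.0, noise_sd=0.3):
    """Simulate trial estimates: truth + shared systematic bias + random error."""
    return [TRUE_EFFECT + systematic_bias + random.gauss(0, noise_sd)
            for _ in range(n_trials)]

def cumulative_means(estimates):
    """Naive 'meta-analytic' running average as trials accumulate over time."""
    total, means = 0.0, []
    for i, est in enumerate(estimates, start=1):
        total += est
        means.append(total / i)
    return means

random_only = cumulative_means(run_trials(200))
with_bias = cumulative_means(run_trials(200, systematic_bias=0.4))

print(f"random error only -> pooled estimate {random_only[-1]:.2f} (truth {TRUE_EFFECT})")
print(f"shared systematic bias -> pooled estimate {with_bias[-1]:.2f} (truth {TRUE_EFFECT})")
```

The first pooled estimate settles near 0.5; the second settles near 0.9, with ever-narrowing uncertainty, despite being wrong. More trials of higher internal quality cannot rescue the second case, which is one way of reading the divergence we observed.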
There are strong scientific, professional, educational and political drivers surrounding this issue: funders base their decisions on proposals that show the greatest rigour; health scientists get better at constructing trials which are more likely to establish causal relationships (i.e. control better for bias); journals insist on trial adherence to standards of bias control; scientific panels for conferences examine abstracts for bias control; students are taught about evidential hierarchies and why highly controlled studies sit at the peak of health science; health care commissioners seek to make purchasing decisions on sight of the highest-quality evidence.
However, all may not be so bleak. There are a couple of movements which initially appear to offer some light. First, the attempts by researchers to make their trials more representative of the real world: for example, the interventions under investigation being more relevant to common practice and mechanistic principles; the trial sample being more representative of the population; outcomes being more meaningful. Second, universities and funders are becoming more concerned with ‘knowledge transfer’ methods, the idea being to seek ways to get these great, internally valid research findings into real-world practice. It seems clear, though, that these two strategies are missing the point. More representative trials are still going to be confined to internal constraints optimising internal validity; if not, their defining characteristic – the ability to establish causal relationships – will be diminished. It seems a poor situation: you’re damned if you do, you’re damned if you don’t. Knowledge transfer strategies are, in fact, at risk of further exaggerating the asymmetry between research findings and real-world practice: “Let’s just push our findings harder onto society”.
There is no quick, easy solution. However, Sarewitz alludes to the key to potential advancement with regard to this phenomenon: real-world complexity. This won’t go away. Until the real world is ready and able to absorb abstract knowledge, it is unlikely that simply improving the internal quality of science will make any difference. In fact, it could serve to harm. The relationship between science and society needs to become more symmetrical. Who knows how this can happen? Cases in point might be that the nature of ‘bias’ needs to be reconsidered, or that our notion of what is understood by causation needs investigating. The causal processes in the real world might be very different to those observed in a trial, despite the “external validity” of that trial. Even if the real world could be captured in a trial, the moment the trial finishes, that world might have changed.
Today is a great day for science, and a great day for the real-world – as is every day. Let’s keep science real.
Thanks for your comment. OK, so I am using the word “truth” to mean “scientific truth”, i.e. the outcome of scientific investigations. Of course this is variable over space-time, but I am premising this on the broad notion that the core activity of science is to reach some sort of consensus, e.g. physics edges towards an understanding of the origins of the universe. What I mean by “diverge” is that, looking BACKWARDS from what we know today as the “truth”, there seems to be no pattern to how scientific studies relate to it. I think if we were looking FORWARD we could talk in terms of “agreeing”, e.g. multiple studies seem to be agreeing with each other, and the outcome of this agreement could be called scientific truth. If this purpose of science is accepted, then when looking BACKWARDS you would expect a clear pattern whereby studies, particularly the best-quality studies, converged towards the “truth”. So, how did we get to this “truth” which doesn’t relate to trial outcomes? We defined it in numerous ways, e.g. systematic review outcomes, clinical guidelines, the totality of epidemiological and mechanistic evidence. What was clear is that no matter how you define “truth”, a “scientific” progression / convergence cannot be identified in RCT findings; i.e. RCT outcomes are random, statistical accidents. Assuming our method is valid, I see a number of explanations / ways out of the problem: 1) you simply can’t/shouldn’t attempt to define “truth”: you just roll with study outcomes day-by-day and don’t worry about it; 2) it is not the purpose of science to edge towards truth; 3) truth is constructed, not discovered; 4) health research is not a scientific activity; 5) the purpose of health research is justification, not discovery. I think you would have trouble accepting 1) and 2). Researchers won’t like 3) and 4), which leaves 5). I think this is the best position for health research to hold.
However, if this is the case then RESEARCH SHOULD NOT BE USED TO INFORM PRACTICE OR PREDICT OUTCOMES. It should be purely a way of providing evidence for something that happened (in the past), like a certificate is evidence of having attended a CPD course but does not predict future attendance. This would appease commissioners / our own agendas etc., but it turns EBP into rhetoric, not a science-based activity. Of course I agree that there are many truths, but I have focussed on an interpretation of scientific truth here. Apologies for rambling! Hope all is well.