Tag Archives: University of Nottingham

I Don’t Get Paid Enough To Think This Hard

For well over a decade, I have been teaching healthcare professionals, mainly physiotherapists, about stuff. Although wrapped up in many guises, this “stuff” has essentially been thinking. Thinking in healthcare professions is packaged up as clinical reasoning.  I’ve always thought this to be a good thing: that we work out possible diagnostic hypotheses with our patients, use the best of our knowledge, experience and evidence to test those hypotheses, and judge from a variety of evidence sources the best treatment options. The alternative is either blindly following set guidelines, or making random decisions.

I really enjoy teaching this stuff.  I love working with students to get the best out of their brains, and see their thought processes and their clinical practice develop.  I love the literature on this stuff, and have indeed often published about it myself. I have a pop-art poster of Mark Jones in my bedroom (Fig 1).


Fig 1: My pop-art poster of Mark Jones, Clinical Reasoning guru.

I ran “Clinical Reasoning” modules at my place of work for undergraduates and postgraduates for years.  I have helped develop reasoning tools. I guess I think it’s fairly important to what we do as clinicians.

However, a few years ago whilst teaching on a course, halfway through a case study exercise, one of the delegates turned and said “I don’t get paid enough to think this hard”.  At the time, and for several years since, this struck me as astonishing – in a negative way. What? This is part of your job! This is how you can strive to get the best out of your patients; it’s demanded by your regulator; it’s a necessary condition of clinical practice; blah blah blah. But recently it struck me that he might have a point.

What is our price, and does this reflect the measures we go to to achieve our end? What absolute difference does investing the time, energy and resources necessary for “advanced thinking” make to clinical outcomes? (we don’t know). Could we drift through our careers following guidelines and making random decisions, and still do OK for our patients? (maybe). How does our price compare with other “thinking” professions, Law, for example? (poorly). What is the impact of all this stuff on our emotional, social, psychological, and physical status? (significant). How has doing this stuff changed in an era of evidence-based practice? (dramatically).

On the last point there, clinical reasoning may once have been a process of applying a line of logic to a patient contact episode: “they said they twisted this way, it hurts there, therefore this is the problem so I’ll wiggle this about for a bit”. Clinical reasoning is becoming more-and-more synonymous with evidence-based practice (EBP), and EBP looks very different to the above scenario. EBP is about integrating the best of the published evidence with our experiences and patient values. How do you do that!? Well, this is the stuff that I try and teach, and this may have been the tipping-point for our friend’s critical statement.

Consider the state of thinking in the modern healthcare world: First, the published evidence. There are at least 64 monthly peer-reviewed journals relevant to the average rehabilitation physiotherapist (that’s excluding relevant medical journals, in which there is a growing amount of physio-relevant data). These have an average of around 30 research papers each, each paper being around 8 detailed pages. That’s 15,360 pages of ‘evidence’ per month, or 768 per working day. Some, of course, won’t be relevant, but whichever way you look at it, this is an unmanageable amount of data to integrate into everyday clinical decision making. Many of these papers are reviewed and critiqued, so the clinician should be aware of these too. Many of these critiques are themselves critiqued, and this level of thinking and analysis would also be really useful in understanding the relationship between data and clinical decision-making. EBP does have tools to help with data-driven decision making. These require the clinician to have a continually evolving understanding of absolute and relative risk, the nuances of the idea of probability (don’t even get me started on that one), a Vorderman-esque mind – or at least the latest app to do the math for you, and time.

Arrhh, time. The average physiotherapist will, say, work an 8 hour day, seeing a patient on average every half-an-hour or so. That half-hour involves taking important information from the patient and undertaking the best physical tests (which are..?) and treatments (which are…?), then recording all of that (don’t forget the HCPC are on your back young man – a mate of a mate of someone I know got suspended last week for shoddy note-keeping. How would I pay the mortgage?). So when is that evidence read, synthesised, and applied? No worries, in-service training sessions at lunch-time will help (no lunch or toileting for me then). What about evenings and weekends – yes, lots of thinking does occur here (but what about the wife and kids).  I know there is no training budget for physiotherapists, but you can do some extra on-call or private work to pay for those courses can’t you? (Yes. When?) You get annual leave don’t you? That’s another great opportunity to catch up on your thinking education (Cornwall will wait).

Thinking this hard costs. It costs time, money, energy, opportunity and health. Do we get paid enough to think this hard? Maybe our critical friend had a point. However, the pay isn’t going to change, so the thinking has to. Is this a signal that we are at a stage of development in healthcare when ‘thinking models’ need to be seriously revised in a rapidly evolving, data-driven world? Thinking was, is, and will always be central to optimal patient care, but how we do it needs to be re-analysed. Quickly. Think about it.



Filed under Uncategorized

Argument formation for academic writing


Many students find it difficult to identify what it is that makes a good piece of academic writing. At the core of such writing is the nature and structure of the intellectual argument. Here is some information that we share with our Physiotherapy students at the University of Nottingham to help with their understanding of arguments. I hope you find it useful.

Argument formation

The idea of a basic argument is fairly simple. An argument is formed of ‘premises’ and ‘conclusions’. In a valid argument, if the premises are true then the conclusion must also be true (which is what you want in an essay, i.e. you don’t want to draw false or unstable conclusions). So, the classic example is:

Premise 1: All men are mortal

Premise 2: Socrates is a man

Conclusion: Socrates is mortal

Do you see that if P1 and P2 are true, then the conclusion HAS to be true?

So, if it REALLY IS true that all men are mortal, and it REALLY IS true that Socrates is indeed a man, then it HAS TO BE THE CASE that Socrates is mortal. Yes? Do you get that?

Make sure you fully understand this basic principle before reading any further!

OK, so let’s look at another example:

P1: Lucy is a physio

P2: All physios wear white tunics

Conclusion: Lucy wears a white tunic

Get it? Of course you do.

So the two examples above are cases of a good, robust deductive argument – the conclusion is deduced from the premises. We’ll come onto how this looks in an essay in a moment.
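If you like to see things mechanically, here is a minimal sketch of what “valid” means. It simplifies the Socrates example to two yes/no statements (an assumption for illustration, since the real syllogism talks about “all men”), and the function name `is_valid` is made up for this post. An argument form counts as valid if there is no combination of truth values that makes every premise true and the conclusion false.

```python
from itertools import product


def is_valid(premises, conclusion, n_vars):
    """Return True if no truth assignment makes all premises true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # found a counter-example
    return True


# m = "Socrates is a man", d = "Socrates is mortal"
# Premises: "if he is a man then he is mortal" and "he is a man".
socrates_valid = is_valid(
    premises=[lambda m, d: (not m) or d, lambda m, d: m],
    conclusion=lambda m, d: d,
    n_vars=2,
)

# A poor form for contrast: "if he is a man then he is mortal", "he is mortal",
# therefore "he is a man".
reversed_valid = is_valid(
    premises=[lambda m, d: (not m) or d, lambda m, d: d],
    conclusion=lambda m, d: m,
    n_vars=2,
)

print(socrates_valid, reversed_valid)
```

Note that this only checks the logical form. It says nothing about whether the premises are actually true, which is a separate job.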

Now, here are four types of poor arguments:

Type 1: false premises

This is a simple mistake. Consider the above ‘physio’ example. You would have most likely noticed that the two premises are full of assumptions: 1) that Lucy is in fact a Physio, and 2) that all physios do in fact wear white tunics. The actual truth of the conclusion not only relies on the logical flow, but the accuracy of the detail within that flow. So, an argument can be logically correct – i.e. its logical form is robust, but the factual accuracy of the premises may render it poor.

This is very important in essay writing, and will be addressed again below.

Type 2: The inductive fallacy (over-generalising)

P1: I have seen 1 white swan

P2: I have seen 2 white swans

P3: I have seen 3 white swans, and so on….

Conclusion: all swans are white

This is a poor argument because it could always be the case that there is a black swan which you haven’t seen. Therefore the generalisation that “all swans are white” is not justified by the premises. So a Physio example:

P1: I have seen ultrasound work on ankle pain once

P2: I have seen ultrasound work on ankle pain twice

P3 ….n: etc etc

Conclusion: Ultrasound works for ankle pain

Type 3: another type of over-generalising – ideas and data

P1: There has been a lot of music in Nottingham lately

P2: Lots of people think that Nottingham is the music capital of Europe

Conclusion: Nottingham is the music capital of Europe

So the conclusion is not necessarily true, even though the premises might be true. Why?  Well, there are two issues:

i) Although the premises might be true, their relationship with each other, and the conclusion, is tenuous. Compare the robustness of the relationship between components in the first Socrates example, with those here.   See how the concepts of ‘mortality’ and ‘Socrates’ are distributed between the premises, linked by the idea of ‘man’.  Notice that ‘man’ does not appear in the conclusion – that idea has already done its job. ‘Socrates’ and ‘mortality’ are the only ideas that re-appear in the conclusion.

In the music example, there is no such pattern. Both ideas of ‘music’ and ‘Nottingham’ appear in both P1 and P2. They are not linked by a central, meaningful idea. P1 and P2 are simply independent commentaries on a similar theme.

Also note that in the Socrates example, both P1 and P2 are necessary conditions for the conclusion, as well as being independently insufficient for it, i.e. they are needed by each other, and by the conclusion. These relationships do not exist in the music example, e.g. that a lot of people thinking that Nottingham is the music capital of Europe is not a necessary condition for Nottingham being the music capital of Europe.

ii) There is missing data! To claim that “Nottingham is the music capital of Europe” relies on something other than what has happened in Nottingham and what people think. It relies on the music rate in other European cities.

MUSIC BREAK: Da da da da da da da da daaaa

As it happens, Nottingham most likely is the music capital of Europe! For example, here’s a great band which comes from Nottingham:


and you can “like” their Facebook page here:



Type 4: alternative explanations

Premise 1: Contraceptive pills prevent unwanted pregnancy.

Premise 2: John takes the contraceptive pill and he isn’t pregnant.

Conclusion: The contraceptive pill prevented John’s unwanted pregnancy.

Here, again, both P1 and P2 may well be true, but the conclusion isn’t true because there is an obvious alternative explanation for why John does not get pregnant – he is a man.

Constructing arguments in essay form

Now, how does all this relate to your academic writing? Simple. This basic line of reasoning is what we look for in your overall writing piece.

Here’s an over-simplified example: Let’s say you set out to write an essay on the effectiveness of manual therapy on neck pain. You might structure your argument something like this:

P1:  Manual therapy for neck pain has some RCT-level evidence

P2: RCTs give good evidence of effectiveness

C: Manual therapy is effective for neck pain.

This seems fairly simple right? But let’s break it down:

The conclusion is wholly reliant on the truthfulness of the premises. In other words, if P1 or P2 were false, the conclusion would no longer be supported. Further, P1 and P2 are both necessary yet individually insufficient conditions for C.  Notice that the ideas of ‘manual therapy’ and ‘effectiveness’ are linked by the idea of ‘RCTs’ in the premises, and the ‘RCT’ does not appear again in the conclusion.

The argument has avoided the induction fallacy of over-generalisation. There is no obvious over-generalisation of the conclusion. So, you could have said:

P1: 10 case studies show that manual therapy is good for neck pain

Conclusion: manual therapy is good for neck pain

This would have fallen into the induction fallacy.

The premises and the conclusion are satisfactorily related (unlike the music example), and have thus avoided the ‘lack of robustness / missing data’ issues. So, you could have said:

P1: 10 case studies show that manual therapy is good for neck pain

P2: a number of authors state that manual therapy is good for neck pain

Conclusion: manual therapy is good for neck pain

This would have been a mistake, as per the music example. There is missing data, e.g. no consideration of tests of effectiveness.

So we can see how easy it is to develop a valid and robust argument to build your essay around. If you have avoided the common errors in logical form, all you need to do now is to test the truthfulness of the individual premises. This means, in the case given here, you would be discussing the relative quality of different types of manual therapy studies, and trying to show that manual therapy has some RCT-level studies, before drawing your logical conclusion. Once you have those conclusions, you can then go on to discuss the consequences / implications / context etc of them.

Remember two main things:

1)      Make sure you have a VALID LOGICAL STRUCTURE

2)      When you have that, the aim of your essay is to DEMONSTRATE THE TRUTH OF THE PREMISES.

If you show these two simple things, you are half-way there. The other half is how clearly and concisely you can write!

And finally, I recommend buying “A Rulebook for Arguments” by Anthony Weston. You can get it for about £4 on Amazon.

Happy arguing 🙂


Clinical Prediction Rules: a little primer

Clinical Prediction

 How do we understand if a patient is likely to:


a)      have the condition/dysfunction we think they have?
b)      respond in a meaningful and positive way to a chosen intervention?
c)       get better within a particular time-period?


Clinical Prediction Rules (CPR)

Algorithmic tools developed to assist clinical decision-making by interpreting data from original studies into probabilistic statistics.


  • Diagnostic CPR (DCPR)


  • Interventional CPR (ICPR)


  • Prognostic CPR (PCPR)


CPR Quality


Grading Scale IV (poor) to I (best) (McGinn et al, JAMA, 2000;284:79-84)

IV     Derivation only

III    Validation in narrow population

II     Validation in broad population

I      Impact Analysis




Derivation

  • Data derived from primary study (ideally RCTs, but usually prospective cohort studies).
  • Need to clearly define target conditions, reference standards, and potential predictor variables.
  • Need dichotomous outcomes (e.g. “condition present or absent” / “treatment successful or unsuccessful” / “condition persistent or not-persistent”).
  • Study needs to report PROPORTIONS.



Validation

  • Separate study with either a narrow or broad population (ideally RCT, ideally block-randomisation).
  • Separate study subjects and therapists to primary study.
  • To confirm that predictor variable effects are not due to chance.


Impact analysis

  • To determine meaningful impact on clinical practice (ideally RCT)
  • Multi-centre
  • Is rule implemented?
  • Does it maintain improved care?


Formal quality assessment

DCPR: no validated formal assessment tool

ICPR: 18-item tool.  Beneciuk et al, Phys Ther,2009; 89:10-11

PCPR: 18-item tool. Kuijpers et al, Pain, 2005;109:429-430


Statistics of interest

The whole world can be represented by a “2 x 2” contingency table!

                          Ref Standard P / Outcome P    Ref Standard N / Outcome N    Total
Test P / Control Group    a (TP)                        b (FP)                        a+b (TP+FP)
Test N / Rx Group         c (FN)                        d (TN)                        c+d (FN+TN)
Total                     a+c (TP+FN)                   b+d (TN+FP)




Diagnosis / Intervention

SENSITIVITY (“TP rate”) = a/(a+c)    SnNOut

SPECIFICITY (“TN rate”) = d/(b+d)    SpPIn

LIKELIHOOD RATIO + = sensitivity/(1-specificity)

LIKELIHOOD RATIO – = (1-sensitivity)/specificity

Probability shifts:

LR+

1 – 2: small, unimportant

2 – 5: small but possibly important

5 – 10: moderate

>10: large, possibly conclusive

LR–

0.5 – 1: small, unimportant

0.2 – 0.5: small but possibly important

0.1 – 0.2: moderate

<0.1: large, possibly conclusive

Intervention only

CONTROL EVENT RATE (CER) = number of Control Group people with +ve outcome divided by total number of Control Group people, i.e. a/(a+b)

EXPERIMENTAL EVENT RATE (EER) = same as above for Rx Group, i.e. c/(c+d)

RELATIVE RISK, or RISK RATIO (RR): RR = EER/CER (a RR of 1 means there is no difference between groups; >1 means increased rate of outcome in Rx group, and <1 means less chance of outcome)

ABSOLUTE RISK REDUCTION (ARR): ARR = CER – EER

RELATIVE RISK REDUCTION (RRR): RRR = (CER – EER)/CER

NUMBER NEEDED TO TREAT (NNT): NNT = 1/ARR

ODDS RATIO (OR): OR = EEO/CEO, where EEO = c/d and CEO = a/b (the greater above 1, the better)

EFFECT SIZE = [(mean score of group 1) – (mean score of group 2)] / SD (of either group, or even pooled data)
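The probability-shift bands for likelihood ratios listed above can be turned into a small look-up. This is only a sketch: the function names are made up for this post, the last LR– band is read as “below 0.1”, and a value sitting exactly on a boundary (e.g. an LR+ of exactly 5) is assumed to fall into the smaller-shift band, because the bands as written overlap at their edges.

```python
def interpret_lr_positive(lr):
    """Describe the probability shift for a positive likelihood ratio (1 or greater)."""
    if lr < 1:
        raise ValueError("LR+ is expected to be 1 or greater")
    if lr > 10:
        return "large, possibly conclusive"
    if lr > 5:
        return "moderate"
    if lr > 2:
        return "small but possibly important"
    return "small, unimportant"


def interpret_lr_negative(lr):
    """Describe the probability shift for a negative likelihood ratio (between 0 and 1)."""
    if not 0 <= lr <= 1:
        raise ValueError("LR- is expected to be between 0 and 1")
    if lr < 0.1:
        return "large, possibly conclusive"
    if lr < 0.2:
        return "moderate"
    if lr < 0.5:
        return "small but possibly important"
    return "small, unimportant"


# The two LR+ values quoted in the examples later in this primer
print(interpret_lr_positive(4.3))
print(interpret_lr_positive(24.4))
```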


Other tests:

Test / Statistic: Purpose (outcome statistic)

T-test: Difference between 2 groups (p-value)

Chi-Squared: Frequency of observations (p-value)

Receiver Operating Characteristic (ROC) curve: Identifies score which maximises TPs and minimises FPs (Youden’s J)

Logistic Regression: Identifies predictor cut-off points, and predictor clusters (Beta-values (Expβ))

Recursive Partitioning: Repeated sub-group analysis to identify best-fit patients (index of diversity)

Confidence Intervals: Describe precision (variance) (%)


Examples of Lower Quadrant CPRs


Diagnosis (Medical Screening – DCPR)


Target Condition:  Deep vein thrombosis (lower limb).


Test:  Wells’ Score (Wells et al, J Intern Med, 1998;243:15-23)


Quality: Level  I (impact analysis: 10 relevant acceptable quality associated studies)


Test details (predictor variables)


  1. Active cancer: 1
  2. Paralysis, paresis, or recent plaster immobilisation of the lower extremity: 1
  3. Recently bedridden for >3 days and/or major surgery within 4 weeks: 1
  4. Localised tenderness along the distribution of the deep venous system: 1
  5. Thigh and calf swollen: 1
  6. Calf swelling 3cm > asymptomatic side (measured 10cm below tibial tuberosity): 1
  7. Pitting oedema; symptomatic leg only: 1
  8. Dilated superficial veins (non-varicose) in symptomatic leg only: 1
  9. Alternative diagnosis as or more likely than DVT: -2


Test scoring

≤ 0 points            Low Risk                               6% probability of DVT

1 or 2 points       Moderate Risk                  28% probability of DVT

≥ 3                          High risk                               73% probability of DVT
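To make the scoring concrete, here is a rough sketch of how the points and risk bands above add up. It is an illustration of the arithmetic only, not a clinical tool: the dictionary keys and function names are made up for this post, and the probabilities are simply the ones quoted in the scoring list above.

```python
# Points per predictor variable, as listed above (key names are hypothetical)
WELLS_ITEMS = {
    "active_cancer": 1,
    "paralysis_paresis_or_recent_plaster": 1,
    "bedridden_or_major_surgery": 1,
    "localised_tenderness": 1,
    "thigh_and_calf_swollen": 1,
    "calf_swelling_3cm": 1,
    "pitting_oedema": 1,
    "dilated_superficial_veins": 1,
    "alternative_diagnosis_as_likely": -2,
}


def wells_score(findings):
    """Sum the points for every predictor marked as present (True)."""
    unknown = set(findings) - set(WELLS_ITEMS)
    if unknown:
        raise KeyError(f"Unknown predictor(s): {sorted(unknown)}")
    return sum(WELLS_ITEMS[name] for name, present in findings.items() if present)


def wells_risk(score):
    """Map a total score to the risk band and DVT probability quoted above."""
    if score <= 0:
        return ("Low risk", 0.06)
    if score <= 2:
        return ("Moderate risk", 0.28)
    return ("High risk", 0.73)


# Hypothetical patient: three positive findings, but an alternative diagnosis is as likely
patient = {
    "localised_tenderness": True,
    "pitting_oedema": True,
    "calf_swelling_3cm": True,
    "alternative_diagnosis_as_likely": True,
}
score = wells_score(patient)
print(score, wells_risk(score))
```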


Reference standard(s): Plethysmography and venography


Study parameters


Inclusion:

Signs and symptoms for < 60 days

Exclusion:

Previous DVT or PE

Renal insufficiency

PE suspected

Anticoagulation treatment for >48 hours

Below-knee amputation

Strong alternative diagnosis



Bottom Line

Best quality CPR (Level I): Recommended for clinical use within confines of study parameters.








Diagnosis (Orthopaedic Diagnosis – DCPR)


Target condition: Lumbar/buttock/leg pain arising from the sacroiliac joint


Test: 6-item predictor cluster based on physical examination responses (Laslett et al, Man Ther, 2005;10:207-218)


Quality: Level IV (derivation only (poor quality), no validation, no impact analysis); no regression analysis, no recursive partitioning.


Test details (predictor variables)


  1. Positive SIJ compression test
  2. Positive SIJ distraction test
  3. Positive femoral shear test
  4. Positive sacral provocation
  5. Positive right Gaenslen’s test
  6. Positive left Gaenslen’s test


Test scoring

3 or more predictor variables present = LR+ 4.3 (95% CI 2.3 – 8.6)


Reference standard(s): fluoroscopy-guided provocative SIJ injection


Study parameters

Mean age 42 (+/- 12.3)

Mean symptom duration (months) 31.8 (+/- 38.8)

Inclusion:

Buttock pain +/- lumbar/leg pain

Had imaging

Unsuccessful previous therapeutic interventions

Exclusion:

Mid-line or symmetrical pain above L5

Nerve root compression signs

Referred for non-SIJ injection

Too frail for manual therapy

Pain free on day of assessment

Bony obstruction to injection


Bottom Line:

Not validated –  study findings could be due to chance. Small probability shift power of LR+. Not recommended for clinical use.

Interventional (ICPR)


Target condition: Acute low back pain, manipulation


Test: 5-item predictor cluster (Flynn et al, Spine, 2002;27:2835-2843)


Quality: Level II (broad validation, no impact analysis. 4 associated high quality validation studies)


Test details (Predictor variables)

  1. No pain below knee
  2. Onset ≤ 16 days ago
  3. Lumbar hypomobility
  4. Medial hip rotation > 35deg (either hip)
  5. Fear Avoidance Belief Questionnaire (Work subscale) <19



Test Scoring

4 or more predictor variables present = LR+ 24.4 (95% CI 4.6 – 139.4)


Reference Standard(s) (i.e. definition of success) 

50% or more improvement on modified Oswestry Disability Index


Study parameters

Mean age 37.6 (+/- 10.6)

Inclusion:

Lumbosacral physiotherapy diagnosis

Pain +/- numbness lumbar/buttock/lower extremity

Modified ODI score ≥ 30%


Bottom Line

Due to validation and very large probability shift power of LR+, this CPR is recommended for clinical use within confines of study parameters.


Prognostic (PCPR)


Target condition: LBP, recovery


Test: 3-item predictor cluster (Hancock et al, Eur J Pain, 2009;13:51-55)


Quality: Level III (narrow validation, no impact analysis)


Test details (Predictor variables)

  1. Baseline pain ≤ 7/10
  2. Duration of symptoms  ≤5 days
  3. Number of previous episodes ≤1


Test Scoring

All 3 predictor variables present = 60% chance recovery at 3 weeks, 95% chance recovery at 12 weeks



Reference Standard(s) (i.e. definition of success)

7 days of  pain scored at 0-1 /10


Study parameters

Mean age 40.7 (+/- 15.6)

Inclusion:

LBP (between 12th rib and buttock crease) +/- leg pain

Seen GP

< 6 weeks duration

Moderate pain and disability (SF-36)

Exclusion:

No pain-free period pre-current episode of at least 1 month

Known/suspected serious pathology

Nerve root compression

Taking NSAIDs

Receiving spinal manipulative therapy

Surgery within preceding 6 months

Contraindications to analgesics or manipulation.



Bottom Line

Due to validation, careful application to clinical practice is recommended, strongly within confines of study parameters.





Ref: Glynn PE, Weisbach PC 2011 Clinical Prediction Rules: A Physical Therapy Reference Manual. JBL Publishing


How to turn “Stats” into something useful: Diagnosis and Interventions

How to turn “Stats” into something useful 1: Diagnosis

Understanding diagnostic utility

If you have data for diagnostic utility studies, you can use a 2×2 contingency table to calculate the following information (this should have been reported anyway in the study, but often isn’t):


                    Gold standard +    Gold standard –
Clinical test +     a (TP)             b (FP)
Clinical test –     c (FN)             d (TN)




Sensitivity (“TP rate”) = a/(a+c)

Specificity (“TN rate”)= d/(b+d)

Likelihood ratio + = sensitivity/(1-specificity)

Likelihood ratio – = (1-sensitivity)/specificity

You can then use something like a nomogram to calculate post-test probability. You will need to have an estimate of pre-test probability. Ideally, this will be the known prevalence of the condition.

Or, get yourself an app on your phone like MedCalc3000 https://itunes.apple.com/us/app/medcalc-3000-ebm-stats/id358054337?mt=8
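If you would rather see the arithmetic than trust a nomogram or an app, the formulae above can be sketched in a few lines. The function names and the example counts are made up for illustration. The post-test step uses the usual odds form of Bayes' theorem (convert the pre-test probability to odds, multiply by the likelihood ratio, convert back), which is the same calculation a nomogram performs graphically.

```python
def diagnostic_stats(tp, fp, fn, tn):
    """Sensitivity, specificity and likelihood ratios from 2x2 counts (a=tp, b=fp, c=fn, d=tn)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Note: LR+ is undefined (division by zero) if specificity is exactly 1
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back to probability."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical counts: 40 true positives, 10 false positives,
# 10 false negatives, 40 true negatives
stats = diagnostic_stats(tp=40, fp=10, fn=10, tn=40)
print({name: round(value, 2) for name, value in stats.items()})

# Hypothetical pre-test probability of 50%, followed by a positive test
print(round(post_test_probability(0.5, stats["lr_pos"]), 2))
```

With these made-up numbers, a positive test moves a 50% pre-test probability up to roughly 80%.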

How to turn “Stats” into something useful 2: Interventions  

If a trial or systematic review is reporting DICHOTOMOUS outcomes, we can bring the “research” findings a little bit closer to clinical decision making… “Do you know HOW MANY subjects responded, and HOW they responded? e.g.  how many people in the TREATMENT group got better/worse, and the same for the control/placebo group?”

NO: then you can’t clinically apply findings. Doh.


YES: then go and do some evidence based decision making! Yipee.




Wow, how do we do that? Like this:

1)   Use a 2×2 table (again)

                         +ve    -ve
Control/Placebo group    a      b
Rx group                 c      d




2)   And some simple formulae…

CONTROL EVENT RATE (CER) = number of Control Group people with +ve outcome divided by total number of Control Group people, i.e. a/(a+b)

EXPERIMENTAL EVENT RATE (EER) = same as above for Rx Group, i.e. c/(c+d)

Now that we know the CER and EER, we can do loads of other useful things…

RELATIVE RISK, or RISK RATIO (RR): RR = EER/CER (a RR of 1 means there is no difference between groups; >1 means increased rate of outcome in Rx group, and <1 means less chance of outcome)

ABSOLUTE RISK REDUCTION (ARR): ARR = CER – EER

RELATIVE RISK REDUCTION (RRR): RRR = (CER – EER)/CER

NUMBER NEEDED TO TREAT (NNT): NNT = 1/ARR

Some other stats more USEFUL than “p-values”…

ODDS RATIO (OR): “What are the odds of getting this person better with this treatment?” OR = EEO/CEO, where EEO = c/d and CEO = a/b. The greater above 1, the better.

EFFECT SIZE: This is a standardised, scale-free measure of the relative size of the effect of an intervention. It is particularly useful for quantifying effects measured on unfamiliar or arbitrary scales and for comparing the relative sizes of effects from different studies.

EFFECT SIZE = [(mean score of group 1) – (mean score of group 2)] / SD (of either group, or even pooled data)


EXAMPLE: A study into effects of manual therapy on neck pain measured a Rx group (n=23) and a Control group (n=21), and considered a “cut-off” point for improved ROM as being an increase of at least 20deg rotation (so results can be dichotomised).

Mean Rx score = 28deg (SD 4)

Mean Control score = 16deg (SD 4)

Results were:

                         +ve         -ve         Total
Control/Placebo group    a = 9       b = 12      a+b = 21
Rx group                 c = 18      d = 5       c+d = 23
Total                    a+c = 27    b+d = 17    a+b+c+d = 44

CER a/(a+b): 0.43 or 43%

EER c/(c+d): 0.78 or 78%

RR = EER/CER: 1.83

ARR = CER – EER: -0.35 or 35%

RRR = (CER-EER)/CER: -0.83 or 83% (a minus figure, so this would be RR Increase)

NNT = 1/ARR: 2.86, so say 3.

EEO: c/d = 3.6

CEO: a/b = 0.75

OR: EEO/CEO = 4.8

EFFECT SIZE = (28 – 16) / 4 = 3
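The same sums can be checked with a short script. This is just a sketch of the formulae used in this example, with made-up function names. One assumption to flag: NNT is taken from the size of the ARR (ignoring the minus sign), and because the script uses the unrounded ARR it gives about 2.8 rather than 2.86, which still rounds up to 3.

```python
import math


def intervention_stats(a, b, c, d):
    """Event rates and derived statistics from a 2x2 table.

    a, b = Control group with +ve / -ve outcome
    c, d = Rx group with +ve / -ve outcome
    """
    cer = a / (a + b)
    eer = c / (c + d)
    arr = cer - eer
    return {
        "CER": cer,
        "EER": eer,
        "RR": eer / cer,
        "ARR": arr,
        "RRR": arr / cer,
        "NNT": 1 / abs(arr),  # assumption: use the size of ARR, ignoring its sign
        "EEO": c / d,
        "CEO": a / b,
        "OR": (c / d) / (a / b),
    }


def effect_size(mean_1, mean_2, sd):
    """Difference between group means divided by the standard deviation."""
    return (mean_1 - mean_2) / sd


results = intervention_stats(a=9, b=12, c=18, d=5)
for name, value in results.items():
    print(name, round(value, 2))

print("NNT rounded up:", math.ceil(results["NNT"]))
print("Effect size:", effect_size(28, 16, 4))
```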


The clinical  story then…

“So, if untreated, my patient would have a 43% chance of getting better anyway.  But if treated, his chance of improvement would be 78%.  He is 1.83 times more likely to improve if I treat him. The absolute benefit of being treated would be 35%.  The treatment would increase the chance of improvement by 83%. I would need to treat 3 people (in the period of time relevant to the study) to achieve 1 positive outcome. The odds of him getting better with no treatment are 0.75, whereas if I treat him, the odds are much better, at 3.6. The odds are 3.6:0.75 (i.e. 4.8) of him improving with treatment.”

However… From a clinical reasoning perspective, we still need to understand what “43%, 78%, 35%, etc etc” MEANS… and that’s where the real fun starts:-)

Here’s a little video for y’all: http://www.youtube.com/watch?v=tsk788hW2Ms


Should cervical manipulations be abandoned?


This is really for physiotherapists, but please feel free to read whoever you are.

The Chartered Society of Physiotherapy (CSP) today reported on a British Medical Journal (BMJ) article by Wand et al about the practice of cervical manipulations.

The news report can be found here, and the original article can be found here.

There is a formal rapid response to the paper found here, which is from a group of physiotherapists.
This brief blog represents my opinions and not necessarily those of the co-authors of the response.

Wand et al conclude with a proposition that “manipulation of the cervical spine should be abandoned”, and call on professional bodies to adopt this stance as a formal policy. This is a strong conclusion which you would hope would be based on firm and conclusive data.  However, this is not the case. The data on risk, by self-admission of the BMJ paper authors, is inconsistent and provides nothing but uncertainty. The best of the data would suggest that the rate of adverse events from manipulation is extraordinarily low. The benefits of the intervention are, at worst, comparable with alternatives. The maths is simple at this level. Extremely low risk versus likely benefit. If the logic used to argue for the abandonment of manipulation in the BMJ paper was taken seriously (which is very unlikely), then most interventions in medicine and allied therapies would be scrapped. Take, for example, oral contraceptives: risk of thrombosis (much higher than manipulation risks) versus comparable benefit.  Let alone non-medical interventions, like hotel hair-dryers, coffee, pencils, and tampons.

Anyway, these are old and tired arguments and not really central to the issues of the current paper and news story. This isn’t about manipulation. Personally, I have no interest in manipulations per se other than wondering, like anything health professionals do, whether it is of likely benefit for patients.  The concept of manipulations is, however, for some bizarre reason, the source of disproportionate concern and emotion.  And it seems it is this that continues to drive the ‘manipulations’ debate. The evidence is merely (mis)used as a tool to support some fixed, pre-established view-point. The ‘Head to Head’ stand-off in today’s BMJ was contrived and non-informative to all involved.  In some ways, the authors were victims of some mis-guided editorial whim.

The real issues at hand though are deeper. This is about the fervent over-enthusiasm to drag something meaningful out of meaningless data in attempts to appear scientific and contribute to evidence-based practice, whilst paradoxically being pseudo-scientific and missing the point of evidence-based practice.  The potential harmful effects on associated professions and their patients are irreparable.  Physiotherapy is slowly but surely becoming stripped of all its worth.  A hundred-plus years of development and progress based on logic, intellect and science is undergoing painful erosion.  Interventions once the realm of the profession are being thrown out and taken up by others: exercise prescription, manual therapy, electro-therapy. We are becoming experts in handing out advice pamphlets. This is NOT because the interventions are ineffective. It is because they are deemed to have questionable efficacy based on sorry scraps of ‘evidence’ salvaged to adhere to the rhetoric of evidence-based practice.  Remember, most research findings are false (J.P.A. Ioannidis, PLoS Med. 2, e124; 2005), and this is more-so the case in physiotherapy (R. Kerry et al J Eval Clin Prac).

Very rarely is data considered in a scientific manner: in context of a priori beliefs; in context of the professional background for which it is intended; in the context of the dynamism of scientific discovery; in context of what we understand by cause and effect; in context of individual patients. Today’s claim of “abandoning manipulations” is simply another ludicrous reminder of the state we are in.  The CSP’s reporting is another example of a body blind to the deterioration of its very own profession.



Kevin Kelly responded:
I am in total agreement. It is unfortunate that the drugs industry is a multi-billion pound industry with tremendous lobbying power….any treatment that reduces the amount of drugs prescribed and helps patients should be used not criticised.The cherry-picking of poor quality research needlessly raises alarm in patients and does little to help the people suffering from neck pain and headaches to choose the most appropriate treatment.Neck manipulation has been shown to be safe and effective and benefits thousands of people suffering from neck pain and headaches. In fact, the risk of a stroke after treatment is the same whether you see a GP and get a prescription or see a chiropractor and get your neck adjusted.http://www.ncbi.nlm.nih.gov/pubmed/18204390

Manipulation of the neck is at least as effective as other medical treatments and is safer than many of the drugs used to treat similar conditions. http://www.ncbi.nlm.nih.gov/pubmed/17258728

11 months ago reid_heather (Twitter) responded:
Hear, hear. As a lecturer and teacher of spinal manipulation for 20 years, one can only assume that such articles are fuelled by ignorance. Manipulation is a highly effective method of alleviating pain whose cause or source lies within the cervical spine. It is a skill we as physiotherapists should embrace rather than discard through fear. Whilst it has been shown that manipulation of the thoracic spine for some patients with cervical symptoms can be effective, one must note the SOME. If we don’t continue to use these skills we will lose them. I certainly have not been a member of this profession for over 30 years to become reduced to leaflet giving.

I don’t usually have a rant about such matters, but having just revised the history of medicine with my 16-year-old daughter, it seems that myths about health can be so powerful and disabling that they can prevent further progress. Whilst I do not suggest we reconsider Hippocrates’ take on the four humours and the balance of opposites, he did give us a philosophical base to progress, whilst others thought illness was due to the anger of the Gods or evil spirits.

Benefits can be risky, however physiotherapists are also highly qualified in assessing the risk, and would only act when the risk is absolutely minimal.

11 months ago TaylorAlanJ (Twitter) responded:
So it all becomes clear! The BMJ have just made the web pages ‘pay per view’ . . . So it truly was a cynical marketing ploy that some folks fell for . . . Hook, line and sinker! But where were the CSP when we needed them? . . . Sucking it all in and regurgitating it for good measure. None of this is helpful to a dying profession! It is ALL cumulative . . . In the last 2 months GPs have been given the impression that Whiplash injury doesn’t exist and now manipulation is harmful . . .

Who is standing up??

I have published my views in more detail on http://alteredhaemodynamics.blogspot.co.uk/


11 months ago Howard Turner responded:
Roger: beautifully put and a grave concern; there is no way to reverse this agenda if it goes too far. Has it gone too far already? My jubilee was a very wet tent in Wales – excellent, thank you!
11 months ago Roger Kerry responded:
Hi Howard
Thanks for your supportive comments. Has it gone too far already? I think we are starting to see a generation of ‘de-skilled’ practitioners coming through. If there was overwhelming evidence to confidently refute skill-based interventions, this would be fine, but there’s not. Hope your tent has dried out – Stick to leafy Cheshire!
10 months ago Steve Robson responded:
Roger makes many good points here about the erosion of skills within physiotherapy. This discourse regarding the safety of manipulation has come and gone many times over the years without resolution. But are we overlooking some fundamental issues here? Before those interested parties within physiotherapy once again ‘lock horns’ in an intra- and inter-professional struggle to retain their status as manipulators, what is it we as a profession feel is so valuable about manipulation? In circumstances of patient-centred treatment and evidence-based medicine, any treatment or mode of clinical management logically has to engage with the biopsychosocial status of each patient.
In terms of clinical reasoning, isn’t it essential that we attempt to answer some fundamental questions central to the use of manipulation? After all, decisions based on the use of manipulation as a technique come before an estimation of its safety is even necessary. From this perspective:
(1) Essentially, what is manipulation? (A definition, if you like.)
(2) Why are we using it? In other words, what do we hope to achieve by using manipulation?
(3) What are the mechanisms of manipulation, or quite simply, how does it work?

The answers to the above should drive the clinical reasoning process and, as such, what is best for the patient. Without this information, evidence-based clinical reasoning is not possible.
I would be genuinely interested to hear answers to the three questions above from those of us in physiotherapy who use manipulation.

Essentially, without this information the whole discussion regarding the use of manipulation at all is null and void.

10 months ago Roger Kerry responded:
Thanks for these comments Steve.
I agree that there are fundamental questions to be asked about all aspects of practice. Perhaps reducing the discussion to this level will help in understanding precisely what we are aiming for in clinical practice and research. Did you use to do an MT course exploring these issues?
10 months ago Alan responded:
I am a physiotherapist myself and it seems more and more that in order to continue with manual therapy in Britain I should have qualified in osteopathy or another related profession. It is no wonder that most of the private practice jobs in this country are being filled with our colleagues from the southern hemisphere, renowned for their manual therapy skills. I worry that the UK-trained physiotherapist will be having more of an identity crisis in the years ahead!


Scientific Error and the Real-World

UPDATE 12th August 2013: The paper underpinning this blog has just been published proper. Here is the pdf if you are interested:

JECP -Truth paper 2013 – FINAL



IMPORTANT NOTICE: This blog relates to two academic papers published today. One is a paper on which I am the lead author. These comments are entirely my own and do not necessarily reflect the thoughts and opinions of my wonderful co-authors or the University of Nottingham.

There is a fascinating emerging phenomenon in the field of science: that science might be prone to systematic error. At least, there seems to be more attention to this of late. Scientists have always been aware of the potential for error in their methods. However, today is special in relation to this subject (for me anyway) because first, there is an enlightening World View article published in Nature on this matter by Daniel Sarewitz (D. Sarewitz Nature 485, 149), and second, a small team from the University of Nottingham have had published a research paper on this very subject (R. Kerry et al J Eval Clin Prac) (pdf above).

Sarewitz nudges towards a dimension of this phenomenon which is of utmost interest to me and my field of science, health science. That is, that the systematic error observed in scientific method seems to be revealed only, or at least best, when the science is placed in the context of the real-world. In health science we work in a framework known as evidence-based practice (EBP), and this is a living, object example of what Sarewitz is referring to. EBP is solely concerned with the integration of scientific findings from rigorous research processes into the shop-floor decision-making of health care professionals. So is scientific error witnessed in EBP? If so, how does that affect patient care? These are big questions, but here are some thoughts on how their answers might be informed.

First, what does the state of health science look like with regard to this error? John Ioannidis’s high-profile mid-noughties reports on the phenomenon, e.g. ‘Why Most Published Research Findings are False’ (J.P.A. Ioannidis PLoS Med. 2, e124; 2005), caused turbulence in the field of health science, mostly medicine. He provided evidence of systematic bias in ‘gold-standard’ scientific practices. Our paper published today supports these findings: gold-standard research methods are not reliable truthmakers. But this is only the case when ‘truth’ is defined as something outside of the methodological frameworks. We tried to find a definition which was as real-world as possible, yet as tightly related to the scientific methods as possible, i.e. conclusions from systematic reviews or clinical guidelines. Right up to this point, the science looks good: tight control for bias, well-powered, apparently externally valid. However, the moment you step out of the science itself, things look very different. We found that in fact a single controlled trial’s internal markers of bias control gave no reliably meaningful indication of its truthfulness. So although a trial looks great for internal validity, this quality does not translate to the outside world. We are not the first to question the value of markers of bias control for real-world applicability.

Sarewitz states: “Researchers seek to reduce bias through tightly controlled experimental investigations. In doing so, however, they are moving farther away from the real world complexity in which scientific results must be applied to solve problems”. Voilà. The paradox is clear: the tighter trials get for controlling for bias, the less relevance they have to real-world decision making. Sarewitz also suggested that if biases were random, multiple studies ought to converge on truth. Our findings showed that in the trials examined, throughout time (and given that more recent trials tended to be the higher quality ones), study outcomes tended to diverge from the truth. So, the most recent and highest quality trials were the worst predictors of truth.
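The point about random biases can be made concrete with a toy simulation. This is only an illustrative sketch, not the method used in our paper: the effect size, the bias values and the function name `pooled_estimate` are all made-up assumptions. It shows that when each study's error is random around zero, pooling many studies lands close to the true effect, whereas a shared systematic bias survives any amount of pooling.

```python
import random
import statistics


def pooled_estimate(true_effect, n_studies, bias_mean, bias_sd, seed=0):
    """Average the results of n_studies hypothetical studies.

    Each study reports the true effect plus an error drawn from a normal
    distribution with mean bias_mean and standard deviation bias_sd.
    bias_mean = 0 models purely random bias; bias_mean != 0 models a
    systematic bias shared by all studies. All values are illustrative.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = [
        true_effect + rng.gauss(bias_mean, bias_sd) for _ in range(n_studies)
    ]
    return statistics.fmean(estimates)


TRUE_EFFECT = 1.0  # assumed "truth" for the toy example

# Random bias: errors cancel out as studies accumulate.
random_bias = pooled_estimate(TRUE_EFFECT, 5000, bias_mean=0.0, bias_sd=0.5)

# Systematic bias: every study is pushed the same way, so pooling cannot fix it.
systematic_bias = pooled_estimate(TRUE_EFFECT, 5000, bias_mean=0.3, bias_sd=0.5)

print(f"pooled estimate, random bias:     {random_bias:.3f}")
print(f"pooled estimate, systematic bias: {systematic_bias:.3f}")
```

In this toy set-up the first pooled estimate sits near the assumed true effect and the second sits near the true effect plus the shared bias, which is the sense in which convergence is only expected if biases are random.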

There are strong scientific, professional, educational and political drivers surrounding this issue: funders base their decisions on proposals that show greatest rigor; health scientists get better at constructing trials which are more likely to establish causal relationships (i.e. control better for bias); journals insist on trial adherence to standards of bias control; scientific panels for conferences examine abstracts for bias control; students are taught about evidential hierarchies and why highly controlled studies sit at the peak of health science; health care commissioners seek to make purchasing decisions on sight of the highest-quality evidence.

However, all may not seem so bleak. There are a couple of movements which initially appear to offer some light. First, there are the attempts by researchers to make their trials more representative of the real-world: for example, the interventions under investigation being more relevant to common practice and mechanistic principles; the trial sample being more representative of the population; outcomes being more meaningful. Second, Universities and funders are becoming more concerned with ‘knowledge transfer’ methods, the idea being to seek ways to get the great internally valid research findings into real-world practice. It is clear that these two strategies are missing the point. More representative trials are still going to be confined to internal constraints optimising internal validity. If not, their defining characteristic – to establish causal relationships – will be diminished. It seems a poor situation: you’re damned if you do, you’re damned if you don’t. Knowledge transfer strategies are, in fact, at risk of further exaggerating the asymmetry between research findings and real-world practice. “Let’s just push our findings harder onto society”.

There is no quick, easy solution. However, Sarewitz alludes to the key for potential advancement with regard to this phenomenon: real-world complexity. This won’t go away. Until the real-world is ready and able to absorb abstract knowledge, it is unlikely that simply improving the internal quality of science will make any difference. In fact, it could serve to harm. The relationship between science and society needs to become more symmetrical. Who knows how this can happen? Examples as cases in point might be that the nature of ‘bias’ needs to be re-considered. Or our notion of what is understood by causation needs investigating. The causal processes in the real-world might be very different to those observed in a trial, despite the “external validity” of a trial. Even if the real-world could be captured in a trial, the moment the trial finishes, that world might have changed.

Today is a great day for science, and a great day for the real-world – as is every day. Let’s keep science real.

Posterous comments:

Interesting blog, thanks! In the 4th para, second-last line, you seem to be suggesting by ‘study outcomes tended to diverge from the truth’ that this is one truth? Are you meaning to say this? I would have thought you would only be able to observe the results diverging and not agreeing.
6 months ago Roger Kerry responded:
Hi Nikki
Thanks for your comment. OK, so I am using the word “truth” to mean “scientific truth”, i.e. the outcome of scientific investigations. Of course this is variable over space-time, but I am premising this on the broad notion that the core activity of science is to reach some sort of consensus. E.g. physics edges towards an understanding of the origins of the universe. What I mean by “diverge” is that looking BACKWARDS from what we know today as the “truth”, it seems like there is no pattern to how scientific studies relate to this. I think if we were looking FORWARD we can talk in terms of “agreeing”, e.g. multiple studies seem to be agreeing with each other, and the outcome of this agreement could be called scientific truth. If this purpose of science is accepted, then when looking BACKWARDS you would expect a clear pattern whereby studies, particularly the best quality studies, converged towards the “truth”. So, how did we get to this “truth” which doesn’t relate to trial outcomes? We defined it in numerous ways, e.g. systematic review outcomes, clinical guidelines, totality of epidemiological + mechanistic evidence. What was clear is that no matter how you define “truth”, a “scientific” progression / convergence cannot be identified in RCT findings, i.e. RCT outcomes are random, statistical accidents. Assuming our method is valid, I see a number of explanations / ways out of the problem: 1) you simply can’t/shouldn’t attempt to define “truth”: you just roll with study outcomes day-by-day and don’t worry about it; 2) it is not the purpose of science to edge towards truth; 3) truth is constructed, not discovered; 4) health research is not a scientific activity; 5) the purpose of health research is justification, not discovery. I think you would have trouble accepting 1) and 2). Researchers won’t like 3) and 4), which leaves 5). I think this is the best position for health research to hold.
However, if this is the case then RESEARCH SHOULD NOT BE USED TO INFORM PRACTICE OR PREDICT OUTCOMES. It should be purely a way of providing evidence for something that happened (in the past), like a certificate is evidence of attending a CPD course, but it does not predict future attendance. This would appease commissioners / our own agenda etc, but turns EBP into rhetoric, and not a science-based activity. Of course I agree that there are many truths, but I have focussed on an interpretation of scientific truth here. Apologies for rambling! Hope all is well

