Bayes and Belief: How pre-review belief influences critical appraisal. St Emlyn's

Over the last year or so I’ve been reflecting on why there is so much disagreement about how influential new evidence should be on clinical practice. COVID-19 has shown that the thresholds for what we believe, and when, vary considerably between organisations, but in truth the point at which we personally make a decision also varies enormously between individuals.

One way to think about this is in relation to the adoption curve that I suspect you are very familiar with (Rogers’ bell curve) as shown below.

https://en.wikipedia.org/wiki/Technology_adoption_life_cycle

However, I increasingly think that the idea that there are different sorts of people who have innate preferences for adoption does not hold consistently for a particular decision or individual in medicine. Rather, our decision points for change are personal thresholds based upon our pre-test belief systems. In other words our approach to change is driven not so much by the weight of evidence viewed in an objective and independent way, but rather by how much that evidence shifts our pre-test belief about whatever it is that we are looking at. This will inevitably differ depending on what question is being considered.

Decisions as probabilities.

Think about what makes you change your practice. Consider a change that you’ve made in the last year to maybe use a different drug or technique. When you made that decision how certain were you that it was the right thing to do? Were you 100% sure and with no doubt whatsoever that you were right? I suspect not (or at least I hope not), as it’s wise to be sceptical in medicine as we know that evidence changes over time and that what we might think of as the gold standard of practice today might well change as new evidence comes to light (a technique also known as ‘science’).

As an example think about the use of Verapamil vs. Adenosine for the treatment of SVT in the emergency department. Twitter will give you many examples in 2020 of people changing their minds about which to use first, but an objective assessment of those arguments would conclude that there is relatively limited evidence. For the record I’ve been a Verapamil first person for many years following a thread and podcast from these Australian folks on the RAGE podcast. My practice is based on the fact that I ‘believe’ this is the right choice due to a complex interaction of the evidence, my experience in practice, and my personal innate tendency to innovate and to challenge the status quo. The point being that it is not an entirely rational or unemotional decision. My ‘belief’ is that on the balance of probabilities Verapamil is a better drug for my patients, but that belief is a complex construct that extends beyond the published evidence.

My decision is also based upon what my previous beliefs were before I thought about change. My clinical experience has shown me that Adenosine is an effective drug but I get quite upset about how unpleasant my patients find the experience of receiving the drug. I am therefore in a position where I am keen to change, and therefore the weight of evidence required to make me change will be relatively modest. This was not always the case though. In my earlier career I was quite convinced that Adenosine was the drug of choice as that’s what my peers and bosses used, and therefore it would have taken more evidence to shift my position. I also found the fact that Adenosine works by effectively inducing asystole somewhat amazing. It is an exciting and remarkable drug to use in practice, but it’s also an awful experience for many patients.

This concept of being ‘ready’ for change has a sound basis in educational theory and can be related to the Lewin change cycle, which argues that in order to change we need to be ‘unfrozen’ from our past beliefs, make the change, and then ‘re-freeze’ into a new style of practice.

Change is thus dynamic and influenced by pre-change beliefs. We can only change if we are in the right frame of mind to accept the idea of change. Similarly, if we are in a frozen state of thinking then change is going to be very difficult to achieve. I suspect that we have all come across individuals and ideas that are seemingly permafrost in this regard.

Beyond the dichotomy of freezing.

Whilst I really like the Lewin change cycle as a concept I do find that it is rather dichotomous. The idea that we are either ready to accept new ideas or not, as if these were binary states, does not fit well with the complexity of the worlds that we live in and the decisions that we make.

It may be more interesting to take a Bayesian approach to change, and in 2020 we’ve had some great examples of how this can work in practice.

For the purposes of this blog I’m going to use TXA as an example as it’s a great case study that is quite recent and well discussed in the #FOAMed world, although I think there are lots of other examples out there too. This blog is not about defending my position on TXA, but rather my thoughts on what influences my decisions.

TXA and head injury

In 2020 we saw the publication of the CRASH-3 study. It’s the largest study of TXA in head injury that we’ve ever seen and that we are ever likely to see. My conclusion was that on the balance of probabilities we should use TXA in moderate head injury as I ‘believe’ that the weight of evidence is such that it is the right thing to do. I’m happy to say that I regularly prescribe it in the ED and that I have no qualms about doing so. I ‘believe’ that I am right in this decision.

However, some of the brightest, most able, well read, well educated and widely published EBM nerds (a compliment in this setting) disagree with me. Whilst I can call on many allies to support my view, there is not even unanimity within the St Emlyn’s group, where some colleagues remain sceptical about whether we should use TXA in head injury.

How can it be that individuals who all subscribe to the idea of evidence based medicine, critical appraisal and scepticism can come to quite different conclusions about the same evidence?

At this point I would mention my sceptical friends Justin Morgenstern and Ken Milne, both of whom have looked at the data, as we have, and yet have come to very different conclusions to me in their own posts (linked below). We all agree on the trial data, we’ve all looked at the same words, numbers and graphs and we even agree on where the doubts and flaws are, but when it comes to making a decision as to what to do then we politely disagree. How can this be and why does it happen? My thoughts are below in relation to CRASH-3, but I am only using this trial as a well known example. The ideas here are applicable to many other papers and situations. It’s worth pointing out very clearly that this blog is not a criticism of others’ views on TXA or any other topic. Justin and Ken’s positions are just as valid as mine and I totally respect their wisdom on this. Over time I’ve listened and learned a lot from them on this topic and more, and in truth my enthusiasm for TXA in head injury has waned a little since we first published on it (although we are still using it here in Virchester).

It’s well worth reading Justin and Ken’s posts on TXA as they are excellent summaries of the evidence and great examples of how to perform critical appraisal well. You will find little to argue with there. They (and many others) have come to the conclusion that the result is hypothesis generating (it is) and that we should be very sceptical about it (and they are right in this), but when the crunch question of whether we should use it in practice comes along we have gone in slightly different directions.

CRASH 3: TXA is no wonder drug

SGEM#270: CRASH-3 TXA for Traumatic Head Bleeds?

Is it about critical appraisal techniques?

An obvious thought would be to question whether those with different conclusions have read different evidence, or misinterpreted the facts, but this does not appear to be the case. From what I have read in blogs and tweets and heard in podcasts there is little if any disagreement about the trial methods, analysis or results when interpreting trials such as CRASH-3. The trials are well reported and there is a reasonable data trail to follow the analysis and evolution of the final manuscript.

The trial is really interesting in that the sub-group analyses are hotly debated as to whether they are valid outcomes for many reasons, but notably because they do not appear to be specified in the original statistical management plan. This is clearly an issue, but the interpretation of this, and the subsequent ‘beliefs’ that follow differ. There are clearly other issues in the trial but for the purposes of today I’m going to focus on this issue around the interpretation of sub-groups and the primary outcome of death as a result of head injury at 28 days.

The bottom line is that there appears to be little if any difference in how different appraisers find and record the data. The facts are there in the manuscript and we all agree on them. So it’s not really about the main part of critical appraisal. It must be something else.

Pre-test beliefs and Bayes in the interpretation of trial data.

Whenever you or I read a paper in our field of practice we come to it with some prior knowledge and beliefs about the topic. This is, of course, a good thing and fits with the idea of being a constructivist learner. Prior knowledge and understanding helps us interpret new data in the context of what is already known and that helps us decide what to believe and when to change.

In this context I would argue that prior beliefs are one of the most important factors in deciding whether to change at all.

Let’s return to the TXA example.

If I return to the time before CRASH-3 was published what would my pre-test belief system look like? The important point here is that these are largely ‘beliefs’ and interpretations rather than hard facts. As such you may consider these as exerting bias even before reading the manuscript.

I have used TXA for bleeding trauma patients since CRASH-2 was published (a trial that included UK patients)
TXA is a key performance indicator in our trauma systems
I don’t believe I’ve ever seen any evidence of harm from TXA in trauma patients
I have read about and understand the mechanism and know a little about acute traumatic coagulopathy/trauma induced coagulopathy.
I have friends such as Prof. Karim Brohi, who is one of the lead proponents of TXA in trauma, and I’ve met many other UK researchers in the field (e.g. Ross Davenport, Ian Roberts). I’ve had time to chat to them about coagulopathy and my views are undoubtedly influenced by their teaching and wisdom.
I have watched this awesome SMACC video from Karim Brohi
I have been a principal investigator on other TXA trials (e.g. HALT-IT).

The bottom line here is that pre-publication of CRASH-3 it is likely that I would need little additional evidence to make me believe that TXA works in a traumatic bleeding patient population.

Contrast that with what I imagine was in the minds of some of the TXA sceptics. Not so much the Kens and Justins of this world, but some of the comments on Twitter regarding this trial, and those from other contributors, were really quite extreme.

Often from North America (which did not recruit to CRASH-2 and only in Canada for CRASH-3) so the trials were not felt to be relevant to local systems and practice
TXA has not been a standard of care for trauma in many areas of the world
Some very vocal anti-TXA media from North American clinicians (not Ken and Justin who have always sought a balanced and evidence based approach)
Concern that TXA may become a routine treatment and thus a key performance indicator on what they believe is low quality evidence

This is why I hate the TXA literature. The hype around this drug is insane.
— EM (@EM_phile) October 15, 2019

So in contrast to my world, where my pre-publication belief was that TXA will probably work, there are others who will have a pre-publication set of beliefs that were in the opposing direction.

Bayes and belief

We have talked about Bayes theorem in respect to diagnostics and in that regard it’s very effective. We calculate a pre-test probability of disease and then use that in conjunction with the evidence to reach a post test probability of the patient having the disease in question.

The same may be true in the interpretation of new evidence. If someone has a strong belief that a new treatment does not work then it is going to take a LOT of evidence to change their minds. Similarly if they already believe in the therapy then either a great strength of evidence is going to be required to stop them using it, or little effort is required for them to believe in a new related use.
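As a rough sketch of the arithmetic behind this idea, the odds form of Bayes theorem used in diagnostics can be written in a few lines of Python. The function name, the three readers and every number here are hypothetical and chosen only for illustration; in particular, treating the strength of a trial as a single likelihood ratio is an assumption made for the sake of the example, not an established method of appraisal.

```python
def post_review_belief(pre_review_belief: float, likelihood_ratio: float) -> float:
    """Update a belief (as a probability) using the odds form of Bayes theorem.

    pre_review_belief: probability, strictly between 0 and 1, held before reading.
    likelihood_ratio: illustrative strength of the new evidence
                      (>1 supports the treatment, <1 counts against it).
    """
    if not 0.0 < pre_review_belief < 1.0:
        raise ValueError("pre_review_belief must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")

    pre_odds = pre_review_belief / (1.0 - pre_review_belief)  # probability -> odds
    post_odds = pre_odds * likelihood_ratio                   # Bayes update
    return post_odds / (1.0 + post_odds)                      # odds -> probability


# Three hypothetical readers see the same evidence (an invented likelihood ratio of 3).
readers = {"believer": 0.70, "neutral": 0.50, "sceptic": 0.10}
for name, prior in readers.items():
    posterior = post_review_belief(prior, 3.0)
    print(f"{name}: before {prior:.2f}, after {posterior:.2f}")
```

With the same evidence the sceptic ends up at roughly one in four and the neutral reader at three in four, which is the point of the section: identical data, different post-review beliefs.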

I like to think of this as a balance between the strength of prior belief and the introduction of new evidence into that belief system.

Many of us like to think that when approaching a decision we arrive with a neutral perspective. The decision as to whether to change or not change is not pre-determined, not influenced by prior beliefs, or if so then these are in equal measure. It might look a little like the figure below.

In reality we almost always come to a decision with past experience, knowledge, wisdom and thoughts. And so if I were looking at a topic such as TXA in head injury I am likely to come to the decision with a pre-loading of beliefs that are going to influence my thinking, as in the figure below, with a small but discernible a-priori loading to believe evidence that supports my initial view.

Thus when new evidence comes along it takes relatively little to confirm that the treatment is indeed effective (in keeping with my past view) OR a considerably stronger result to refute that belief.

Clearly a similar issue arises when someone has an a-priori belief that a therapy does not work. It will take a disproportionate amount of evidence to move them into the change/adoption area of belief.
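To put some purely illustrative numbers on that asymmetry, the sketch below works out how strong the evidence would need to be (expressed as a likelihood ratio) to carry a reader from their starting belief to a threshold at which they would adopt a treatment. The function name, the 80% adoption threshold and the starting beliefs are all assumptions invented for the example; none of them come from CRASH-3 or any other trial.

```python
def evidence_needed(pre_review_belief: float, adoption_threshold: float) -> float:
    """Likelihood ratio required to move a belief up to the adoption threshold.

    Both arguments are probabilities strictly between 0 and 1.
    A result of 1 or less means the reader is already at or above the threshold.
    """
    for p in (pre_review_belief, adoption_threshold):
        if not 0.0 < p < 1.0:
            raise ValueError("probabilities must be strictly between 0 and 1")

    pre_odds = pre_review_belief / (1.0 - pre_review_belief)
    threshold_odds = adoption_threshold / (1.0 - adoption_threshold)
    # Posterior odds = prior odds x likelihood ratio, so solve for the ratio.
    return threshold_odds / pre_odds


# Hypothetical starting beliefs, all aiming for an invented 80% adoption threshold.
for name, prior in {"believer": 0.70, "neutral": 0.50, "sceptic": 0.10}.items():
    print(f"{name}: needs a likelihood ratio of about {evidence_needed(prior, 0.80):.1f}")
```

Under these made-up numbers the believer needs evidence less than twice as likely under benefit as under no benefit, whereas the sceptic needs evidence around thirty-six times as likely. The exact figures mean nothing; the disproportion is the point.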

We can express this using a nomogram as shown below. At the current time there is no real way to quantify the axes in the way that a likelihood ratio calculator might, but I hope it gives an illustration of how pre and post review beliefs might interact.

In the diagram above the Green arrow represents someone who has strong pre-review beliefs that a treatment will work and continues to do so as the evidence for change is weak. The Blue arrow represents a reasonably sceptical perspective that is unchanged by mediocre evidence and the Orange arrow represents someone who does not believe that the treatment works. Three different individuals coming to different conclusions regarding post review beliefs and thus change.

Perhaps more interestingly we can run a similar thought experiment with the same person (so the same pre-review belief) but with different levels of evidence.

In this example the Blue and Orange arrows show that as individuals we can and do use evidence to change our decisions (which is of course quite reassuring).

What is certain is that our pre-review beliefs have an influence on our post-review beliefs. If you’re like me then you will recognise that we need to reduce bias, and that by recognising this influence we should aim to mitigate it.

We should also be mindful in this specific TXA example of this recent teaching from Ken Milne in respect of subgroup analyses.

Correct answer was ZERO percent based on Wallach et al JAMA Intern Med 2017 #EBM #FOAMed https://t.co/RYQPpvWx5b pic.twitter.com/Y7hVHN0H2o
— Ken Milne MD (@TheSGEM) April 7, 2021

From Bayesian perspective on subgroups my pre-test probability that they will attempt replication of CRASH-3 is low and that replication will confirm subgroup finding is very low. https://t.co/tEWP8drdZE pic.twitter.com/S83KkQypGk
— Ken Milne MD (@TheSGEM) April 22, 2021

How do we reduce the influence of self on decision making and change?

My first thought is that we acknowledge it. We accept that our own erratic, biased, invested viewpoints are important starting points when critically appraising a topic we are interested in. Perhaps we should read the aims of a study and then write down how strongly we think the null hypothesis will be found to be true before reading on. Such a task would focus the mind on our a-priori beliefs and biases. Something like the visual analogue scale below might work.

And then there was epistaxis.

Since CRASH-3 we have also seen the results of the NoPAC trial of epistaxis management with TXA. My view is that this trial showed no additional benefit to TXA as a topical treatment after basic measures had failed. I had imagined that this would lead to it being abandoned as a treatment, but I was surprised to read so many colleagues (in my timeline mostly North Americans) doubting the results and advocating continued use. The same issue seemed to be arising: pre-publication beliefs that it worked were so strong that even the NoPAC trial (which was very negative) was not enough to counter that view. Contrast @EMSwami advocating a reappraisal with other authors below.

TXA wasn't better than placebo in epistaxis in this well done RCT https://t.co/qA3yEXyvaM

Excellent work @adamreuben + team

Many of us (myself included) believed TXA a game changer in epistaxis

This trial must make us reexamine our practice pic.twitter.com/5QQsihlGC6
— Anand Swaminathan MD MPH 🏳️‍🌈🇺🇦 (@EMSwami) February 20, 2021

I’m not sure one study is practice changing, practice questioning? Sure. But I’d stop short of saying that TXA isn’t recommended for epistaxis after one (albeit decent) study.
— Howie Mell's former account (@DrHowieMell) March 1, 2021

Final thoughts

The decision to make a change in practice is only partly based on the quality of published evidence. The individual clinical decision, and thus the treatment given to the patient, is also influenced by our a-priori beliefs about the effectiveness of the therapy. All clinicians should be mindful of their own biases when interpreting the evidence base, and when subsequently making clinical decisions about whether to act upon it.

What this means for me is that whenever I read a paper, the question I should ask of myself is whether or not I am carrying personal beliefs that will bias my decision making (especially if I disagree with someone else). Hopefully this exercise will help me be a better clinician.

@EMManchester

References

Simon Carley, “Educational theories you must know. Kurt Lewin change cycle. St.Emlyn’s,” in St.Emlyn’s, November 4, 2015, https://www.stemlynsblog.org/better-learning/educational-theories-you-must-know-st-emlyns/educational-theories-you-must-know-kurt-lewin-change-cycle-st-emlyns/.
Richard Carden, “I’m all about the Bayes, ’bout the Bayes, no treble,” in St.Emlyn’s, August 14, 2016, https://www.stemlynsblog.org/im-bayes-bout-bayes-no-treble/.
Simon Carley, “Educational theories you must know: Constructivism and Socio-constructivism.,” in St.Emlyn’s, November 4, 2015, https://www.stemlynsblog.org/better-learning/educational-theories-you-must-know-st-emlyns/educational-theories-you-must-know-constructivism-and-socio-constructivism/.
Chris Gray, “JC: Halt! It’s not time for TXA! Or is it? HALT-IT results at St Emlyn’s,” in St.Emlyn’s, June 23, 2020, https://www.stemlynsblog.org/halt-it-st-emlyns/.
NICE. Major Trauma Service Delivery. UK. https://www.nice.org.uk/guidance/ng40/evidence/full-guideline-pdf-2313258877
Justin Morgenstern, “The CRASH-2 trial (a review)”, First10EM blog, February 17, 2020. Available at: https://first10em.com/the-crash-2-trial/.
How CRASH-2 got it wrong – A response by Dr. Mark Walsh http://maryland.ccproject.com/2013/10/01/crash2-wrong-review-dr-mark-walsh/
Ken Milne: SGEM#80: CRASH-2 (CLASSIC PAPER) https://thesgem.com/2014/06/sgem80-crash-2-classic-paper/
Ken Milne: SGEM#270: CRASH-3 TXA FOR TRAUMATIC HEAD BLEEDS? https://thesgem.com/2019/10/sgem270-crash-3-txa-for-traumatic-head-bleeds/
Justin Morgenstern, “CRASH 3: TXA is no wonder drug”, First10EM blog, October 28, 2019. Available at: https://first10em.com/crash-3/.
Simon Carley, “JC: Tranexamic Acid (TXA) in Head Injury. The CRASH-3 results. St Emlyn’s,” in St.Emlyn’s, October 14, 2019, https://www.stemlynsblog.org/jc-tranexamic-acid-txa-in-head-injury-the-crash-3-results-st-emlyns/.
Brohi on TXA in trauma. SMACC conference. https://smacc.net.au/2015/10/karim-brohi-on-tranexamic-acid-in-trauma/

Kirsty Challen

April 27, 2021 at 9:39 am

A thought experiment as a follow-on from your visual analogue scale:
1. Read the aims of the study and establish your pre-test probability (based on previous literature etc).
2. Read the methods and decide if they are fit for purpose and if they are generalisable to your population.

Then, and then only
3. Read the results.

If 3. surprises you, or makes you uncomfortable, restrain yourself from revisiting 2. to find extra flaws…….