Many thanks to the Twitter FOAMites for alerting us to an online-first paper in the NEJM on the use of therapeutic hypothermia post cardiac arrest. If you’ve followed St.Emlyn’s for any time then you will know that @drgdh has a bit of an interest in the subject. Having looked at the previous literature, we felt that patients should be cooled.
That’s where the evidence stood until yesterday, but one of the great things about emergency medicine/critical care is the realization that half of what you know is probably wrong; you just don’t know which half. This paper in the NEJM challenges the previous doctrine of cooling our post cardiac arrest patients to a target temperature of 32-34C, and the more recent suggestions that we should achieve this as fast as possible – even during cardiac arrest itself.
The Intensive Care Network have declared it a superb study and have suggested that we need to reassess our cooling strategies. Click on this link and have a read of what they have to say. I agree with most of it…but I think they and others may have taken the evidence one step too far.
The twittersphere has gone a bit nuts about this too….
I could go on…., but it seems that on the basis of this trial the ice in the ICU freezer is being taken out and defrosted.
So what’s changed then? Well, there were still some doubts about how hypothermia works. In some of the past RCTs the control group contained patients with fevers…., so was hypothermia acting by cooling the patient, or by preventing hyperthermia?
As uncertainty existed, Nielsen et al designed an RCT of two targeted temperature regimes, comparing a target of 33C vs 36C after out of hospital cardiac arrest. The article abstract is below, but as ever make sure that you read the full paper if you have access to the NEJM.
[DDET Who was studied?]
The trial included patients over 18 with a low GCS (<8) after out of hospital cardiac arrest. They included patients irrespective of initial rhythm (previous trials have suggested a greater benefit in those with VF arrests). So, it’s a pragmatic and appropriate group to randomize.
Patients were recruited on admission to one of 36 participating ICUs in Europe and Australia.
Over a 2 year, 3 month period they recruited 950 patients (impressive).
There appear to be no major differences between patients in each group.[/DDET]
[DDET Tell me more about the intervention]
Patients were treated the same apart from in respect of the target temperature. All were cooled to target for 28 hours before warming began in 0.5C/hour increments. Sedation was turned off at 36 hours with normal prevention of temperatures >37.5C until 72 hours post arrest.
With everything else held the same between the groups, it’s a nice, simple design and a good example of how an RCT can isolate the effect of a single intervention.[/DDET]
[DDET What were the outcome measures and what did they find?]
The principal outcome was death – a pretty good end point that few can argue with. At the end of the trial 50% had died in the 33C group and 48% in the 36C group. That is an absolute difference of 2% (actually it’s 1.4% but there was some rounding up going on), and the hazard ratio for death with 33C was 1.06, with confidence intervals that extend from 0.89 to 1.28 (so this is a statistically non-significant result).
At 180 days there remained a 2% difference but again there was no statistically significant difference between the groups, nor was there any detectable difference between patients in terms of disability (using modified Rankin scale).
So really little to choose between the interventions, or at least not that we can detect in this study. [/DDET]
[DDET So this is definitive then? Is there really no difference?]
Ah, steady on there chaps(esses). There is a world of difference between not being able to find a difference between two thingies and two thingies being the same. Perhaps we need to look at the sample size again.
My first impression when I looked at some of the tweets coming through was that this was a fairly small study to demonstrate equivalence, and that’s perhaps because it was not set up to detect equivalence at all.
In this study the a priori sample size calculation states that they were looking to find a 20% difference in the hazard ratio between the two therapies. This is a bit tricky to get your head round, but thankfully they have given us an example in the power calculation section, in that they would have required 850 patients to be 80% certain of detecting a difference of 11% between patients…..just think that through again…
That’s a number needed to treat of 9.
So they were aiming to find an NNT of 9 by comparing 33C with 36C
That’s a bit ambitious….
…so perhaps not surprising that they did not find it then…
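If you want to check the NNT arithmetic for yourself, here is a minimal sketch. The function name `nnt` is mine (purely illustrative); all it does is take the reciprocal of the absolute risk difference.

```python
def nnt(absolute_risk_difference):
    """Number needed to treat: the reciprocal of the absolute risk difference.

    The difference is given as a proportion, e.g. 0.11 for 11%.
    """
    if absolute_risk_difference == 0:
        raise ValueError("NNT is undefined when there is no difference")
    return 1 / abs(absolute_risk_difference)


# The 11% difference used in the trial's power calculation example
print(round(nnt(0.11)))

# The 2% difference actually observed in the trial
print(round(nnt(0.02)))
```

The first call gives the NNT the trial was powered to find; the second gives the NNT implied by what they actually found.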
Let’s think of it another way. Say we were going forward from this study and we wanted to look for a 2% difference in survival (i.e. what they found) between these two cooling therapies. That’s a clinically important difference, folks: a Number Needed to Treat (NNT) of 50. That 2% ARR/NNT of 50 (it’s the same thing) is as good as aspirin in MI or tranexamic acid in major trauma (really effective and fab treatments, you will agree). Maybe we would think about our sample size calculations as follows..
The difference we hope to find is 2% (that’s what they found)
We expect our event rates (death) to be 48% and 50% respectively (that’s what they found)
So that’s an NNT of 50 (similar to Aspirin in MI or TXA in major trauma)
Let’s go for a power of 80%
Let’s go for an alpha of 5%
Hang on a sec….just whizzing into the computer….and…the answer is…
9806 patients (for each sample – 19,612 in total)
(Ed- gosh that’s a lot of patients!!!)
It is, but it’s not out of keeping with the number needed in CRASH-2 to show a similar difference.
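For anyone who wants to see where a number like that comes from, here is a rough sketch using one common textbook formula for comparing two proportions (normal approximation, pooled variance under the null, two-sided alpha). I am assuming this is close to what the online calculator does; other formulas (continuity corrections, arcsine transforms) will give slightly different answers. The function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients needed in EACH arm to detect p1 vs p2.

    Normal-approximation formula for two independent proportions with a
    two-sided test. This is an assumed formula, not necessarily the one
    used by any particular calculator.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    p_bar = (p1 + p2) / 2               # pooled event rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return numerator / (p1 - p2) ** 2


# Event rates of 48% and 50%, alpha 5%, power 80% (the inputs listed above)
per_arm = n_per_group(0.48, 0.50)
print(f"per arm: {per_arm:.0f}, total: {2 * per_arm:.0f}")
```

With those inputs the formula lands within a patient or two of the figure quoted above. Try swapping in a bigger difference (say 0.40 vs 0.51) to see how quickly the required sample shrinks, which is exactly why trials are tempted to power for large effects.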
Hypothetical Analysis 1: Because we are statistical geeks, we can also ‘back calculate’ the power of this trial to detect a difference of 2% (48% vs 50%) based on a sample size of 950 patients. I’ll spare you the details, but it’s about 14%.
Hypothetical Analysis 2: We might say that 2% is too small. What about the power of this trial to detect a difference between 47% and 52% (an NNT of 20)? Well, again, the sample you would really want is sadly many more than 950. In fact there is only about a 60% chance of being able to demonstrate a 5% difference.
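Here is a sketch of how those back-calculated powers can be reproduced. It uses a simple two-sided normal approximation for two proportions, which is my assumption about the method, and the function name is illustrative. One thing worth flagging: the figures quoted above (about 14% and about 60%) are what this approximation gives if 950 is treated as the size of each arm. If 950 is the total (roughly 475 per arm), the power comes out lower still, which only strengthens the argument.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    Normal approximation with unpooled variance; equal numbers in each arm.
    """
    z = NormalDist()  # standard normal distribution
    se = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = abs(p1 - p2) / se           # true difference in standard errors
    # Probability the test statistic falls in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


for label, p1, p2 in [("2% difference", 0.48, 0.50), ("5% difference", 0.47, 0.52)]:
    for n in (950, 475):
        pw = power_two_proportions(p1, p2, n)
        print(f"{label}, {n} per arm: power about {pw:.0%}")
```

Whichever way you read the 950, the trial had nowhere near the conventional 80% power for differences of this size.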
Hypothetical Analysis 3: Let’s work out the confidence intervals around that difference they found. It’s quite tough to do with hazard ratios, but we can consider the absolute risk reduction and work out the confidence intervals for that. OK. The actual ARR is 1.4% (they round their figures up in the paper) and the 95% confidence intervals extend from a 5.4% ARR in favour of 33C all the way up to an 8% ARR in favour of 36C. Is your head hurting? If so, let’s convert that into NNTs. Basically the confidence intervals for this result extend from an NNT of 19 in favour of 33C all the way up to an NNT of 12.5 in favour of 36C.
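A sketch of that confidence interval calculation is below, using the simple Wald (normal approximation) interval for a difference in proportions. Two loud assumptions: the death counts in the usage example (235 of 473 at 33C, 225 of 466 at 36C) are the figures as I recall them from the published paper, so please check them against the paper itself, and the Wald method is my choice, so the limits will differ a little from those produced by other methods or calculators.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Difference in event rates (a minus b) with a Wald confidence interval.

    Returns (difference, lower limit, upper limit) as proportions.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return diff, diff - z * se, diff + z * se


# ASSUMED counts (recalled from the paper, verify before relying on them):
# deaths at 33C = 235 of 473, deaths at 36C = 225 of 466
diff, lower, upper = risk_difference_ci(235, 473, 225, 466)
print(f"difference in mortality (33C minus 36C): {diff:.1%}")
print(f"95% CI: {lower:.1%} to {upper:.1%}")

# A negative limit favours 33C and a positive limit favours 36C;
# the NNT at each end is the reciprocal of that limit.
print(f"NNT at the limits: {1 / abs(lower):.0f} and {1 / abs(upper):.0f}")
```

The interval comes out at roughly the same shape as the one quoted above: it comfortably includes clinically important benefit in either direction.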
Are you feeling convinced that there is a difference here? Are you convinced that the results really demonstrate that they are the same? I’m not.
The bottom line here is that we cannot equate a negative finding in this study to an assumption that these two therapies are equivalent. They have failed to find a difference, but they have not demonstrated that there is no difference. The real difference between these therapies could really be quite large AND clinically significant but this study is just too small to detect it.
If you want to have a go at playing around with stats calculations and sample size I recommend statspages, as there’s loads of good stuff on there.
The sample size calculators are here.[/DDET]
[DDET My head hurts with all this stats mumbo-jumbo]
Yeah, sorry about that. The thing is that we need to be really careful when designing studies about what we define as clinically important differences. In this study the power calculation is based on a 10-11% difference in mortality which to be honest is huge. In the Cochrane review of cooling the difference found was roughly 9% (Cochrane study pooled data Analysis 2.1), so this study was attempting to find a similar or greater difference than that comparing placebo against cooling.
We can of course debate what a clinically important difference is, but in terms of death a 2% absolute difference in survival really means a lot to someone like me. I think 10-11% is just a bit too ambitious, but that’s only my opinion and I’m sure others will think differently.
If there are any true statisticians out there then they will be foaming at the mouth at my mixing of non-inferiority principles of trial design with trials designed to detect difference. It is quite true that they require different approaches and in general non-inferiority trials require more patients than more traditional models. I apologise to the statisticians, but I think there is enough here to tell us that this trial is too small to detect what I believe is a potentially clinically important difference.
Even the NEJM Editorial gets in on the act and declares that there is no difference….but have they demonstrated that in this trial? Answer – no.
The conclusion? Well, this is a really interesting study but it can only tell us so much. The authors have failed to detect a really big difference in clinical outcomes between these two therapies; that is simply not the same as presuming that they are the same.[/DDET]
[DDET So 33C or 36C what are you saying – what should I do?]
It’s not for me to tell you what to do really. I like cooling patients, I think it works and there is evidence to support it. As for a target temperature, this trial really gets us thinking. It’s not definitive, but there is enough to get us interested in questioning whether prevention of fever is as important as cooling to hypothermic levels. There may well be no difference, there may be harm, there may even be benefit to a lower cooling temperature. I just don’t know, and I’m not sure this trial helps us that much. As a clinician in the ICU I would be faced with the dilemma of uncertain evidence vs. the known harms/difficulties of cooling. Overall there seems to be little on which to base a decision.
Tough call chaps(esses). Let me know what you think.[/DDET]
Sorry folks, but this is a negative study of difference NOT a positive study of similarity.
Basically just chill out…………
This paper has been a fantastic demonstration of the power of #FOAMed to deliver real critical appraisal, understanding and depth to the results of a really interesting and challenging paper. If you’ve made it this far then please read the comments below and also head over to LITFL for further insights and links to other amazing resources. Viva la FOAMed.
What I’ve really enjoyed is the cross speciality interest and understanding on this topic and I’m really pleased to say that the insights and contributions of colleagues have pushed me off my metaphorical fence onto the side of……, well read the comments below and find out 🙂