
The Skeptics Guide to Emergency Medicine SGEM Xtra: The NNT is Mellow Yellow for tPA in Acute Ischemic Stroke
May 1, 2021
28:22
Date: April 30th, 2021
Guest Skeptic: Dr. Justin Morgenstern is an emergency physician and the creator of the excellent #FOAMed project called First10EM.com. He is also one of the SGEM Hot Off the Press Faculty.
Reference: Donaldson et al. Review article: Why is there still a debate regarding the safety and efficacy of intravenous thrombolysis in the management of presumed acute ischaemic stroke? A systematic review and meta-analysis. Emerg Med Australas 2016.
This SGEM Xtra is based on the new recommendation on TheNNT website for tPA in acute ischemic stroke.
This is the third time there has been a recommendation on this topic. The first review gave thrombolytics a "red color recommendation: no benefit." The second review gave alteplase, a single agent, a "green color recommendation: benefit>harm." Since no relevant trials were published between the two and both author groups examined essentially the same data and arrived at opposing conclusions, we wanted to understand and try to explain the conflicting interpretations.
Our interpretation of the available literature was to give it a “yellow colour recommendation: net benefits and harms unclear due to uncertainty in data”. As a result, the summary statistics read Benefits in NNT (not reported: uncertain) and Harms in NNT (not reported: uncertain). More details on the NNT Rating System are available.
It would be hubris to presume that our summary would arrive at the one true answer. But our goal wasn’t to provide an answer. Our goal was simply to explain the science as well as we could, so people could understand why there is a debate – and the uncertainty that underlies that debate.
The Donaldson et al SRMA included 10,431 patients in 26 randomized trials comparing intravenous thrombolysis with placebo or standard care in acute ischemic stroke [1]. Their efficacy endpoint was good functional outcome, defined as a modified Rankin Scale (mRS) score of 3 or less. This is described as some residual disability requiring assistance, while still being able to walk and care for personal needs independently. The harm endpoints were symptomatic intracranial hemorrhage (as defined by individual trials) and overall mortality.
The authors report a 3.2% improvement in good neurologic outcome, a 5.4% increase in symptomatic intracranial hemorrhage, and a 2.5% increase in mortality. However, we question the certainty implied by these summary numbers.
Emberson and colleagues reported only on alteplase (a problem we will discuss further) and found a 5% improvement in neurologic outcomes, a 5.5% increase in intracranial hemorrhage, and a 1.4% increase in 90-day mortality that was not statistically significant [2].
A 2014 Cochrane review by Wardlaw et al arrived at similar conclusions, with significant improvement in neurologic outcomes, increased intracranial hemorrhage, and increased mortality [3]. Thus, our conclusions and discussion are unchanged by choice of review and reflect our belief that pooling data on this topic is overly simplistic and masks profound uncertainty.
We both really like TheNNT website, and the NNT as a concept. But there are problems with the NNT if used in isolation. One of the great conceptual difficulties of summary statistics like the number-needed-to-treat (NNT) is the implication of certainty. A major strength of the NNT is its simplicity, making complex research easier to understand. A weakness, however, is also its simplicity, because it can hide the complexity of research, ignore confidence intervals, and obscure biases. For most topics, these details are far more important than any individual number.
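To make the arithmetic concrete, here is a minimal Python sketch (not part of TheNNT review) of how an NNT or NNH falls out of an absolute risk difference, using the Donaldson et al point estimates quoted above. The function names and the confidence interval passed to `describe_ci` are hypothetical and chosen only for illustration; the interval is not taken from any of the cited reviews.

```python
import math

def nnt(abs_risk_diff):
    """Number needed to treat (or harm): 1 / |absolute risk difference|."""
    if abs_risk_diff == 0:
        return math.inf  # no difference means no finite NNT
    return 1.0 / abs(abs_risk_diff)

def describe_ci(ard_low, ard_high):
    """Express a confidence interval for the risk difference in NNT terms.

    Positive values mean benefit, negative values mean harm.
    """
    if ard_low > 0:
        return f"NNT between {nnt(ard_high):.0f} and {nnt(ard_low):.0f}"
    if ard_high < 0:
        return f"NNH between {nnt(ard_low):.0f} and {nnt(ard_high):.0f}"
    # The interval includes zero: benefit, no effect and harm are all compatible.
    return (f"from NNT {nnt(ard_high):.0f} through no effect "
            f"to NNH {nnt(ard_low):.0f}")

# Point estimates quoted in the text from Donaldson et al.
point_estimates = {
    "good functional outcome (benefit)": 0.032,
    "symptomatic intracranial hemorrhage (harm)": 0.054,
    "mortality (harm)": 0.025,
}
for outcome, ard in point_estimates.items():
    print(f"{outcome}: about 1 in {nnt(ard):.0f}")

# HYPOTHETICAL interval, for illustration only (not from any cited review).
print(describe_ci(-0.01, 0.07))
```

Run as written, the point estimates work out to roughly 1 in 31, 1 in 19 and 1 in 40. Those are exactly the kind of tidy figures that can hide uncertainty: once an interval for the risk difference includes zero, the same data are compatible with both benefit and harm, and a single NNT no longer describes them.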
There is an SGEM Xtra on some of the limitations of the NNT/NNH summary statistics called the NNT - WET or DRI? It was based on an article published Dec 2019 in AEM by Reeves and Reynolds.
There are multiple sources of uncertainty around thrombolytics and stroke, which we discussed in TheNNT recommendation.
Conflicting Individual Trial Results
The first source of uncertainty we highlighted was conflicting individual trial results. Among 26 trials in this systematic review by Donaldson et al, 24 research groups found no benefit in their selected primary outcome [1]. And the two that claim a benefit (NINDS part 2 and ECASS III) both had baseline imbalances that may explain the difference [4,5]. In fact, there are re-analyses that adjust for those imbalances in both trials, and the benefits disappear [6,7]. However, in some re-analyses of NINDS-2 the benefit is maintained, which adds to the uncertainty here [8,9].
We reviewed the NINDS trial with Dr. Swaminathan back on SGEM#70. More recently Prof Fatovich and I reviewed the reanalysis of ECASS-3 by Dr. Brian Alper on SGEM#297.
Clinical Heterogeneity of Individual Trials
Another source of uncertainty is the clinical heterogeneity of individual trials. The 26 trials are clinically heterogeneous, enrolling stroke patients of differing demographics, treatment times, stroke severities, anatomic territories, and thrombolytic agents. The author of the first NNT summary felt this was too much heterogeneity for appropriate pooling, a position supported by the major differences in conclusions drawn depending on which studies an author group chooses to include.
Selective Emphasis on Trials Claiming Benefit
There was also the selective emphasis on trials claiming benefit. It is circular and erroneous logic to claim efficacy for thrombolytics based on the trial characteristics of the two positive trials. First, there is legitimate debate about whether they were truly positive. Second, selectively highlighting positive results is a form of the "Texas sharpshooter fallacy".
The Texas sharpshooter fallacy is committed when you cherry-pick a data cluster to suit your argument or find a pattern to fit a presumption. It comes from the concept of a marksman shooting at the side of a barn. After firing multiple shots, they go up to the barn and draw the target around the spot where there are the most bullet holes.
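One way to see why drawing the target afterwards is misleading is to ask how often a handful of "positive" trials would turn up by chance alone. The sketch below is our own illustration, not an analysis from any of the cited reviews. It assumes 26 independent trials of a treatment with no true effect and a 5% per-trial false positive rate; both the independence and the 5% figure are simplifying assumptions, and the function name is hypothetical.

```python
from math import comb

def prob_at_least_k_positive(n_trials, k, false_positive_rate):
    """Binomial probability that k or more of n independent null trials
    come out 'positive' purely by chance."""
    prob_fewer_than_k = sum(
        comb(n_trials, i)
        * false_positive_rate ** i
        * (1 - false_positive_rate) ** (n_trials - i)
        for i in range(k)
    )
    return 1 - prob_fewer_than_k

# ASSUMPTIONS: 26 independent trials, no true effect, 5% false positive rate.
p_two_or_more = prob_at_least_k_positive(26, 2, 0.05)
print(f"Chance of at least 2 'positive' trials out of 26: {p_two_or_more:.3f}")
```

Under these assumptions the chance is a little over one in three, so two positive results among 26 trials are not, on their own, surprising. A stricter false positive rate lowers the figure, which is why the assumption is exposed as a parameter rather than fixed.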
For example, because both NINDS II and ECASS III used alteplase, some have suggested alteplase is a superior agent [4,5]. However, on close inspection, that logic falters: few trials have compared thrombolytic agents head to head, so there is no strong evidence to support that claim. Nine additional trials of alteplase are negative. And systematic reviews consistently find no heterogeneity of effect between agents – in other words, statistically speaking the different thrombolytics all look the same for efficacy [3,5,10].
Moreover, in evaluating drug efficacy, establishing a class effect is generally a prerequisite for debating or comparing individual agents [11]. Therefore, while it may increase complexity, we believe it is a mistake to exclusively examine data from the agent used in the two trials that claimed benefit. You can’t just retrospectively decide to throw out the trials you don’t agree with.
Likewise, while there are theoretical reasons to think early treatment is better, this has not been directly tested and is not strongly supported by data. Neither Donaldson et al. nor the Cochrane review found an interaction between time to treatment and effect. IST-3, the largest placebo controlled randomized trial of thrombolytics for stroke, found better outcomes among those treated after 4.5 hours than in patients treated at 3-4.5 hours from onset of stroke symptoms [12]. Again, we feel it is best to consider this literature as a whole rather than using time windows selected based on outlying (i.e. positive) results.
Individual Trial Bias
Another source of uncertainty was individual trial bias. Bias is a major source of uncertainty in all scientific research. Importantly, using the GRADE tool [13], Donaldson et al. rate the risk of bias as “serious” for all outcomes. One notable source is the outcome scales used, for instance the modified Rankin Scale (mRS) score. This score is known to be somewhat subjective, with poor inter-rater reliability and questionable validity. When trained neurologists examine the same patients there is substantial variability in mRS score assignments [14,15]. Compounding the problem, some trials assessed patients by phone or mail, a choice certain to increase variability and imprecision. For example, in IST-3, which contributes nearly 40% of subjects in the Donaldson meta-analysis, results were obtained using telephone and mail follow-up, and the trial was not blinded. This subjectivity is important, because removing IST-3 from the pooled analysis removes the statistical finding of benefit.
Stopping Early
Bias can be compounded in an SRMA when trials are stopped early. That is because larger trials are weighted more heavily in a meta-analysis, so early termination (which reduces trial size) can significantly affect results. Five thrombolytic trials were stopped early for harm or futility [16-20]. Together these would have enrolled more than 2,000 additional subjects who, had they been included, may have neutralized or even reversed findings from the two small trials claiming benefit, NINDS-2 and ECASS III (combined n=1,445).
Furthermore, while over 10,000 subjects were enrolled in stroke trials, some individual trials for acute myocardial infarction enrolled far more, and in aggregate those trials included more than 60,000 [20,21]. The comparatively small number of participants in stroke trials means chance findings like baseline imbalances are both more likely and more influential, furthering uncertainty.
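The weighting point can be sketched with a simple fixed-effect, inverse-variance pooling of risk differences. This is an illustration under stated assumptions, not the method or data of any cited review: the two trials below are entirely hypothetical, and the helper names are ours. It compares a neutral trial that completes its planned enrollment with the same trial stopped early at one fifth of that size.

```python
def risk_difference(events_treated, n_treated, events_control, n_control):
    """Return (risk difference, variance) for one two-arm trial."""
    p_t = events_treated / n_treated
    p_c = events_control / n_control
    variance = p_t * (1 - p_t) / n_treated + p_c * (1 - p_c) / n_control
    return p_t - p_c, variance

def pool_fixed_effect(trials):
    """Inverse-variance pooled risk difference and each trial's relative weight."""
    weights = [1 / variance for _, variance in trials]
    total = sum(weights)
    pooled = sum(w * rd for w, (rd, _) in zip(weights, trials)) / total
    return pooled, [w / total for w in weights]

# HYPOTHETICAL trials (good outcomes / patients per arm), for illustration only.
small_positive = risk_difference(120, 300, 90, 300)       # 40% vs 30%
neutral_completed = risk_difference(300, 1000, 300, 1000)  # 30% vs 30%
neutral_stopped = risk_difference(60, 200, 60, 200)        # same rates, stopped early

pooled_full, weights_full = pool_fixed_effect([small_positive, neutral_completed])
pooled_early, weights_early = pool_fixed_effect([small_positive, neutral_stopped])

print(f"Neutral trial completed: pooled difference {pooled_full:.3f}, "
      f"positive trial weight {weights_full[0]:.0%}")
print(f"Neutral trial stopped early: pooled difference {pooled_early:.3f}, "
      f"positive trial weight {weights_early[0]:.0%}")
```

In this made-up example the event rates in the neutral trial are identical in both scenarios. Only its size changes, yet the pooled estimate moves noticeably toward the small positive trial when the neutral trial is truncated, because the truncated trial carries less weight.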
Harms
In contrast to the heterogeneous data on the potential benefits, the data on the potential harms are more certain. Exact numbers vary based on definitions and whether one focuses on fatal, symptomatic, or any hemorrhage, but an increase in intracranial hemorrhage is certain. More importantly,
