Friday, September 25, 2009

Why Size DOES Matter in Pain Research Trials

Understanding Evidence-Based Pain Management (EBPM)

The published research literature in the pain management field is inundated with reports of clinical trials, with both positive and negative outcomes, that enroll too few subjects to yield valid results. Such trials can misrepresent the therapies under investigation and bias the base of evidence for guiding clinical practice. Two recently published studies exemplify some of the problems with small trials.

Study of PIRFT for Lower Back Pain Aborted Prematurely
Investigators in Norway conducted a randomized, sham-procedure controlled trial to examine an interventional treatment for chronic low back pain due to intervertebral disc degeneration [Kvarstein et al. 2009]. The treatment involved inserting a probe (called the discTRODE™) into the outer part of an affected disc, the annulus, and applying radiofrequency (RF) current to heat and, it is hoped, heal the disrupted area. In general, the procedure is known as percutaneous intradiscal radiofrequency thermocoagulation (PIRFT).

Twenty eligible patients were randomized to either intra-annular PIRFT or a sham treatment, which followed identical procedures except no heat was applied via the probe. At 6 months, an interim analysis did not reveal any significant differences between active and sham treatments for the primary endpoint — change in pain intensity (on a 0–10 scale) — and the further inclusion of patients in the trial was discontinued. After 12 months, overall reductions in pain from baseline levels reached statistical significance; however, there were no differences between the two groups and 20% of patients in each group actually fared worse. Based on these results, the authors concluded that they would not recommend intra-annular PIRFT using the discTRODE probe.

There are some concerns about the size of this study that raise questions about its validity. For one thing, patient inclusion was extremely selective: only 74 patients were recruited for the study out of 700 referrals, and merely 20 were treated (10 in each group). Yet the authors had determined in advance that they would need at least 25 subjects in each group to have an 80% probability of detecting a 2-point improvement in pain scores as statistically significant (at p=0.05). The 80% number is called “statistical power,” which researchers can calculate in advance; it is the likelihood that a significant difference between groups will be found if such a difference does exist. Even with 80% power, there is a 20% chance that a real difference will be missed. Since this study enrolled fewer than half the planned number of patients, it had too little power to detect significant differences, and we cannot conclude that the outcomes are reliable evidence of PIRFT failure.
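To make the idea of statistical power concrete, here is a minimal sketch of the kind of calculation involved, using a normal approximation for comparing two group means. The standard deviation of pain-score changes (`ASSUMED_SD = 2.5`) is our own illustrative assumption, chosen only because it is roughly consistent with the 25-per-group figure above; it is not a value reported by the study authors, and the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def power_two_groups(n_per_group, delta, sd, alpha=0.05):
    """Approximate power of a two-sided comparison of two group means.

    Normal approximation; ignores the negligible chance of a significant
    result in the wrong direction.
    """
    standard_error = sd * sqrt(2.0 / n_per_group)
    z_critical = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(delta / standard_error - z_critical)


def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to reach the requested power."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


ASSUMED_SD = 2.5  # ASSUMPTION: illustrative SD of pain-score change, not from the study
DELTA = 2.0       # the 2-point difference on the 0-10 scale named in the text

print("Needed per group:", sample_size_per_group(DELTA, ASSUMED_SD))
print("Power with 25 per group:", round(power_two_groups(25, DELTA, ASSUMED_SD), 2))
print("Power with 10 per group:", round(power_two_groups(10, DELTA, ASSUMED_SD), 2))
```

Under that assumed standard deviation, the planned 25 subjects per group gives roughly 80% power, while the 10 per group actually treated gives well under 50%. A different assumed standard deviation would change the exact figures but not the direction of the shortfall.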

In an editorial accompanying the study, van Kleef and Kessels [2009] suggest that stopping the study after the inclusion of only 20 patients was unacceptable. They observe that all outcomes did show positive trends for PIRFT, although these were not statistically significant with so few subjects. It is understandable that the researchers were reluctant to continue the study (which administered an invasive sham treatment with inherent risks to half the patients) when favorable results seemed doubtful. However, there might be ethical questions about enrolling patients, whether they receive active or sham treatment, in a study that is underpowered to produce valid outcomes and therefore provides little of value to the pain management field.

TENS Trial for Neuropathic Pain Fails to Show Significant Benefits
Another recently-published trial assessed the short-term benefits of high-frequency (HF) versus low-frequency (LF) transcutaneous electrical nerve stimulation (TENS) for neuropathic pain following spinal cord injury [Norrbrink 2009]. A total of 24 patients participated in a crossover trial design, with half of the patients randomly assigned to 2 weeks of 3-times-daily HF (80 Hz) TENS therapy and the other half to an identical regimen but using LF (burst of 2 Hz) TENS. After a 2-week wash-out period, patients switched stimulation frequencies and repeated the 2-week treatment procedure. On a group level, no significant changes were found from baseline or between HF and LF on pain intensity ratings (the primary outcome), or on ratings of mood, coping with pain, life satisfaction, sleep quality, or psychosocial consequences of pain. However, on a subjective 5-category global pain-relief scale a favorable effect was reported by 29% of individual patients for HF stimulation and by 38% for LF stimulation; this seems promising but statistically significant differences were not reported and there was no control group for comparison.

The author of this study did not indicate having done a statistical power analysis; however, according to our calculations, with only 24 subjects enrolled (24 in each treatment arm with the crossover design), the trial was underpowered to detect less than a 22% difference between HF and LF treatment effects as being statistically significant. Another problem with small studies is that any dropouts greatly weaken the pool of available data for proper analysis, and 9 of the patients (38%) did not complete this study. Still, the researcher concluded that TENS merits consideration as a complementary treatment in patients with spinal cord injuries and neuropathic pain.

Underpowered Trials: More Harm Than Good?
The accurate reporting of clinical trials of experimental therapies, whether the outcomes are favorable or unfavorable, can be critical for establishing a valid and trustworthy base of evidence. However, the value of conducting and reporting small, underpowered studies must be questioned. Although such investigations, as pilot studies, might be helpful for designing more extensive future research, reporting them in the literature may prematurely skew or bias perceptions of the treatments in question. This is especially concerning when such studies are eventually included in review articles, meta-analyses, or guidelines.

Based on their small study, Kvarstein and colleagues disclaim any benefits of PIRFT using the discTRODE probe when a much larger study might have shown just the opposite. And, while Norrbrink proposes that TENS merits consideration for treating neuropathic pain following spinal cord injury, the data she presents from only 15 trial completers do not significantly support such a conclusion. Meanwhile, based on these studies healthcare insurance plans are likely to deny reimbursement for either therapy, and any future research in support of these therapies will need to overcome the negative perceptions to qualify the treatments for payment. Granted, larger and adequately powered trials may be more costly and take longer to do, but at least their outcomes can be trusted as being of some significant consequence.

> Kvarstein G, Mawe L, Indahl A, et al. A randomized double-blind controlled trial of intra-annular radiofrequency thermal disc therapy – a 12-month follow-up. Pain. 2009(Oct);145(3):279-286.
> Norrbrink C. Transcutaneous electrical nerve stimulation for treatment of spinal cord injury neuropathic pain. J Rehab Res Dev (JRRD). 2009;46(1):85-94.
> van Kleef M, Kessels AGH. Underpowered clinical trials: time for a change. Pain. 2009(Oct);145(3):265-266.