Friday, March 25, 2011

Pain Research: All That Glitters is Not Gold

Research Sense
Part 2 – A Hierarchy of Evidence
Just because a pain research study is published does not mean it is accurate, unbiased, valid, or useful for any clinical or decision-making purpose. The truth is that much pain-related research literature is simply not worth reading, and sifting the golden nuggets of worthwhile research from fool’s gold can be a challenging task for any healthcare provider or patient.

The above applies no matter how prestigious the journal, how rigorous the peer-review process, or how gleaming the reputations of the researchers. In fairness, however, it must be acknowledged that it is much easier to criticize research than it is to actually do research. And, even with its limitations and shortcomings, research in pain management provides the best hope of finding more effective treatments for improved patient care. The danger is that faulty or deficient research may come to overshadow the good, resulting in strong treatment recommendations based on weak evidence, as has happened in some guidelines in the pain field [previously discussed in a “Pain-Topics e-Briefing” here].

The Role of Evidence-Based Pain Management

Since the early 1990s, there has been an intensive worldwide movement to adopt principles of “evidence-based medicine,” or EBM, in all healthcare disciplines. Such efforts are directed to the needs of busy clinicians and staff, enabling them to critically interpret research rather than accepting at face value what is presented to them in the literature or at conferences. The philosophical origins of EBM date back to at least the mid-19th century, and EBM may be defined as “the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients” [Guyatt et al. 2002; Sackett et al. 1997].

Applied to pain management, EBM involves combining clinical expertise and experience with the best available external evidence on a topic of concern gathered from various sources. EBM approaches empower pain care providers to clearly differentiate between clinical practices based on sound evidence versus those founded more on traditional practices, long-standing prejudices, or medical rationales and approaches that might be outdated or incorrect [Oxman et al. 1993].

Within an EBM framework, planning, conducting, assessing, and writing about medical research is a whole discipline unto itself, and the literature in the EBM field is vast. However, while typical consumers of pain research — healthcare providers or patients — have little time to become experts in such matters, they do need to know how to evaluate the accuracy, reliability, and validity of research for everyday clinical practice, and this requires some study.

A Hierarchy of Pain Research Evidence

As a start along the EBM path, various types of research studies assessing or discussing clinical pain treatment effects may be ranked according to a “hierarchy of evidence.” This is based on the relative strength of each type of study for providing results that are likely to be free of bias and useful to healthcare providers and their patients. Rankings from weakest at the bottom to strongest at the top are depicted in the Table.

Evidence Hierarchy (from weakest at the bottom to strongest at the top):
  • Systematic Reviews & Meta-analyses
  • Randomized Controlled Clinical Trials (RCTs)
  • Cohort Studies
  • Case-Control Studies
  • Cross-Sectional Studies
  • Case Reports/Studies
  • Perspectives

Hierarchy rankings do not question the ability of any research approach to be valid and of value for a particular purpose. However, each type of study has its limitations, and the rankings recognize that certain forms of evidence may be given greater emphasis in guiding clinical decision-making. Pain treatment guideline developers usually grade the quality of evidence from weakest at the bottom of the pyramid to strongest toward the top. Following is a brief description of each type of evidence, from lowest to highest ranking, including strengths and weaknesses [refs. in Leavitt 2003]:

Perspectives
“Perspectives” is a coined term representing narrative overviews or reviews, commentary or consensus statements, interviews, and editorials. These are the most common types of communications in the pain field and, unfortunately, are sometimes cited as valid evidence when they are not. Perspectives sit at the bottom of the research evidence hierarchy because they so often represent personal opinion and/or summarize or comment on research done by others, rather than presenting new data or conclusions gleaned from original clinical investigations. Conference presentations usually offer perspectives, no matter how authoritative or well-researched the contents may seem.

Perspectives are highly subject to the bias of the writer (or speaker), such as favoring one viewpoint over another that may or may not be supported adequately by research evidence. However, these communications also can be invaluable sources of information by consolidating existing research and offering interpretations to aid understanding and further inquiry — as long as the reader or listener keeps the potential biases in mind.
Case Reports/Studies
Also called case histories, case series, or anecdotes, these draw from personal observations or medical records reviews to report unusual or unexpected events in conjunction with a medication, therapy, or intervention of some sort. There can be strong biases associated with such reports, including errors in observation, inadequate data collection, unknown contributing factors, and unsupported conclusions.

Reporting on individual patient cases is clearly anecdotal and largely hearsay evidence. Even a “case series” report is a collection of related anecdotes that may be interesting but cannot be trusted without broader verification. Small-scale investigations enrolling few participants, sometimes called “pilot studies,” might be viewed merely as a case series or collection of anecdotes, and poorly conducted observational studies or surveys also may have qualities of a case series.

As Alan Leshner, PhD, former Director of the U.S. National Institute on Drug Abuse, frequently stressed, “The plural of anecdote is not evidence.” Unfortunately, case reports and other studies that are essentially collections of anecdotes sometimes are given more credence than they deserve; there have been instances where drugs have been burdened with “black box” warnings or pulled off the market entirely based on such evidence without more rigorous confirmation. [For example, this may have been the case with propoxyphene in which only 18 subjects were enrolled in a pivotal study leading to its removal from marketing — discussed in an UPDATE here.]
Cross-Sectional Studies
These investigations — also called prevalence or epidemiological studies — examine relationships between medical conditions, treatments, and other variables of interest as they existed in a defined population during a particular period of time; that is, they are usually retrospective. Researchers look at interventions or exposures (what occurred in the patients) and outcome conditions (what happened). In some cases, cross-sectional studies may be prospective, looking forward during a pre-determined period of time to observe what happens as a result of a therapy or intervention among select groups of patients.

This type of study can establish associations but not causality regarding why an effect or event happened. There may be problems with recall bias in the case of retrospective investigations (not remembering exactly what occurred) and unknown extraneous factors (confounders) that are unequally distributed among subjects. Additionally, there are 3 other concerns with cross-sectional pain research studies:
  1. Many of these studies rely on Data Mining. This involves analyzing existing data from different viewpoints to reveal new patterns, trends, or correlations of interest. The approach relies on access to large repositories of patient data — such as from government agencies, health insurance plans, or networks of electronic health records — and its value for the pain management field must be cautiously considered.

    By conducting a great number of analyses on the same data (easy to do with computers) it is almost certain that some statistically significant associations will be found; this is sometimes also called “data dredging.” Researchers can then pick the associations that most closely support their hypotheses or the purpose of the study, but the conclusions may reflect bias and not be valid (a brief illustration appears after this list). [Concerns and examples regarding this were previously discussed in an UPDATE here.]

  2. A second concern is Probability Blindness. Associations between variables, or risks, are calculated as probabilities; that is, a ratio expressed as the number of particular events or effects that occur (the numerator) divided by the total number of potential event or effect exposures (the denominator) during a specific period of time. For example, the number of recorded adverse events with an analgesic (numerator) divided by the total number of exposures to the drug (denominator) equals the probability, or risk in this case, of the event occurring during the timeframe of observation — such as, “20%, or 1 of every 5 persons, taking X-drug experienced nausea during the first week.”

    However, the denominator can be made larger or smaller depending on how potential exposures are counted. In the case of an analgesic, for example, is the denominator the total number of patients prescribed the drug, the total number of prescriptions written, or the total number of doses taken? The selection of a denominator can be subjective and will skew the probability one way or the other, leading to what has been called “probability blindness” (also shown in the illustration after this list). [This is more extensively discussed in an UPDATE here.]

  3. Finally, the problem of proper Definitions can be a serious confounding factor. As alluded to immediately above, how events, effects, exposures, etc. are defined can make a significant difference, and also make comparisons across studies inappropriate. For example, many cross-sectional studies (and also longitudinal studies described below) have examined prevalence rates of drug misuse, abuse, and addiction in persons prescribed opioid analgesics. However, there often are important differences in how those adverse events are defined and, thereby, counted.

    Opioid “misuse,” for example, may describe overuse and/or underuse for medical purposes, nonmedical use, diversion, etc. and it might be a one-time occurrence or more frequent — there is little clarity or consistency across studies in how this variable is defined and measured. Consequently, the prevalence rate of opioid misuse can be expressed as a large or small probability depending on the biases of the authors. This same phenomenon occurs with many other variables studied in pain management and can be very misleading to consumers of research.
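To make the first two concerns above more concrete, here is a brief, hypothetical Python sketch; all of the numbers and variable names are invented for illustration and are not drawn from any actual study. The first part shows how testing many unrelated variables against an outcome (“data dredging”) will turn up some statistically significant associations by chance alone; the second shows how the same count of adverse events yields very different risk estimates depending on the denominator chosen.

```python
# Illustrative sketch only -- hypothetical data, not from any published study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Part 1: "data dredging" -- test 100 unrelated variables against one outcome.
n_patients, n_variables = 500, 100
outcome = rng.normal(size=n_patients)                    # outcome unrelated to anything
variables = rng.normal(size=(n_variables, n_patients))   # 100 unrelated "exposures"

false_hits = sum(stats.pearsonr(outcome, v)[1] < 0.05 for v in variables)
print(f"'Significant' associations found by chance alone: {false_hits} of {n_variables}")
# At a 0.05 threshold, roughly 5 of 100 unrelated variables will look significant.

# Part 2: "probability blindness" -- the same 200 adverse events (numerator)
# divided by three different denominators gives three very different "risks."
adverse_events = 200
denominators = {
    "patients prescribed the drug": 10_000,
    "prescriptions written": 40_000,
    "total doses taken": 1_200_000,
}
for label, count in denominators.items():
    print(f"Risk per {label}: {adverse_events / count:.3%}")
# Output: 2.000% vs 0.500% vs 0.017% -- same events, very different impressions.
```

Which denominator is appropriate depends on the question being asked; the practical point is simply that readers should check which denominator a study used before accepting a headline percentage.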
Case-Control Studies
Also called case-referent, case-comparison, or retrospective studies, these identify patients with the outcome(s) of interest (Cases) and Control patients without the same outcome(s). The researchers then look back in time to compare how many subjects in each group had the same interventions or exposures of interest. This is a relatively fast and inexpensive approach, often relying on “data mining” approaches, and it may be the only feasible way of examining treatment effects or other outcomes with long lag times between interventions and outcomes. A problem is that this method is highly subject to recall bias and/or inconsistent records for determining what occurred in the past. [For example, this was a problem of concern in a study of birth defects in the newborns of pregnant women who had taken opioid analgesics, discussed in an UPDATE here.]
Cohort Studies
Cohort studies — also sometimes called observational, follow-up, incidence, longitudinal, or prospective trials — usually involve two or more groups of patients (cohorts) who either receive the treatment of interest (Experimental group) or do not (Controls). The groups are followed forward in time to observe outcomes of interest that differ between groups. Subjects are not randomly assigned to groups, although it is important that the groups are as evenly matched as possible; there can be difficulties in identifying Control patients similar to those treated in the Experimental group, or there may be no suitable Control group at all. Also, treatment effects may be linked to unknown or uncontrolled factors (confounders).

A single group may be involved in a “cross-over” design, in which the same subjects are first assigned to one condition (Experimental or Control) and then switched to the other after a prespecified time period. Each subject thus also serves as his or her own Control, so fewer persons need to be recruited; however, there can be many sources of bias and error in this design.

Some cohort studies, often labeled as “pragmatic trials,” designate the recruited subjects as representing either a “convenience, naturalistic, or opportunistic” sample — which may be a euphemism for “we signed up any patients who were available and willing to participate.” Outcome effects in cohort studies of this sort are highly subject to influence by unknown or uncontrolled factors (confounders) and the quality of these investigations can vary widely; hence, pragmatic cohort studies are relegated to lower status in the hierarchy [Zwarenstein et al. 2008].
Randomized Controlled Clinical Trials (RCTs)
RCTs are considered by many as the “gold standard” when addressing questions of treatment efficacy. In this design, patients are carefully selected and then randomly assigned to either an Experimental or Control group and followed prospectively (forward in time) to observe outcomes of interest. Ideally, the groups are equally matched demographically (eg, age, sex, etc.) and any extraneous or unknown factors (confounders) are assumed to be equally distributed across groups.

Unfortunately, this type of study can be the most costly in terms of time and money. Furthermore, in pain medicine, there may be ethical problems with Control conditions — such as denial of treatment, or inadequate or sham treatments resulting in adverse outcomes — and there may be volunteer bias in terms of the characteristics of patients who are willing to be randomly assigned to experimental or control procedures for treating their pain conditions. There are 3 further concerns with RCTs in the pain management field:
  1. In the best RCTs, both subjects and practitioners are “blinded” as to whether the experimental or comparison treatment (eg, placebo) is being administered. This can be difficult in pain research, as both subjects and practitioners often develop suspicions as to which group they are in, which can bias responses. In the case of interventional therapies, it is usually impossible to blind practitioners.

  2. In placebo-controlled trials it can be difficult to develop a true placebo or inert condition when it comes to drugs and very challenging when it comes to interventional therapies. For example, sham acupuncture, applying needles in the “wrong places,” has actually demonstrated effects comparable to what was considered “true” needling. And, in one trial, a placebo therapy for irritable bowel syndrome was effective even when subjects were told they were in the placebo group [see UPDATE here].

  3. Regardless of how truly inert the placebos used in pain research may be, a potent confounding factor is the well-known “placebo effect,” which occurs due to the mere act of a subject participating in a clinical trial and being treated in some fashion. This affects both experimental treatment and control groups of subjects. Ideally, placebo effects would be randomly distributed across groups, but they remain a potentially confounding influence. [Placebo effects are discussed in an UPDATE here.]
In many cases, factors that are known or suspected of having confounding effects in RCTs — eg, age, sex, severity of pain condition, concurrent therapies, etc. — may be controlled via sophisticated statistical manipulations. However, the more arduous and complex the statistical schemes employed, the less the results can be trusted as accurate and valid.
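As a hedged illustration of this kind of statistical control (a minimal sketch with invented data; the variable names, effect sizes, and the scenario of an imbalanced allocation are assumptions, not taken from any real trial), the following snippet compares an unadjusted estimate of a treatment effect with one that adjusts for baseline pain severity:

```python
# Hypothetical illustration only -- invented data, not a real trial.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200

severity = rng.normal(5.0, 2.0, n)   # baseline pain severity (potential confounder)
# Hypothetical imbalance: patients with more severe pain end up somewhat less
# likely to be in the treated group (mimicking chance imbalance between groups).
treated = (rng.random(n) < 1 / (1 + np.exp(severity - 5.0))).astype(int)
true_effect = 1.0                     # assumed true benefit of treatment
relief = true_effect * treated - 0.8 * severity + rng.normal(0, 1, n)

df = pd.DataFrame({"relief": relief, "treated": treated, "severity": severity})
unadjusted = smf.ols("relief ~ treated", data=df).fit()
adjusted = smf.ols("relief ~ treated + severity", data=df).fit()

print(f"True effect:         {true_effect}")
print(f"Unadjusted estimate: {unadjusted.params['treated']:.2f}")
print(f"Adjusted estimate:   {adjusted.params['treated']:.2f}")
```

Adjustment for a single, well-measured covariate like this is routine; the caution above applies as models pile on more numerous and more elaborate adjustments.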
Systematic Reviews & Meta-analyses
Systematic reviews gather all evidence of the highest quality available to address clearly-focused clinical questions. Clinical practice guidelines often result from a systematic review process that can be quite elaborate, with explicitly defined methods for gathering, assessing, and accepting research evidence for inclusion. Conclusions tend to be reliable and accurate IF there is an abundance of high quality evidence available, which often is not the case.

Systematic reviews facilitate the relatively rapid assimilation of large amounts of research by readers. However, critics have expressed concerns about the validity of combining studies that were done on different patient populations, in different settings, at different times, and sometimes for different reasons. Another limitation is the search procedure used to identify studies for inclusion. Commonly used electronic databases, such as MEDLINE and EMBASE, are convenient but usually do not include all studies that may be relevant and important. Plus, there is a publication bias in the pain field whereby “negative trials,” those that fail to demonstrate expected outcomes, often go unpublished.

Meta-analyses take systematic reviews a step further by combining data from multiple investigations — almost always RCTs — and using statistical techniques to analyze the pooled results and reach conclusions. Hence, these are research projects in which the unit of analysis becomes the individual study rather than the individual subject. This approach can achieve greater precision and clinical applicability than any individual study or systematic review, which is why meta-analyses sit at the very top of the evidence pyramid.
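For readers curious about the statistical techniques involved, one common approach (though by no means the only one) is inverse-variance weighting, sketched below with invented numbers for three hypothetical trials; real meta-analyses also assess heterogeneity across studies and may use random-effects models instead.

```python
# Hypothetical numbers for illustration only -- not drawn from any real trials.
effects = [0.8, 1.2, 0.5]       # treatment effect estimated by each trial
variances = [0.10, 0.25, 0.05]  # variance of each estimate (smaller = more precise)

weights = [1.0 / v for v in variances]                     # precision weighting
pooled_effect = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
pooled_se = (1.0 / sum(weights)) ** 0.5                    # SE of the pooled estimate

print(f"Pooled effect: {pooled_effect:.2f} (SE {pooled_se:.2f})")
# The most precise trials dominate the weighted average, and the pooled
# estimate is more precise than any single trial's estimate.
```

The weighted average gives the most precise trials the most influence, which is the sense in which meta-analysis can sharpen the overall estimate beyond what any single study provides.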

The pain management field is certainly not lacking in published studies; unfortunately, most of them cluster toward the bottom of the evidence hierarchy. Arguments have been made that the majority of research investigations in any medical field produce results that are questionable, or in many cases false, and that a pure “gold standard” of evidence quality is unattainable [see Freedman 2010; Ioannidis 2005; this will be discussed further in the next UPDATE in this series]. Any single research investigation, no matter how comprehensive or methodologically sound, provides only a partial picture of what has been discovered or is yet to be revealed for better pain management.

In all of these study types, patient selection and inclusion is a critical process, since it can bias outcomes and limit external validity — that is, generalization to a broader, typical patient population. Consumers of research often overlook the fact that the patients selected as subjects in a particular study may not resemble those in their own clinical circumstances, whether they are healthcare providers or interested patients with pain. To be truly useful, published evidence (as well as continuing medical education or conference presentations) must satisfy essential questions of relevance, such as:
  • Overall, does the study or presentation reflect high quality, valid evidence?

  • Are the patients/subjects being examined in the research similar to your own patients?

  • Do the questions (hypotheses) addressed pertain to your patients’ pain management needs?

  • Is the research gathering and analysis process free of bias and clearly explained?

  • Are the results understandable and statistically significant?

  • Do the conclusions make sense from a patient-benefit perspective (ie, are they clinically significant)?
Above all, clinical research reports in pain management should satisfy the last question by helping to define best practices, with important benefits for patients outweighing any disadvantages. However, keep in mind that worthwhile research questions do not always have simple and clear answers.

To be alerted by e-mail when further UPDATES articles in this series are published, register [here] to receive once-weekly Pain-Topics “e-Notifications.”
> Freedman DH. Lies, Damned Lies, and Medical Science. The Atlantic [online]. 2010(Nov) [available here].
> Guyatt G, Rennie D (eds). Users’ Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice. Chicago, IL: AMA Press; 2002.
> Ioannidis JPA. Why most published research findings are false. PLoS Medicine. 2005;2(8):e124 [available here].
> Leavitt SB. Can Pain Medicine Research Be Trusted? Pain-Topics e-Briefing. 2008;3(1):1-5 [PDF here].
> Leavitt SB. EBAM (Evidence-Based Addiction Medicine) for Practitioners. Addiction Treatment Forum. March 2003 [PDF available here].
> Oxman AD, Sackett DL, Guyatt GH. Users’ guides to the medical literature: I. How to get started. JAMA. 1993;270(17):2093-2095 [article here].
> Sackett DL, Richardson WS, Rosenberg W, Haynes RB. Evidence-Based Medicine: How to Practice & Teach EBM. New York, NY: Churchill Livingstone; 1997.
> Zwarenstein M, Treweek S, Gagnier J, et al. Improving the reporting of pragmatic trials: an extension of the CONSORT statement. BMJ. 2008(Dec 20);337:a2390 [abstract here].