International Journal of Technology Assessment in Health Care



Gerald Gartlehner (a1), Suzanne L. West (a2), Alyssa J. Mansfield (a2), Charles Poole (a3), Elizabeth Tant (a4), Linda J. Lux (a4) and Kathleen N. Lohr (a4)

a1 Danube University, Krems; RTI International

a2 RTI International

a3 University of North Carolina at Chapel Hill

a4 RTI International


Objectives: The aim of this study was to synthesize best practices for addressing clinical heterogeneity in systematic reviews and health technology assessments (HTAs).

Methods: We abstracted information from guidance documents and methods manuals made available by international organizations that develop systematic reviews and HTAs. We searched PubMed® to identify studies on clinical heterogeneity and subgroup analysis. Two authors independently abstracted and assessed relevant information.

Results: Methods manuals offer various definitions of clinical heterogeneity. In essence, clinical heterogeneity is considered variability in study population characteristics, interventions, and outcomes across studies. It can lead to effect-measure modification or statistical heterogeneity, which is defined as variability in estimated treatment effects beyond what would be expected by random error alone. Clinical and statistical heterogeneity are closely intertwined but they do not have a one-to-one relationship. The presence of statistical heterogeneity does not necessarily indicate that clinical heterogeneity is the causal factor. Methodological heterogeneity, biases, and random error can also cause statistical heterogeneity, alone or in combination with clinical heterogeneity.
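As an illustration of "variability in estimated treatment effects beyond what would be expected by random error alone," the sketch below computes Cochran's Q and the I² statistic from per-study effect estimates and their variances. These two statistics are not named in this abstract; they are assumed here as one widely used way to quantify statistical heterogeneity. The function name and the input values are hypothetical.

```python
def cochran_q_and_i2(effects, variances):
    """Return (pooled_effect, Q, I2_percent) for a set of study estimates.

    effects   -- per-study treatment effect estimates (e.g., log odds ratios)
    variances -- per-study sampling variances of those estimates
    Uses inverse-variance (fixed-effect) weights; this is an illustrative
    assumption, not a method prescribed by the abstract.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))

    # I2: share of total variability in excess of what chance alone
    # (Q's degrees of freedom) would predict, floored at zero.
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical data: two equally precise studies with different effects.
pooled, q, i2 = cochran_q_and_i2([0.0, 1.0], [0.01, 0.01])
print(pooled, q, i2)
```

A large I² signals statistical heterogeneity only. Consistent with the results above, it does not by itself show that clinical heterogeneity is the cause, because methodological heterogeneity, biases, and random error can produce the same pattern.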

Conclusions: Identifying potential modifiers of treatment effects (i.e., effect-measure modifiers) is important for researchers conducting systematic reviews and HTAs. Recognizing clinical heterogeneity and clarifying its implications helps decision makers to identify patients and patient populations who benefit the most, who benefit the least, and who are at greatest risk of experiencing adverse outcomes from a particular intervention.

This research was funded through a contract from the Agency for Healthcare Research and Quality to RTI International to support the RTI – University of North Carolina Evidence-based Practice Center (EPC) (AHRQ Contract No. 290-02-0016). We thank Stephanie Chang, MD, from the Agency for Healthcare Research and Quality, for her guidance throughout the project. Special thanks are due to our EPC colleagues Meera Viswanathan, PhD, and Timothy S. Carey, MD, MPH, respectively the Director and Co-Director of the RTI-UNC Evidence-based Practice Center, for consistent support and guidance. Many thanks also to our colleagues from RTI International, Jacqueline Amoozegar, MSPH, Nancy Lenfestey, MHA, and Loraine Monroe, for help with data abstraction and word processing. We offer our great appreciation to Mark Helfand, MD, MPH, of the Oregon Health & Science University and Sally Morton, PhD, of the University of Pittsburgh for insightful comments and discussions during the writing stages of the underlying AHRQ report. The authors of this study are responsible for its content. Statements in the study should not be construed as endorsement by the Agency for Healthcare Research and Quality or the U.S. Department of Health and Human Services of a particular drug, device, test, treatment, or other clinical service.