REVIEWING THE REVIEWS
(How Strong Is the Evidence? How Clear Are the Conclusions?)
Objectives: To determine (a) what the conclusions of systematic reviews reveal about the evidence base of medicine; and (b) whether two readers draw similar conclusions from the same review, and whether those conclusions match the authors' own.
Methods: Three methodologists (two per review) rated 160 Cochrane systematic reviews (issue 1, 1998) using pre-established conclusion categories. Disagreements were resolved by discussion to arrive at a consensus score for each review. The authors of each review were asked to use the same categories to designate their intended conclusion. Interrater agreement was calculated.
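As an illustration of how agreement between two raters on categorical conclusions can be quantified, the following is a minimal sketch of Cohen's kappa. The abstract does not name the agreement statistic that was used, so the choice of kappa is an assumption, and the function name, category labels, and ratings below are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters (Cohen's kappa).

    Assumption: the abstract does not say which statistic was used;
    kappa is shown here only as a common choice for categorical ratings.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)

    # Observed agreement: share of items given the same category by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Expected agreement by chance, from each rater's marginal category counts.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; agreement is complete.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of six reviews by two readers (illustrative only).
reader_1 = ["positive", "positive", "insufficient",
            "no_effect", "insufficient", "positive"]
reader_2 = ["positive", "insufficient", "insufficient",
            "no_effect", "insufficient", "positive"]

kappa = cohens_kappa(reader_1, reader_2)
print(round(kappa, 2))
```

In this made-up example the readers agree on five of six reviews, but kappa is lower than the raw agreement rate because some of that agreement would be expected by chance alone.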
Results: Interrater agreement between two readers was 0.68 and 0.72, and between readers and authors, 0.32. The largest categories assigned by methodologists were “positive effect” (22.5%), “insufficient evidence” (21.3%), and “evidence of no effect” (20.0%). The largest categories assigned by authors were “insufficient evidence” (32.4%), “possibly positive” (28.6%), and “positive effect” (26.7%).
Conclusions: The number of reviews indicating that modern biomedical interventions show either no effect or insufficient evidence is surprisingly high. Interrater disagreements suggest a considerable degree of subjective interpretation in systematic reviews. Where patterns of disagreement emerged between authors and readers, authors tended to be more optimistic in their conclusions than the readers. Policy implications are discussed.
Key Words: Evidence-based medicine; Review literature; Outcomes assessment (health care); Meta-analysis; Randomized controlled trials.
The authors express appreciation to Diane Nickols for editorial assistance and to the many authors of Cochrane reviews for their commitment and diligence and for participating in this study. This project was funded by NIH grant no. 1 R21-RR09327-01.