Epidemiology and Infection

Community-acquired pneumonia and bacteraemia

Classification algorithms to improve the accuracy of identifying patients hospitalized with community-acquired pneumonia using administrative data

O. YU (a1, c1), J. C. NELSON (a1, a2), L. BOUNDS (a3) and L. A. JACKSON (a3, a4)

a1 Biostatistics Unit, Group Health Research Institute, Seattle, WA, USA

a2 Department of Biostatistics, University of Washington, Seattle, WA, USA

a3 Group Health Research Institute, Seattle, WA, USA

a4 Department of Epidemiology, University of Washington, Seattle, WA, USA


In epidemiological studies of community-acquired pneumonia (CAP) that use administrative data, cases are typically defined by the presence of a pneumonia hospital discharge diagnosis code. However, not all such hospitalizations represent true CAP cases. We identified 3991 hospitalizations during 1997–2005 in a managed care organization and validated each as CAP or not by reviewing medical records. To improve the accuracy of CAP identification, we used classification and regression tree analysis to develop classification algorithms that incorporated additional administrative information associated with the hospitalization. We found that a pneumonia code designated as the primary discharge diagnosis and the duration of hospital stay improved the classification of CAP hospitalizations. Compared with the commonly used method, which is based on the presence of a primary discharge diagnosis code of pneumonia alone, these algorithms had higher sensitivity (81–98%) and positive predictive values (82–84%), with only modest decreases in specificity (48–82%) and negative predictive values (75–90%).
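As an illustration only, the following minimal Python sketch shows how a rule built on the two predictors named above (primary discharge diagnosis and duration of stay) could be scored against chart-review results using the four accuracy measures reported. It is not the authors' algorithm: the `Hospitalization` record, the `classify` rule, the 2-day stay threshold and the seven toy records are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Hospitalization:
    """Hypothetical record; field names are assumptions, not study variables."""
    primary_pneumonia_code: bool  # pneumonia code is the primary discharge diagnosis
    length_of_stay_days: int      # duration of hospital stay
    true_cap: bool                # reference standard from medical record review


def classify(h, min_stay_days=2):
    """Toy rule: call a stay CAP if pneumonia is the primary diagnosis and the
    stay is at least min_stay_days long (the threshold is an assumed value)."""
    return h.primary_pneumonia_code and h.length_of_stay_days >= min_stay_days


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def accuracy_measures(records, rule):
    """Compare rule output with chart review and return the four measures."""
    tp = fp = tn = fn = 0
    for h in records:
        predicted = rule(h)
        if predicted and h.true_cap:
            tp += 1
        elif predicted and not h.true_cap:
            fp += 1
        elif not predicted and h.true_cap:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": ratio(tp, tp + fn),  # true CAP stays the rule detects
        "specificity": ratio(tn, tn + fp),  # non-CAP stays the rule rejects
        "ppv": ratio(tp, tp + fp),          # rule-positive stays that are CAP
        "npv": ratio(tn, tn + fn),          # rule-negative stays that are not CAP
    }


# Made-up records for demonstration only (not study data).
toy_records = [
    Hospitalization(True, 4, True),
    Hospitalization(True, 3, True),
    Hospitalization(True, 5, True),
    Hospitalization(True, 1, False),
    Hospitalization(False, 5, True),
    Hospitalization(False, 2, False),
    Hospitalization(True, 6, False),
]

measures = accuracy_measures(toy_records, classify)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

In a tree-based analysis the split variables and thresholds would be learned from the validated hospitalizations rather than fixed by hand as they are here; the sketch only makes the definitions of sensitivity, specificity and the predictive values concrete.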

(Accepted October 13 2010)

(Online publication November 19 2010)