Psychological Medicine

Original Articles

Concordance between personality disorder assessment methods

G. Nestadt (a1, c1), C. Di (a2), J. F. Samuels (a1), Y.-J. Cheng (a3), O. J. Bienvenu (a1), I. M. Reti (a1), P. Costa (a4), W. W. Eaton (a5) and K. Bandeen-Roche (a6)

a1 Department of Psychiatry and Behavioral Sciences, The Johns Hopkins University School of Medicine, Baltimore, MD, USA

a2 Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA

a3 Institute of Statistics, National Tsing Hua University, Hsin-Chu, Taiwan

a4 Gerontology Research Center, National Institute on Aging, Baltimore, MD, USA

a5 Department of Mental Hygiene, The Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA

a6 Department of Biostatistics, The Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA


Background. Studies have criticized the low level of agreement between the various methods of personality disorder (PD) assessment. This is an important issue for research and clinical purposes.

Method. Seven hundred and forty-two participants in the Hopkins Epidemiology of Personality Disorders Study (HEPS) were assessed on two occasions using the Personality Disorder Schedule (PDS) and the International Personality Disorder Examination (IPDE). The concordance between the two diagnostic methods for all DSM-IV PDs was assessed using standard methods and also two item response analytic approaches designed to take account of measurement error: a latent trait-based approach and a generalized estimating equations (GEE)-based approach, with post-hoc adjustment.

Results. Agreement on raw criteria counts, assessed with the intraclass correlation coefficient (ICC), κ and odds ratio (OR), was poor. The more refined statistical methods showed a moderate to moderately high level of concordance between the methods for most PDs studied. Overall, the PDS produced lower prevalences of traits but higher precision of measurement than the IPDE. Specific criteria within each PD showed varying endorsement thresholds and precision for ascertaining the disorder.

Conclusions. Concordance in the raw measurement of the individual PD criteria between the two clinical methods is lacking. However, based on two statistical methods that adjust for differential endorsement thresholds and measurement error in the assessments, we deduce that the PD constructs themselves can be measured with a moderate degree of confidence regardless of the clinical approach used. This may suggest that the individual criteria for each PD are, in and of themselves, less specific for diagnosis, but as a group the criteria for each PD usefully identify specific PD constructs.
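As an illustration of two of the standard agreement statistics named in the Results (κ and OR), the following minimal sketch computes Cohen's κ and the odds ratio for a single criterion rated as present (1) or absent (0) by two instruments. The ratings, the variable names `pds` and `ipde`, and the helper functions are hypothetical and chosen only for illustration; they are not data or code from the study, and the sketch does not cover the latent trait or GEE-based approaches.

```python
def two_by_two(ratings_a, ratings_b):
    """Count the cells of the 2x2 agreement table for two binary rating lists."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("rating lists must have the same length")
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(ratings_a, ratings_b):
        if a and b:
            n11 += 1  # both instruments endorse the criterion
        elif a and not b:
            n10 += 1  # only the first instrument endorses it
        elif b:
            n01 += 1  # only the second instrument endorses it
        else:
            n00 += 1  # neither instrument endorses it
    return n11, n10, n01, n00


def cohen_kappa(n11, n10, n01, n00):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = n11 + n10 + n01 + n00
    observed = (n11 + n00) / n
    expected = ((n11 + n10) * (n11 + n01) + (n01 + n00) * (n10 + n00)) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def odds_ratio(n11, n10, n01, n00):
    """Odds ratio of the 2x2 table; undefined if a discordant cell is zero."""
    if n10 == 0 or n01 == 0:
        raise ValueError("odds ratio is undefined when a discordant cell is zero")
    return (n11 * n00) / (n10 * n01)


# Hypothetical ratings of one criterion for ten participants (illustrative only).
pds = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
ipde = [1, 0, 0, 0, 1, 0, 1, 0, 0, 1]

cells = two_by_two(pds, ipde)
kappa = cohen_kappa(*cells)
oratio = odds_ratio(*cells)
print(f"kappa = {kappa:.3f}, OR = {oratio:.1f}")
```

With these made-up ratings the two instruments agree on 7 of 10 participants, yet κ is only about 0.35 because much of that agreement is expected by chance.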

(Received November 03 2010)

(Revised July 29 2011)

(Accepted July 29 2011)

(Online publication August 24 2011)


c1 Address for correspondence: Dr G. Nestadt, Department of Psychiatry and Behavioral Sciences, Johns Hopkins Hospital, Meyer 113, 600 N. Wolfe Street, Baltimore, MD 21287, USA. (Email: