Common errors in data analysis: the apparent error rate of classification rules

D. J. Hand

doi:10.1017/S0033291700050212

Published online by Cambridge University Press: 09 July 2009

Affiliation: Biometrics Unit, Institute of Psychiatry, London
Address for correspondence: Dr D. J. Hand, Biometrics Unit, Institute of Psychiatry, De Crespigny Park, Denmark Hill, London SE5 8AF.

Synopsis

Classification and diagnosis are concepts of fundamental importance in medicine. Yet all too frequently in published papers the only measure of performance of a classification rule is the optimistic apparent error rate. The apparent error rate is defined, real examples are given illustrating how poor it is as an estimate of true future performance, and alternative measures are suggested.
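The optimism described in the synopsis can be illustrated with a small sketch. The apparent error rate is the proportion of cases misclassified when a rule is applied to the same sample that was used to construct it. The example below is not taken from the paper: the synthetic data, the nearest-neighbour rule, and all function names are assumptions chosen for illustration, and the leave-one-out estimate shown is only one commonly used alternative (the synopsis does not say which alternatives the paper recommends). The class labels are unrelated to the features, so any rule built from these data should have a true error rate of about 50%.

```python
import random


def nearest_label(x, points, labels, skip=None):
    """Return the label of the training point closest to x.

    Distance is squared Euclidean. If `skip` is given, the training
    point with that index is ignored (used for leave-one-out).
    """
    best_label, best_dist = None, float("inf")
    for j, (p, lab) in enumerate(zip(points, labels)):
        if j == skip:
            continue
        d = sum((a - b) ** 2 for a, b in zip(x, p))
        if d < best_dist:
            best_label, best_dist = lab, d
    return best_label


def apparent_error_rate(points, labels):
    """Error rate when the rule is tested on the sample used to build it."""
    wrong = sum(
        nearest_label(x, points, labels) != y
        for x, y in zip(points, labels)
    )
    return wrong / len(points)


def leave_one_out_error_rate(points, labels):
    """Error rate when each case is classified by a rule built without it."""
    wrong = sum(
        nearest_label(x, points, labels, skip=i) != y
        for i, (x, y) in enumerate(zip(points, labels))
    )
    return wrong / len(points)


def make_noise_data(n=200, dims=5, seed=0):
    """Hypothetical data: Gaussian features, labels independent of them."""
    rng = random.Random(seed)
    points = [[rng.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(n)]
    labels = [i % 2 for i in range(n)]  # two balanced classes, pure noise
    return points, labels


points, labels = make_noise_data()
print("apparent error rate:     ", apparent_error_rate(points, labels))
print("leave-one-out error rate:", leave_one_out_error_rate(points, labels))
```

With a nearest-neighbour rule every case is its own nearest neighbour, so the apparent error rate is zero even though the features carry no information about class. The leave-one-out figure, which never tests a case on a rule that has seen it, should instead fall near the 50% expected for uninformative data.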

Type: Brief Communications
Information: Psychological Medicine, Volume 13, Issue 1, February 1983, pp. 201-203

DOI: https://doi.org/10.1017/S0033291700050212
Copyright: Copyright © Cambridge University Press 1983
