Ordering the suggestions of a spellchecker without using context

Published online by Cambridge University Press: 01 April 2009

ROGER MITTON
Affiliation:
School of Computer Science and Information Systems, Birkbeck, University of London, London WC1E 7HX, UK e-mail: R.Mitton@dcs.bbk.ac.uk

Abstract

Having located a misspelling, a spellchecker generally offers some suggestions for the intended word. Even without using context, a spellchecker can draw on various types of information in ordering its suggestions. A series of experiments is described, beginning with a basic corrector that implements a well-known algorithm for reversing single simple errors, and making successive enhancements to take account of substring matches, pronunciation, known error patterns, syllable structure and word frequency. The improvement in the ordering produced by each enhancement is measured on a large corpus of misspellings. The final version is tested on other corpora against a widely used commercial spellchecker and a research prototype.
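
As an illustration of the starting point the abstract describes, the following is a minimal sketch of a corrector that reverses a single simple error (one omitted, inserted, wrong or transposed letter) and then orders the surviving candidates by word frequency. It is not the author's implementation: the function names `single_error_candidates` and `order_suggestions`, the lowercase a-z alphabet, and the frequency counts in the usage example are all assumptions made for illustration, and the other enhancements named in the abstract (substring matches, pronunciation, known error patterns, syllable structure) are not modelled.

```python
import string


def single_error_candidates(misspelling, dictionary):
    """Return dictionary words reachable by undoing one simple error.

    Hypothetical helper: assumes lowercase a-z words only.
    """
    letters = string.ascii_lowercase
    edits = set()
    for i in range(len(misspelling) + 1):
        left, right = misspelling[:i], misspelling[i:]
        if right:
            # Undo an inserted letter by deleting it.
            edits.add(left + right[1:])
            # Undo a wrong letter by substituting another.
            for c in letters:
                edits.add(left + c + right[1:])
        if len(right) > 1:
            # Undo a transposition of two adjacent letters.
            edits.add(left + right[1] + right[0] + right[2:])
        # Undo an omitted letter by inserting one.
        for c in letters:
            edits.add(left + c + right)
    edits.discard(misspelling)
    return edits & set(dictionary)


def order_suggestions(misspelling, frequencies):
    """Order candidates by descending frequency, ties broken alphabetically.

    `frequencies` maps each dictionary word to an (assumed) corpus count.
    """
    candidates = single_error_candidates(misspelling, frequencies)
    return sorted(candidates, key=lambda w: (-frequencies[w], w))


if __name__ == "__main__":
    # Invented counts, for illustration only.
    freq = {"the": 1000, "he": 500, "they": 300, "then": 200, "hat": 50, "tea": 20}
    print(order_suggestions("teh", freq))
```

With these invented counts, the misspelling "teh" has two single-error candidates, "the" (a transposition) and "tea" (a wrong letter), and frequency alone places "the" first. A corrector of this kind cannot suggest anything for misspellings that contain more than one error.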

Type
Papers
Copyright
Copyright © Cambridge University Press 2008
