A general feature space for automatic verb classification

ERIC JOANIS; SUZANNE STEVENSON; DAVID JAMES

doi:10.1017/S135132490600444X

Published online by Cambridge University Press: 01 July 2008

ERIC JOANIS*, SUZANNE STEVENSON and DAVID JAMES: Affiliation:
Department of Computer Science, University of Toronto, 6 King's College Road, Toronto, Ontario, Canada, M5S 3H5
e-mail: joanis@cs.toronto.edu, suzanne@cs.toronto.edu, james@cs.toronto.edu
*Current affiliation: Interactive Language Technologies Group, Institute for Information Technology, National Research Council Canada, A1330-101 St-Jean-Bosco Street, Gatineau, Quebec, Canada J8Y 3G5.

Abstract

Lexical semantic classes of verbs play an important role in structuring complex predicate information in a lexicon, thereby avoiding redundancy and enabling generalizations across semantically similar verbs with respect to their usage. Such classes, however, require many person-years of expert effort to create manually, and methods are needed for automatically assigning verbs to appropriate classes. In this work, we develop and evaluate a feature space to support the automatic assignment of verbs into a well-known lexical semantic classification that is frequently used in natural language processing. The feature space is general – applicable to any class distinctions within the target classification; broad – tapping into a variety of semantic features of the classes; and inexpensive – requiring no more than a POS tagger and chunker. We perform experiments using support vector machines (SVMs) with the proposed feature space, demonstrating a reduction in error rate ranging from 48% to 88% over a chance baseline accuracy, across classification tasks of varying difficulty. In particular, we attain performance comparable to or better than that of feature sets manually selected for the particular tasks. Our results show that the approach is generally applicable, and reduces the need for resource-intensive linguistic analysis for each new classification task. We also perform a wide range of experiments to determine the most informative features in the feature space, finding that simple, easily extractable features suffice for good verb classification performance.
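The "reduction in error rate over a chance baseline" reported in the abstract can be read as a relative measure: the share of the baseline's errors that the classifier eliminates. The sketch below illustrates that arithmetic under two assumptions that the abstract does not state: that chance accuracy is 1/n for an n-way task with equally sized classes, and that the reduction is computed relative to the baseline error rate. The function names and the example figures are hypothetical and are not taken from the paper.

```python
def chance_baseline(n_classes: int) -> float:
    """Chance accuracy for an n-way task, assuming equally sized classes."""
    if n_classes < 2:
        raise ValueError("a classification task needs at least two classes")
    return 1.0 / n_classes


def error_rate_reduction(accuracy: float, baseline_accuracy: float) -> float:
    """Fraction of the baseline's errors that the classifier eliminates."""
    if not 0.0 <= baseline_accuracy < 1.0:
        raise ValueError("baseline accuracy must be in [0, 1)")
    baseline_error = 1.0 - baseline_accuracy
    error = 1.0 - accuracy
    return (baseline_error - error) / baseline_error


if __name__ == "__main__":
    # Hypothetical example: a 2-way task on which a classifier reaches 80% accuracy.
    baseline = chance_baseline(2)
    reduction = error_rate_reduction(0.80, baseline)
    print(f"baseline accuracy: {baseline:.2f}, error-rate reduction: {reduction:.0%}")
```

Under this reading, the same absolute accuracy corresponds to a different reduction on tasks with more classes, since the chance baseline is lower; this is one reason a relative measure is convenient for comparing tasks of varying difficulty.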

Type: Papers
Information: Natural Language Engineering, Volume 14, Issue 3, July 2008, pp. 337–367

DOI: https://doi.org/10.1017/S135132490600444X
Copyright: © Cambridge University Press 2006
