Subjectivity detection in spoken and written conversations

Published online by Cambridge University Press: 09 December 2010

GABRIEL MURRAY
Affiliation:
Department of Computer Science, 201-2366 Main Mall, University of British Columbia, Vancouver, Canada V6T 1Z4. Email: gabrielm@cs.ubc.ca
GIUSEPPE CARENINI
Affiliation:
Department of Computer Science, 201-2366 Main Mall, University of British Columbia, Vancouver, Canada V6T 1Z4. Email: carenini@cs.ubc.ca

Abstract

In this work we investigate four subjectivity and polarity tasks on spoken and written conversations. We implement and compare several pattern-based subjectivity detection approaches, including a novel technique wherein subjective patterns are learned from both labeled and unlabeled data, using n-gram word sequences with varying levels of lexical instantiation. We compare the use of these learned patterns with an alternative approach of using a very large set of raw pattern features. We also investigate how these pattern-based approaches can be supplemented and improved with features relating to conversation structure. Experimenting with meeting speech and email threads, we find that our novel systems incorporating varying instantiation patterns and conversation features outperform state-of-the-art systems despite having no recourse to domain-specific features such as prosodic cues and email headers. In some cases, such as when working with noisy speech recognizer output, a small set of well-motivated conversation features performs as well as a very large set of raw patterns.
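
To make the idea of n-gram patterns with "varying levels of lexical instantiation" concrete, here is a minimal sketch. It assumes that each slot of an n-gram can be filled either by the word itself or by a more abstract label, taken here to be a part-of-speech tag; the abstract does not state which abstraction is used, so that choice is an assumption. The function name `varying_instantiation_ngrams` and the hand-tagged example are hypothetical and serve only as illustration.

```python
from itertools import product


def varying_instantiation_ngrams(tagged_tokens, n=3):
    """Return all patterns for each n-gram in a tagged token sequence.

    tagged_tokens: list of (word, tag) pairs. The tags are an assumed
    abstraction level (part-of-speech labels), not confirmed by the abstract.
    Each slot of an n-gram is filled with either the word (index 0) or its
    tag (index 1), giving 2**n patterns per n-gram.
    """
    patterns = []
    for start in range(len(tagged_tokens) - n + 1):
        window = tagged_tokens[start:start + n]
        # Enumerate every word/tag choice for the n slots of this window.
        for choice in product((0, 1), repeat=n):
            pattern = " ".join(pair[c] for pair, c in zip(window, choice))
            patterns.append(pattern)
    return patterns


# Hand-tagged toy input (hypothetical example, not taken from the corpora).
example = [("really", "RB"), ("great", "JJ"), ("idea", "NN")]
patterns = varying_instantiation_ngrams(example, n=3)
for p in patterns:
    print(p)
```

A single trigram yields eight patterns, ranging from the fully lexical "really great idea" to the fully abstract "RB JJ NN". In a setup like the one the abstract describes, such patterns would then be scored or selected using labeled and unlabeled data; that step is not sketched here.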

Type
Articles
Copyright
Copyright © Cambridge University Press 2010
