Natural Language Engineering

Subjectivity detection in spoken and written conversations

GABRIEL MURRAY (a1) and GIUSEPPE CARENINI (a1)

(a1) Department of Computer Science, 201-2366 Main Mall, University of British Columbia, Vancouver, Canada V6T 1Z4. Emails: gabrielm@cs.ubc.ca, carenini@cs.ubc.ca

Abstract

In this work we investigate four subjectivity and polarity tasks on spoken and written conversations. We implement and compare several pattern-based subjectivity detection approaches, including a novel technique wherein subjective patterns are learned from both labeled and unlabeled data, using n-gram word sequences with varying levels of lexical instantiation. We compare the use of these learned patterns with an alternative approach of using a very large set of raw pattern features. We also investigate how these pattern-based approaches can be supplemented and improved with features relating to conversation structure. Experimenting with meeting speech and email threads, we find that our novel systems incorporating varying instantiation patterns and conversation features outperform state-of-the-art systems despite having no recourse to domain-specific features such as prosodic cues and email headers. In some cases, such as when working with noisy speech recognizer output, a small set of well-motivated conversation features performs as well as a very large set of raw patterns.

(Received February 16 2010)

(Revised June 18 2010)

(Accepted August 23 2010)

(Online publication December 09 2010)