
Predicting word choice in affective text

Published online by Cambridge University Press:  22 May 2015

M. GARDINER
Affiliation:
Macquarie University, North Ryde, NSW 2109, Australia e-mail: mark.dras@mq.edu.au
M. DRAS
Affiliation:
Macquarie University, North Ryde, NSW 2109, Australia e-mail: mark.dras@mq.edu.au

Abstract

Choosing the best word or phrase for a given context from among candidate near-synonyms, such as slim and skinny, is a difficult language generation problem. In this paper, we describe approaches to an instance of this problem, the lexical gap problem, with a particular focus on affect and subjectivity; to do this we draw upon techniques from the sentiment and subjectivity analysis fields. We present a supervised approach to the problem, beginning with a unigram model that solidly outperforms the baseline, with a 6.8% increase in accuracy. The results to some extent confirm those from related problems: feature presence outperforms feature frequency, and immediate context features generally outperform wider context features. Somewhat surprisingly, however, the latter does not always hold, and not necessarily where intuition might first suggest; an analysis of the cases where document-level models perform better suggests that, in our corpus, broader features related to the ‘tone’ of the document could be useful, including document sentiment, document author, and a distance metric for weighting the wider lexical context of the gap itself. Incorporating these, our best model achieves a 10.1% increase in accuracy, corresponding to a 38% reduction in errors. Moreover, our models improve accuracy not only on affective word choice but also on non-affective word choice.
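As a rough illustration of the kind of supervised setup the abstract describes, the sketch below trains a classifier to choose a near-synonym for a lexical gap using binary (presence-based) unigram features of the surrounding context. The toy data, the scikit-learn pipeline, and the linear SVM learner are assumptions chosen for illustration; they are not the system or corpus evaluated in the paper.

```python
# Minimal sketch (illustrative only): supervised near-synonym choice for a
# lexical gap, using presence-based unigram features of the gap's context.
# The training examples and learner below are hypothetical, not the authors'.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical training data: each example is a sentence with the target
# near-synonym removed (the "gap"), labelled with the word that filled it.
contexts = [
    "the model looked ___ in the new dress",
    "the stray cat was ___ and clearly underfed",
]
labels = ["slim", "skinny"]  # candidate near-synonyms

# binary=True records feature *presence* rather than frequency, the stronger
# choice according to the results reported in the abstract.
model = make_pipeline(CountVectorizer(binary=True), LinearSVC())
model.fit(contexts, labels)

print(model.predict(["she stayed ___ by exercising every day"]))
```

In the same spirit, document-level signals such as the overall sentiment of the document or its author could be appended as extra features alongside the contextual unigrams.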

Type: Articles
Copyright © Cambridge University Press 2015

Footnotes

The authors would like to thank the anonymous reviewers of the article, and to acknowledge the support of ARC Discovery grant DP0558852.
