
Automatic annotation of context and speech acts for dialogue corpora

Published online by Cambridge University Press: 01 July 2009

KALLIRROI GEORGILA
Affiliation:
Institute for Creative Technologies, University of Southern California, 13274 Fiji Way, Marina del Rey, CA 90292, USA e-mail: kgeorgila@ict.usc.edu
OLIVER LEMON
Affiliation:
School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh, EH8 9AB, UK e-mail: olemon@inf.ed.ac.uk
JAMES HENDERSON
Affiliation:
Department of Computer Science, University of Geneva, Battelle bâtiment A, 7 route de Drize, 1227 Carouge, Switzerland e-mail: james.henderson@cui.unige.ch
JOHANNA D. MOORE
Affiliation:
School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh, EH8 9AB, UK e-mail: j.moore@ed.ac.uk

Abstract

Richly annotated dialogue corpora are essential for new research directions in statistical learning approaches to dialogue management, context-sensitive interpretation, and context-sensitive speech recognition. In particular, large dialogue corpora annotated with contextual information and speech acts are urgently required. We explore how existing dialogue corpora (usually consisting of utterance transcriptions) can be automatically processed to yield new corpora where dialogue context and speech acts are accurately represented. We present a conceptual and computational framework for generating such corpora. As an example, we present and evaluate an automatic annotation system which builds ‘Information State Update’ (ISU) representations of dialogue context for the Communicator (2000 and 2001) corpora of human–machine dialogues (2,331 dialogues). The purposes of this annotation are to generate corpora for reinforcement learning of dialogue policies, for building user simulations, for evaluating different dialogue strategies against a baseline, and for training models for context-dependent interpretation and speech recognition. The automatic annotation system parses system and user utterances into speech acts and builds up sequences of dialogue context representations using an ISU dialogue manager. We present the architecture of the automatic annotation system and a detailed example to illustrate how the system components interact to produce the annotations. We also evaluate the annotations, with respect to the task completion metrics of the original corpus and in comparison to hand-annotated data and annotations produced by a baseline automatic system. The automatic annotations perform well and largely outperform the baseline automatic annotations in all measures. The resulting annotated corpus has been used to train high-quality user simulations and to learn successful dialogue strategies. The final corpus will be made publicly available.
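
To make the pipeline described in the abstract concrete, the following is a minimal Python sketch of an Information State Update (ISU) style annotation loop: each utterance is parsed into speech acts, and each act updates a running dialogue-context representation, with one context snapshot recorded per turn. All names here (SpeechAct, InformationState, parse_speech_acts, annotate_dialogue) and the toy slot-filling logic are illustrative assumptions for exposition only; they are not the authors' implementation, which uses an ISU dialogue manager and full speech-act parsers over the Communicator data.

# Minimal illustrative sketch of an ISU-style annotation loop
# (hypothetical names; not the system described in the paper).
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SpeechAct:
    speaker: str        # "system" or "user"
    act_type: str       # e.g. "provide_info", "request_info"
    task: str           # e.g. "dest_city", "depart_date"
    value: str = ""     # slot value, if any


@dataclass
class InformationState:
    """Running dialogue-context representation: filled slots plus act history."""
    filled_slots: Dict[str, str] = field(default_factory=dict)
    history: List[SpeechAct] = field(default_factory=list)

    def update(self, act: SpeechAct) -> None:
        # Apply one speech act to the context. A full ISU rule set would also
        # track grounding, confirmation status, speaker turns, etc.
        if act.act_type == "provide_info" and act.value:
            self.filled_slots[act.task] = act.value
        self.history.append(act)


def parse_speech_acts(speaker: str, utterance: str) -> List[SpeechAct]:
    # Toy parser standing in for the real speech-act parsers: one hard-coded
    # pattern keeps the sketch runnable.
    if "fly to" in utterance:
        city = utterance.split("fly to", 1)[1].strip(" .")
        return [SpeechAct(speaker, "provide_info", "dest_city", city)]
    return [SpeechAct(speaker, "other", "none")]


def annotate_dialogue(turns: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Return one context snapshot per turn, as an annotated corpus might store."""
    state = InformationState()
    snapshots = []
    for speaker, utterance in turns:
        for act in parse_speech_acts(speaker, utterance):
            state.update(act)
        snapshots.append(dict(state.filled_slots))
    return snapshots


if __name__ == "__main__":
    dialogue = [("system", "Where would you like to go?"),
                ("user", "I want to fly to Boston.")]
    print(annotate_dialogue(dialogue))  # [{}, {'dest_city': 'Boston'}]

In the system evaluated in the paper, the update rules and context representations are considerably richer than the filled-slot dictionary above; the sketch only illustrates the parse-then-update structure of the annotation process.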

Type
Papers
Copyright
Copyright © Cambridge University Press 2009

References

Andreani, G., Di Fabbrizio, G., Gilbert, M., Gillick, D., Hakkani-Tür, D., and Lemon, O. 2006. Let's DiSCoH: collecting an annotated open corpus with dialogue acts and reward signals for natural language helpdesks. In Proceedings of the IEEE/ACL Workshop on Spoken Language Technology, Aruba, pp. 218–21.
Bos, J., Klein, E., Lemon, O., and Oka, T. 2003. DIPPER: description and formalisation of an Information-State Update dialogue system architecture. In Proceedings of the 4th SIGdial Workshop on Discourse and Dialogue, Sapporo, Japan, pp. 115–24.
Cheyer, A., and Martin, D. 2001. The open agent architecture. Journal of Autonomous Agents and Multi-Agent Systems 4 (1/2): 143–8.
Clark, H. H., and Brennan, S. E. 1991. Grounding in communication. In Resnick, L., Levine, J., and Teasely, S. (eds.), Perspectives on Socially Shared Cognition, pp. 127–49. American Psychological Association.
Core, M. G., Moore, J. D., and Zinn, C. 2003. The role of initiative in tutorial dialogue. In Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Budapest, Hungary, pp. 67–74.
Frampton, M., and Lemon, O. 2006. Learning more effective dialogue strategies using limited dialogue move features. In Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics (ACL), Sydney, Australia, pp. 185–92.
Gabsdil, M., and Lemon, O. 2004. Combining acoustic and pragmatic features to predict recognition performance in spoken dialogue systems. In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL), Barcelona, Spain, pp. 344–51.
Georgila, K., Henderson, J., and Lemon, O. 2005a. Learning user simulations for Information State Update dialogue systems. In Proceedings of the 9th European Conference on Speech Communication and Technology (INTERSPEECH–EUROSPEECH), Lisbon, Portugal, pp. 893–6.
Georgila, K., Henderson, J., and Lemon, O. 2006. User simulation for spoken dialogue systems: learning and evaluation. In Proceedings of the 9th International Conference on Spoken Language Processing (INTERSPEECH–ICSLP), Pittsburgh, PA, pp. 1065–68.
Georgila, K., Lemon, O., and Henderson, J. 2005b. Automatic annotation of COMMUNICATOR dialogue data for learning dialogue strategies and user simulations. In Proceedings of the 9th Workshop on the Semantics and Pragmatics of Dialogue (SEMDIAL: DIALOR), Nancy, France, pp. 61–8.
Georgila, K., Wolters, M., Karaiskos, V., Kronenthal, M., Logie, R., Mayo, N., Moore, J. D., and Watson, M. 2008a. A fully annotated corpus for studying the effect of cognitive ageing on users' interactions with spoken dialogue systems. In Proceedings of the International Conference on Language Resources and Evaluation (LREC), Marrakech, Morocco, pp. 938–44.
Georgila, K., Wolters, M., and Moore, J. D. 2008b. Simulating the behaviour of older versus younger users when interacting with spoken dialogue systems. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL–HLT), Columbus, OH, pp. 49–52.
Ginzburg, J. 1996. Dynamics and semantics of dialogue. In Seligman, J., and Westerstahl, D. (eds.), Logic, Language, and Computation, Vol. 1. CSLI Publications, Stanford, CA.
Grosz, B. J., and Sidner, C. L. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics 12 (3): 175–204.
Henderson, J., Lemon, O., and Georgila, K. 2005. Hybrid reinforcement/supervised learning for dialogue policies from COMMUNICATOR data. In Proceedings of the 4th Workshop on Knowledge and Reasoning in Practical Dialogue Systems, International Joint Conference on Artificial Intelligence (IJCAI), Edinburgh, UK, pp. 68–75.
Henderson, J., Lemon, O., and Georgila, K. 2008. Hybrid reinforcement/supervised learning of dialogue policies from fixed datasets. Computational Linguistics 34 (4): 487–511.
Keizer, S., and op den Akker, R. 2007. Dialogue act recognition under uncertainty using Bayesian networks. Journal of Natural Language Engineering 13 (4): 287–316.
Kipp, M. 1998. The neural path to dialogue acts. In Proceedings of the 13th European Conference on Artificial Intelligence (ECAI), Brighton, UK, pp. 175–9.
Larsson, S., and Traum, D. 2000. Information state and dialogue management in the TRINDI Dialogue Move Engine Toolkit. Journal of Natural Language Engineering 6 (3–4): 323–40.
Lemon, O., Georgila, K., and Henderson, J. 2006a. Evaluating effectiveness and portability of reinforcement learned dialogue strategies with real users: the TALK TownInfo evaluation. In Proceedings of the IEEE/ACL Workshop on Spoken Language Technology, Aruba, pp. 178–81.
Lemon, O., Georgila, K., Henderson, J., and Stuttle, M. 2006b. An ISU dialogue system exhibiting reinforcement learning of dialogue policies: generic slot-filling in the TALK in-car system. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Trento, Italy, pp. 119–22.
Lemon, O., and Gruenstein, A. 2003. Multithreaded context for robust conversational interfaces: context-sensitive speech recognition and interpretation of corrective fragments. ACM Transactions on Computer–Human Interaction (ACM TOCHI) 11 (3): 241–67.
Lesch, S., Kleinbauer, T., and Alexandersson, J. 2005. A new metric for the evaluation of dialog act classification. In Proceedings of the 9th Workshop on the Semantics and Pragmatics of Dialogue (SEMDIAL: DIALOR), Nancy, France, pp. 143–6.
Levin, E., Pieraccini, R., and Eckert, W. 2000. A stochastic model of human–machine interaction for learning dialog strategies. IEEE Transactions on Speech and Audio Processing 8 (1): 11–23.
Litman, D., and Forbes-Riley, K. 2006. Correlations between dialogue acts and learning in spoken tutoring dialogues. Journal of Natural Language Engineering 12 (2): 161–76.
Poesio, M., Cooper, R., Larsson, S., Matheson, C., and Traum, D. 1999. Annotating conversations for information state update. In Proceedings of the 3rd Workshop on the Semantics and Pragmatics of Dialogue (SEMDIAL: AMSTELOGUE), Amsterdam, Netherlands.
Reithinger, N., and Klesen, M. 1997. Dialogue act classification using language models. In Proceedings of the 5th European Conference on Speech Communication and Technology (EUROSPEECH), Rhodes, Greece, pp. 2235–8.
Reithinger, N., and Maier, E. 1995. Utilizing statistical dialogue act processing in VERBMOBIL. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (ACL), Cambridge, MA, pp. 116–21.
Ries, K. 1999. HMM and neural network based speech act detection. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Phoenix, AZ, pp. 497–500.
Rieser, V., Kruijff-Korbayová, I., and Lemon, O. 2005a. A corpus collection and annotation framework for learning multimodal clarification strategies. In Proceedings of the 6th SIGdial Workshop on Discourse and Dialogue, Lisbon, Portugal, pp. 97–106.
Rieser, V., Kruijff-Korbayová, I., and Lemon, O. 2005b. A framework for learning multimodal clarification strategies. In Proceedings of the 7th International Conference on Multimodal Interfaces (ICMI), Trento, Italy.
Rieser, V., and Lemon, O. 2006. Using machine learning to explore human multimodal clarification strategies. In Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics (ACL), Sydney, Australia, pp. 659–66.
Rieser, V., and Lemon, O. 2008. Learning effective multimodal dialogue strategies from Wizard-of-Oz data: bootstrapping and evaluation. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL–HLT), Columbus, OH, pp. 638–46.
Samuel, K., Carberry, S., and Vijay-Shanker, K. 1998. Dialogue act tagging with transformation-based learning. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics (ACL–COLING), Montreal, Quebec, Canada, pp. 1150–6.
Schatzmann, J., Georgila, K., and Young, S. 2005a. Quantitative evaluation of user simulation techniques for spoken dialogue systems. In Proceedings of the 6th SIGdial Workshop on Discourse and Dialogue, Lisbon, Portugal, pp. 45–54.
Schatzmann, J., Stuttle, M. N., Weilhammer, K., and Young, S. 2005b. Effects of the user model on simulation-based learning of dialogue strategies. In Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, San Juan, Puerto Rico, pp. 220–5.
Schatzmann, J., Thomson, B., and Young, S. 2007. Statistical user simulation with a hidden agenda. In Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue, Antwerp, Belgium, pp. 273–82.
Schatzmann, J., Weilhammer, K., Stuttle, N., and Young, S. 2006. A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies. Knowledge Engineering Review 21 (2): 97–126.
Scheffler, K., and Young, S. 2001. Corpus-based dialogue simulation for automatic strategy learning and evaluation. In Proceedings of the Workshop on Adaptation in Dialogue Systems, North American Chapter of the Association for Computational Linguistics (NAACL), Pittsburgh, PA, pp. 64–70.
Searle, J. 1969. Speech Acts: An Essay in the Philosophy of Language. Cambridge, UK: Cambridge University Press.
Singh, S., Kearns, M., Litman, D., and Walker, M. 1999. Reinforcement learning for spoken dialogue systems. Advances in Neural Information Processing Systems 12: 956–62.
Stolcke, A., Coccaro, N., Bates, R., Taylor, P., Ess-Dykema, C. V., Ries, K., Shriberg, E., Jurafsky, D., Martin, R., and Meteer, M. 2000. Dialogue act modelling for automatic tagging and recognition of conversational speech. Computational Linguistics 26 (3): 339–74.
Traum, D. 1994. A computational theory of grounding in natural language conversation. PhD Thesis, Department of Computer Science, University of Rochester.
Traum, D. 2000. Twenty questions for dialogue act taxonomies. Journal of Semantics 17 (1): 7–30.
Traum, D. R., and Allen, J. 1994. Discourse obligations in dialogue processing. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, NM, pp. 1–8.
Walker, M., Aberdeen, J., Boland, J., Bratt, E., Garofolo, J., Hirschman, L., Le, A., Lee, S., Narayanan, S., Papineni, K., Pellom, B., Polifroni, J., Potamianos, A., Prabhu, P., Rudnicky, A., Sanders, G., Seneff, S., Stallard, D., and Whittaker, S. 2001a. DARPA Communicator dialog travel planning systems: the June 2000 data collection. In Proceedings of the 7th European Conference on Speech Communication and Technology (EUROSPEECH), Aalborg, Denmark, pp. 1371–4.
Walker, M. A., Fromer, J. C., and Narayanan, S. 1998. Learning optimal dialogue strategies: a case study of a spoken dialogue agent for email. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics (ACL–COLING), Montreal, Quebec, Canada, pp. 1345–51.
Walker, M. A., Kamm, C. A., and Litman, D. J. 2000. Towards developing general models of usability with PARADISE. Journal of Natural Language Engineering (Special Issue on Best Practice in Spoken Dialogue Systems) 6 (3): 363–77.
Walker, M. A., Passonneau, R. J., and Boland, J. E. 2001b. Quantitative and qualitative evaluation of DARPA Communicator spoken dialogue systems. In Proceedings of the 39th Meeting of the Association for Computational Linguistics (ACL), Toulouse, France, pp. 515–22.
Walker, M., and Passonneau, R. 2001. DATE: a dialogue act tagging scheme for evaluation of spoken dialogue systems. In Proceedings of the Human Language Technologies Conference, San Diego, CA, pp. 1–8.
Walker, M., Rudnicky, A., Aberdeen, J., Bratt, E., Garofolo, J., Hastie, H., Le, A., Pellom, B., Potamianos, A., Passonneau, R., Prasad, R., Roukos, S., Sanders, G., Seneff, S., Stallard, D., and Whittaker, S. 2002. DARPA Communicator evaluation: progress from 2000 to 2001. In Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP), Denver, CO, pp. 273–6.
Webb, N., Hepple, M., and Wilks, Y. 2005. Dialogue act classification based on intra-utterance features. In Proceedings of the AAAI Workshop on Spoken Language Understanding, Pittsburgh, PA.
Williams, J. D., and Young, S. 2005. Scaling up POMDPs for dialog management: the “summary POMDP” method. In Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, San Juan, Puerto Rico, pp. 177–82.
Young, S. 2000. Probabilistic methods in spoken dialogue systems. Philosophical Transactions of the Royal Society (Series A) 358 (1769): 1389–402.
Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, T., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., and Woodland, P. 2005. The HTK Book (for HTK version 3.3). Cambridge University Engineering Department.
Zinn, C., Moore, J. D., and Core, M. G. 2002. A 3-tier planning architecture for managing tutorial dialogue. In Proceedings of the Intelligent Tutoring Systems, Sixth International Conference (ITS), Biarritz, France, pp. 574–84. Lecture Notes in Computer Science, vol. 2363. Berlin: Springer.