
Learning effective and engaging strategies for advice-giving human-machine dialogue

Published online by Cambridge University Press: 01 July 2009

MARTIJN SPITTERS
Affiliation:
Textkernel BV, Nieuwendammerkade 28/a17, 1022 AB Amsterdam, NL e-mail: spitters@textkernel.nl
MARCO DE BONI
Affiliation:
Unilever Corporate Research, Colworth House, Sharnbrook, Bedford, UK MK44 1LQ e-mail: marco.de-boni@unilever.com
JAKUB ZAVREL
Affiliation:
Textkernel BV, Nieuwendammerkade 28/a17, 1022 AB Amsterdam, NL
REMKO BONNEMA
Affiliation:
Textkernel BV, Nieuwendammerkade 28/a17, 1022 AB Amsterdam, NL

Abstract

We describe a system that automatically learns effective and engaging dialogue strategies, generated from a library of dialogue content, using reinforcement learning from user feedback. Besides the more usual clarification and verification components of dialogue, this library contains various social elements such as greetings, apologies, small talk, relational questions and jokes. We tested the method with an experimental dialogue system that encourages the take-up of exercise, and show that the learned dialogue policy performs as well as one built by human experts for this system.
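
The abstract does not specify the learning algorithm, state representation or reward signal, so the following is only a minimal sketch of the general idea of learning a dialogue policy from user feedback, not the authors' method. It assumes a tabular, epsilon-greedy learner with a Monte Carlo update, a state that is simply the previous system action, and a simulated user rating in place of real feedback. The class `FeedbackPolicyLearner`, the action names in `LIBRARY`, and the function `simulated_rating` are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical content library: these action names are illustrative only.
LIBRARY = ["greeting", "small_talk", "joke", "verification", "advice"]


class FeedbackPolicyLearner:
    """Tabular, epsilon-greedy learner updated from end-of-dialogue ratings."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = defaultdict(float)  # (state, action) -> mean observed reward
        self.count = defaultdict(int)    # (state, action) -> number of updates

    def choose(self, state):
        # Explore with probability epsilon, otherwise take the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value[(state, a)])

    def update(self, episode, reward):
        # Monte Carlo update: every (state, action) pair in the dialogue is
        # credited with the single rating received at the end of the dialogue.
        for state, action in episode:
            key = (state, action)
            self.count[key] += 1
            self.value[key] += (reward - self.value[key]) / self.count[key]


def simulated_rating(episode):
    # Stand-in for real user feedback (an assumption): this toy user is
    # satisfied only when greeted first and then given advice.
    actions = [action for _, action in episode]
    return 1.0 if actions == ["greeting", "advice"] else 0.0


def run_dialogue(learner, turns=2):
    state, episode = "start", []
    for _ in range(turns):
        action = learner.choose(state)
        episode.append((state, action))
        state = action  # simplifying assumption: state = previous system action
    return episode


def train(n_dialogues=2000, seed=0):
    learner = FeedbackPolicyLearner(LIBRARY, epsilon=0.2, seed=seed)
    for _ in range(n_dialogues):
        episode = run_dialogue(learner)
        learner.update(episode, simulated_rating(episode))
    return learner
```

Calling `train()` and then setting `learner.epsilon = 0.0` gives a greedy policy that can be queried with `learner.choose(state)`. A real system would replace `simulated_rating` with feedback collected from users and would likely need a richer state than the previous action alone.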

Type
Papers
Copyright
Copyright © Cambridge University Press 2008

