Backward and trigger-based language models for statistical machine translation

DEYI XIONG; MIN ZHANG

doi:10.1017/S1351324913000168

Published online by Cambridge University Press: 24 July 2013

DEYI XIONG and MIN ZHANG
Affiliation: School of Computer Science and Technology, Soochow University, Suzhou 215006, China
Email: dyxiong@suda.edu.cn, minzhang@suda.edu.cn

Abstract

The language model is one of the most important knowledge sources for statistical machine translation. In this article, we present two extensions to standard n-gram language models in statistical machine translation: a backward language model that augments the conventional forward language model, and a mutual information trigger model that captures long-distance dependencies beyond the scope of standard n-gram language models. We introduce algorithms to integrate the two proposed models into two kinds of state-of-the-art phrase-based decoders. Our experimental results on Chinese-, Spanish-, and Vietnamese-to-English translation show that both models significantly improve translation quality in terms of BLEU and METEOR over a competitive baseline.
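As a rough illustration of the two ideas named in the abstract, the sketch below estimates a backward bigram model (each word predicted from the word that follows it) and a pointwise mutual information score for ordered word pairs within a sentence. It is not the authors' implementation: the function names, the unsmoothed maximum-likelihood estimates, the sentence-level co-occurrence counting, and the toy corpus are all assumptions made for this example, and decoder integration is not covered.

```python
import math
from collections import Counter


def train_backward_bigram(sentences):
    """Return prob(word, next_word), an unsmoothed estimate of P(word | next_word).

    Assumption for this sketch: the model is trained by reversing each
    sentence and counting ordinary bigrams over the reversed sequence.
    """
    context, pair = Counter(), Counter()
    for sent in sentences:
        rev = ["</s>"] + list(reversed(sent)) + ["<s>"]
        for prev, cur in zip(rev, rev[1:]):
            context[prev] += 1
            pair[(prev, cur)] += 1

    def prob(word, next_word):
        if context[next_word] == 0:
            return 0.0
        return pair[(next_word, word)] / context[next_word]

    return prob


def train_pmi_triggers(sentences):
    """Return pmi(x, y) for ordered pairs where x occurs before y in a sentence.

    Assumption for this sketch: probabilities are relative frequencies over
    sentences, and each pair is counted at most once per sentence.
    """
    n = len(sentences)
    word_count, pair_count = Counter(), Counter()
    for sent in sentences:
        word_count.update(set(sent))
        pairs = {
            (sent[i], sent[j])
            for i in range(len(sent))
            for j in range(i + 1, len(sent))
        }
        pair_count.update(pairs)

    def pmi(x, y):
        if pair_count[(x, y)] == 0:
            return float("-inf")  # pair never observed in this order
        return math.log(pair_count[(x, y)] * n / (word_count[x] * word_count[y]))

    return pmi


# Toy corpus, invented for illustration only.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
backward_prob = train_backward_bigram(corpus)
trigger_pmi = train_pmi_triggers(corpus)
print(backward_prob("cat", "sat"))  # P(cat | following word is sat)
print(trigger_pmi("the", "sat"))    # "the" ... "sat" can be any distance apart
```

Unlike the bigram estimate, the trigger score does not require the two words to be adjacent, which is how a pairwise model of this kind can reach dependencies outside an n-gram window.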

Type: Articles
Information: Natural Language Engineering, Volume 21, Issue 2, March 2015, pp. 201–226

DOI: https://doi.org/10.1017/S1351324913000168
Copyright: © Cambridge University Press 2013
