Corpus-based learning of Cantonese for Mandarin speakers

Published online by Cambridge University Press:  17 March 2016

Tak-Sum Wong
Affiliation:
City University of Hong Kong, Hong Kong (email: tswong-c@my.cityu.edu.hk)
John S. Y. Lee
Affiliation:
City University of Hong Kong, Hong Kong (email: jsylee@cityu.edu.hk)

Abstract

This article presents the first study on using a parallel corpus to teach Cantonese, the variety of Chinese spoken in Hong Kong. We evaluated this approach with Mandarin-speaking undergraduate students at the beginner level. Drawing on their knowledge of Mandarin, a closely related language, the students studied Cantonese with authentic material in a Cantonese-Mandarin parallel corpus transcribed from television programs. They were given a list of Mandarin words that each yield a range of possible Cantonese translations, depending on the linguistic context. Using the sentence and word alignments in the parallel corpus, the students independently searched for example sentences to discover these translation equivalents. Experimental results showed that, in both the short and the long term, this data-driven learning approach helped students improve their knowledge of Cantonese vocabulary. These results suggest the potential of applying parallel corpora even at the beginner level for other L1-L2 pairs of closely related languages.

Type
Regular papers
Copyright
Copyright © European Association for Computer Assisted Language Learning 2016 
