Advancing research in second language writing through computational tools and machine learning techniques: A research agenda

Published online by Cambridge University Press: 22 February 2013

Scott A. Crossley*
Affiliation:
Georgia State University, Atlanta, GA, USA. scrossley@gsu.edu

Abstract

This paper provides an agenda for replication studies focusing on second language (L2) writing and the use of natural language processing (NLP) tools and machine learning algorithms. Specifically, it introduces a range of the available NLP tools and machine learning algorithms and demonstrates how these could be used to replicate seminal studies in L2 writing that concentrate on longitudinal writing development, predicting essay quality, examining differences between L1 and L2 writers, the effects of writing topics, and the effects of writing tasks. The paper concludes with implications for the recommended replication studies in the field of L2 writing and the advantages of using NLP tools and machine learning algorithms.

Type
Thinking Allowed
Copyright
Copyright © Cambridge University Press 2013
