
Mixed Methods Research in Language Testing and Assessment

Published online by Cambridge University Press: 22 December 2014

Abstract

As an alternative paradigm, mixed methods research (MMR) endorses pluralism as a way to understand the complex nature of the social world from multiple perspectives and through multiple methodological lenses, each of which offers partial, yet valuable, insights. This mixing is not limited to methods alone but extends across the entire inquiry process. Researchers in language testing and assessment (LTA) are increasingly turning to MMR to understand the complexities of language acquisition and of interaction among various language users, and to expand opportunities for investigating validity claims beyond the three traditional facets of construct, content, and criterion validity. We use current conceptualizations of validity as a guiding framework to review 32 empirical MMR studies published in LTA since 2007. Our systematic review encompassed several areas of focus, including the rationale for the use of MMR, evidence of collaboration, and synergistic effects. The analyses revealed several key trends: (a) triangulation and complementarity were the most prevalent uses of MMR in LTA; (b) the studies took place predominantly in higher education contexts with adult immigrant or university populations; (c) aspects of writing assessment were the most frequent focus of the studies (compared to other language modalities); (d) many of the studies explicitly addressed facets of validity, and others had significant implications for expanding notions of validity in LTA; (e) most of the studies avoided mixing at the data analysis stage by distinguishing data types and reporting results separately; and (f) integration occurred primarily at the discussion stage. We contend that LTA should embrace MMR through creative designs and integrative analytic strategies to seek new insights into the complexities and contexts of language testing and assessment.

Type: Research Article
Copyright © Cambridge University Press 2014

References

ANNOTATED BIBLIOGRAPHY

Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative, and mixed methodologies. New York, NY: Oxford University Press.

This volume engages the reader thoroughly with each major phase of research, addressing key issues, data collection, data analysis, and reporting. Each of these sections contains discussions specific to qualitative, quantitative, and mixed methods research, giving the reader integral information about MMR along with the knowledge needed to compare and contrast the benefits and challenges of each method in specific contexts. The volume also includes a historical overview of the research paradigms and a summary of their strengths and weaknesses, together with Dörnyei's own paradigmatic stance, which culminates in his call for wider use of MMR.

Morgan, D. L. (2007). Paradigms lost and pragmatism regained. Journal of Mixed Methods Research, 1, 48–76.

Morgan introduces this article by asking a pointed question: “To what extent is combining qualitative and quantitative methods simply about how we use methods, as opposed to raising basic issues about the nature of research methodology in the social sciences?” (p. 48). To respond, this thought-provoking article takes the reader through a historical review of developments in research methodology in the social sciences, using the concept of paradigms as a conceptual framework. Morgan presents four conceptualizations of paradigms: paradigms as worldviews, as epistemological stances, as shared beliefs in a research field, and as model examples. Through this discussion of paradigms, Morgan raises the methodological issues that arise when researchers engage in MMR. His central thesis is that the “metaphysical” paradigm in the social sciences has been exhausted and should be replaced by a “pragmatic” paradigmatic approach (p. 55).

Tashakkori, A., & Teddlie, C. (Eds.). (2010). Handbook of mixed methods in social and behavioral research (2nd ed.). Thousand Oaks, CA: Sage.

This handbook, weighing in at close to 900 pages, is one of the most comprehensive resources for understanding and applying MMR in any field including LTA. The handbook is divided into three sections: (a) conceptual issues (philosophical, theoretical, sociopolitical); (b) issues regarding methods and methodology; and (c) contemporary applications of MMR. While not every chapter has a direct connection to education or assessment, there is much to be learned from the applications of MMR across these fields. The authors address emergent issues and present examples that provide learning opportunities about the different stances and purposes of MMR.

REFERENCES

Abbuhl, R., & Mackey, A. (2008). Second language acquisition research methods. In King, K. A. & Hornberger, N. H. (Eds.), Encyclopedia of language and education: Research methods in language and education (Vol. 10, pp. 1–13). Dordrecht, The Netherlands: Springer.
Anthony, J. J. (2009). Classroom computer experiences that stick: Two lenses on reflective timed essays. Assessing Writing, 14, 194–205.
Bachman, L. F. (2005). Building and supporting a case for test use. Language Assessment Quarterly, 2, 1–34.
Baker, B. A. (2010). Playing with the stakes: A consideration of an aspect of the social context of a gatekeeping writing assessment. Assessing Writing, 15, 133–153.
Baker, B. A. (2012). Individual differences in rater decision-making style: An exploratory mixed-methods study. Language Assessment Quarterly, 9, 225–248.
Barkaoui, K. (2010). Do ESL essay raters’ evaluation criteria change with experience? A mixed-methods, cross-sectional study. TESOL Quarterly, 44, 31–57.
Barkaoui, K. (2011). Effects of marking method and rater experience on ESL essay scores and rater performance. Assessment in Education: Principles, Policy, & Practice, 18, 279–293.
Bryman, A. (2007). Barriers to integrating quantitative and qualitative research. Journal of Mixed Methods Research, 1, 8–22.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.
Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Skokie, IL: Rand McNally.
Caracelli, V. J., & Greene, J. C. (1993). Data analysis strategies for mixed-method evaluation designs. Educational Evaluation and Policy Analysis, 15, 195–207.
Colby-Kelly, C., & Turner, C. E. (2007). AFL research in the L2 classroom and evidence of usefulness: Taking formative assessment to the next level. The Canadian Modern Language Review/La Revue Canadienne des Langues Vivantes, 64, 9–37.
Cook, T. D. (1985). Postpositivist critical multiplism. In Shortland, R. L. & Mark, M. M. (Eds.), Social science and social policy (pp. 129–146). Newbury Park, CA: Sage.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Skokie, IL: Rand McNally.
Creswell, J. W. (2007). Qualitative inquiry and research design: Choosing among five approaches (2nd ed.). Thousand Oaks, CA: Sage.
Creswell, J. W., & Plano Clark, V. L. (2007). Designing and conducting mixed methods research. Thousand Oaks, CA: Sage.
Cronbach, L. J. (1988). Five perspectives on validity argument. In Wainer, H. & Braun, H. (Eds.), Test validity (pp. 3–17). Hillsdale, NJ: Erlbaum.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.
Datta, L. (1997). Multimethod evaluations: Using case studies together with other methods. In Chelimsky, E. & Shadish, W. R. (Eds.), Evaluation for the 21st century: A handbook (pp. 344–359). Thousand Oaks, CA: Sage.
Davies, A. (Ed.). (1997). Ethics in language testing [special issue]. Language Testing, 14.
Dellinger, A. B., & Leech, N. L. (2007). Toward a unified validation framework in mixed methods research. Journal of Mixed Methods Research, 1, 309–332.
Denzin, N. K. (1978). The research act: A theoretical introduction to sociological methods. New York, NY: Praeger.
Derwing, T. M., Munro, M. J., & Thomson, R. I. (2008). A longitudinal study of ESL learners’ fluency and comprehensibility development. Applied Linguistics, 29, 359–380.
DiPardo, A., Storms, B. A., & Selland, M. (2011). Seeing voices: Assessing writerly stance in the NWP analytic writing continuum. Assessing Writing, 16, 170–188.
Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative, and mixed methodologies. New York, NY: Oxford University Press.
Fulcher, G. (1996). Does thick description lead to smart tests? A data-based approach to rating scale construction. Language Testing, 13, 208–238.
Gage, N. (1989). The paradigm wars and their aftermath: A “historical” sketch of research on teaching since 1989. Educational Researcher, 18, 4–10.
Ghanbari, B., Barati, H., & Moinzadeh, A. (2012). Problematizing rating scales in EFL academic writing assessment: Voices from Iranian context. English Language Teaching, 5, 76–90.
Greene, J. C. (2007). Mixed methods in social inquiry. San Francisco, CA: Jossey-Bass.
Greene, J. C. (2008). Is mixed methods social inquiry a distinctive methodology? Journal of Mixed Methods Research, 2, 7–22.
Greene, J. C. (2011). The construct(ion) of validity as argument. In Chen, H. T., Donaldson, S. I., & Mark, M. M. (Eds.), Advancing validity in outcome evaluation: Theory and practice, New Directions for Evaluation (pp. 81–92). San Francisco, CA: Jossey-Bass.
Greene, J. C., & Caracelli, V. J. (1997). Defining and describing the paradigm issues in mixed-method evaluation. In Greene, J. C. & Caracelli, V. J. (Eds.), Advances in mixed-method evaluation: The challenges and benefits of integrating diverse paradigms (pp. 5–18). San Francisco, CA: Jossey-Bass.
Greene, J. C., Caracelli, V. J., & Graham, W. F. (1989). Toward a conceptual framework for mixed-method evaluation designs. Educational Evaluation and Policy Analysis, 11, 255–274.
Guilford, J. P. (1946). New standards for test evaluation. Educational and Psychological Measurement, 6, 427–439.
Guion, R. M. (1980). On trinitarian doctrines of validity. Professional Psychology, 11, 385–398.
Habermas, J. (1984). The theory of communicative action. Boston, MA: Beacon Press.
Hamid, M., Sussex, R., & Khan, A. (2009). Private tutoring in English for secondary school students in Bangladesh. TESOL Quarterly, 43, 281–308.
Harsch, C., & Martin, G. (2012). Adapting CEF-descriptors for rating purposes: Validation by a combined rater training and scale revision approach. Assessing Writing, 17, 228–250.
Hashemi, M. R. (2012). Reflections on mixing methods in applied linguistics research. Applied Linguistics, 33, 206–212.
Hesse-Biber, S., & Leavy, P. (2006). Analysis and interpretation of qualitative data. In Hesse-Biber, S. & Leavy, P. (Eds.), The practice of qualitative research (pp. 343–374). London, UK: Sage.
Hyland, T. A. (2009). Drawing a line in the sand: Identifying the borderzone between self and other in EL1 and EL2 citation practices. Assessing Writing, 14, 62–74.
Isaacs, T. (2008). Toward defining a valid assessment criterion of pronunciation proficiency in non-native English-speaking graduate students. The Canadian Modern Language Review/La Revue Canadienne des Langues Vivantes, 64, 555–580.
Isaacs, T., & Trofimovich, P. (2012). Deconstructing comprehensibility. Studies in Second Language Acquisition, 34, 475–505.
Jang, E. E. (2009). Cognitive diagnostic assessment of L2 reading comprehension ability: Validity arguments for applying Fusion Model to LanguEdge assessment. Language Testing, 26, 31–73.
Jang, E. E. (2013). Mixed methods research in SLA. In Robinson, P. (Ed.), The Routledge encyclopedia of SLA (pp. 429–431). New York, NY: Routledge.
Jang, E. E., & Roussos, L. (2009). Integrative analytic approach to detecting and interpreting L2 vocabulary DIF. International Journal of Testing, 9, 238–259.
Johnson, R. B., & Onwuegbuzie, A. J. (2004). Mixed methods research: A research paradigm whose time has come. Educational Researcher, 33, 14–26.
Johnson, R. B., Onwuegbuzie, A. J., & Turner, L. A. (2007). Toward a definition of mixed methods research. Journal of Mixed Methods Research, 1, 112–133.
Kane, M. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527–535.
Kane, M. (2002). Validating high stakes testing programs. Educational Measurement: Issues and Practice, 21, 31–41.
Kane, M. (2006). Validation. In Brennan, R. L. (Ed.), Educational Measurement (4th ed., pp. 17–64). Westport, CT: American Council on Education and Praeger.
Kim, Y. (2009). An investigation into native and non-native teachers’ judgments of oral English performance: A mixed methods approach. Language Testing, 26, 187–217.
Kim, Y., & Jang, E. E. (2009). Differential functioning of reading subskills on the OSSLT for L1 and ELL students: A multidimensionality model-based DBF/DIF approach. Language Learning, 59, 825–865.
Kirkhart, K. E. (2005). Through a cultural lens: Reflections on validity and theory in evaluation. In Hood, S., Hopson, R. K., & Frierson, H. T. (Eds.), The role of culture and cultural context: A mandate for inclusion, the discovery of truth, and understanding in evaluative theory and practice (pp. 21–39). Greenwich, CT: Information Age.
Knoch, U. (2009). Diagnostic assessment of writing: A comparison of two rating scales. Language Testing, 26, 275–304.
Lee, Y., & Greene, J. (2007). The predictive validity of an ESL placement test. Journal of Mixed Methods Research, 1, 366–389.
Lee, H., & Winke, P. (2013). The differences among three-, four-, and five-option-item formats in the context of a high-stakes English-language listening test. Language Testing, 30, 99–123.
Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Newbury Park, CA: Sage.
Magnan, S. S. (2006). From the editor: The MLJ turns 90 in a digital age. The Modern Language Journal, 90, 1–5.
Mathison, S. (1988). Why triangulate? Educational Researcher, 17, 13–17.
Maxwell, J. A. (1992). Understanding and validity in qualitative research. Harvard Educational Review, 62, 279–300.
Maxwell, J. A., & Loomis, D. M. (2003). Mixed methods design: An alternative approach. In Tashakkori, A. & Teddlie, C. (Eds.), Sage handbook of mixed methods in social & behavioral research (pp. 241–271). Thousand Oaks, CA: Sage.
Mertens, D. M. (2007). Transformative paradigm. Journal of Mixed Methods Research, 1, 212–225.
Mertens, D. M. (2010). Philosophy in mixed methods teaching: The transformative paradigm as illustration. International Journal of Multiple Research Approaches, 4, 9–18.
Messick, S. (1989). Validity. In Linn, R. L. (Ed.), Educational Measurement (3rd ed., pp. 13–103). New York, NY: Macmillan.
Miles, M. B., & Huberman, A. M. (1984). Qualitative data analysis: A sourcebook of new methods. Thousand Oaks, CA: Sage.
Mislevy, R. (1995). Test theory and language-learning assessment. Language Testing, 12, 341–369.
Morgan, D. L. (2007). Paradigms lost and pragmatism regained. Journal of Mixed Methods Research, 1, 48–76.
Moss, P. A. (1994). Can there be validity without reliability? Educational Researcher, 23, 5–12.
Nakatsuhara, F. (2011). Effects of test-taker characteristics and the number of participants in group oral tests. Language Testing, 28, 483–508.
Nastasi, B. K., Hitchcock, J. H., & Brown, L. M. (2010). An inclusive framework for conceptualizing mixed methods design typologies: Moving toward fully integrated synergistic research models. In Tashakkori, A. & Teddlie, C. (Eds.), Sage handbook of mixed methods in social and behavioral research (pp. 305–338). Thousand Oaks, CA: Sage.
Onwuegbuzie, A. J., & Johnson, R. B. (2006). The validity issue in mixed research. Research in the Schools, 13, 48–63.
Patton, M. Q. (2002). Qualitative research and evaluation methods (3rd ed.). Thousand Oaks, CA: Sage.
Perrone, M. (2011). The effect of classroom-based assessment and language processing on the second language acquisition of EFL students. Journal of Adult Education, 40, 20–33.
Plakans, L., & Gebril, A. (2012). A close investigation into source use in integrated second language writing tasks. Assessing Writing, 17, 18–34.
Poonpon, K. (2011, April). Synergy of mixed method approach to development of ESL speaking rating scale. Paper presented at the Doing Research in Applied Linguistics conference, Bangkok, Thailand.
Reichardt, C. S., & Cook, T. D. (1979). Beyond qualitative versus quantitative methods. In Cook, T. D. & Reichardt, C. S. (Eds.), Qualitative and quantitative methods in evaluation research (pp. 7–32). Thousand Oaks, CA: Sage.
Reiss, A. J. (1968). Stuff and nonsense about social surveys and participant observation. In Becker, H. S., Geer, B., Riesman, D., & Weiss, R. S. (Eds.), Institutions and the person: Papers in memory of Everett C. Hughes. Chicago, IL: Aldine.
Schegloff, E. A. (1993). Reflections on quantification in the study of conversation. Research on Language and Social Interaction, 26, 99–128.
Schwandt, T. A., & Jang, E. E. (2004). Linking validity and ethics in language testing: Insights from the hermeneutic turn in social science. Studies in Educational Evaluation, 30, 265–280.
Shepard, L. A. (1992). What policy makers who mandate tests should know about the new psychology of intellectual ability and learning. In Gifford, B. R. & O’Connor, M. C. (Eds.), Changing assessment: Alternative views of aptitude, achievement and instruction (pp. 301–328). Boston, MA: Kluwer.
Tashakkori, A., & Teddlie, C. (Eds.). (2003). Handbook of mixed methods in social and behavioral research. Thousand Oaks, CA: Sage.
Tashakkori, A., & Teddlie, C. (Eds.). (2010). Handbook of mixed methods in social and behavioral research (2nd ed.). Thousand Oaks, CA: Sage.
Teddlie, C., & Tashakkori, A. (2003). Major issues and controversies in the use of mixed methods in the social and behavioral sciences. In Tashakkori, A. & Teddlie, C. (Eds.), Handbook of mixed methods in social and behavioral research (pp. 3–51). Thousand Oaks, CA: Sage.
Teddlie, C., & Tashakkori, A. (2006). A general typology of research designs featuring mixed methods. Research in the Schools, 13, 12–28.
Turner, C. E. (2009). Examining washback in second language education contexts: A high stakes provincial exam and the teacher factor in classroom practice in Quebec secondary schools. International Journal on Pedagogies and Learning, 5, 103–123.
Uchikoshi, Y., & Maniates, H. (2010). How does bilingual instruction enhance English achievement? A mixed-methods study of Cantonese-speaking and Spanish-speaking bilingual classrooms. Bilingual Research Journal, 33, 364–385.
Weir, C. J. (2005). Language testing and validation. Basingstoke, UK: Palgrave Macmillan.
Winke, P. (2011). Evaluating the validity of a high-stakes ESL test: Why teachers’ perceptions matter. TESOL Quarterly, 45, 628–660.
Wiseman, C. S. (2012). Rater effects: Ego engagement in rater decision-making. Assessing Writing, 17, 150–173.
Xi, X. (2010). Aspects of performance on line graph description tasks: Influenced by graph familiarity and different task features. Language Testing, 27, 73–100.
Yin, M., Sims, J., & Cothran, D. (2012). Scratching where they itch: Evaluation of feedback on a diagnostic English grammar test for Taiwanese university students. Language Assessment Quarterly, 9, 78–104.
Yu, G. (2010). Effects of presentation mode and computer familiarity on summarization of extended texts. Language Assessment Quarterly, 7, 119–136.