The Impact of Implementing AI-Generated Audio Transcriptions on English Majors’ Cognitive Load

Authors

Nguyen Ngoc Ly, Nguyen Thi Phuoc Loc

DOI:

https://doi.org/10.54855/acoj.251617

Keywords:

artificial intelligence (AI), listening comprehension (Liscomp), cognitive load (CogL), AI-generated audio transcriptions (AIGATs), participants’ transcriptions (PTs)

Abstract

AI has become an everyday personal tutoring system for English majors, particularly those seeking to transform how they practice listening. AI-generated audio transcriptions (AIGATs) can improve learners' listening comprehension (Cao, Yamashita, & Ishida, 2018); however, there are concerns that AI transcriptions that lack thoroughness may negatively affect learners' cognition. This study investigates how AIGATs affect the cognitive load (CogL) of 86 English majors and examines their perspectives on the application of AIGATs. The participants were divided into two groups: one practiced listening with AIGATs, and the other with their own transcriptions (PTs). Data were collected through CogL scales administered to the AIGATs and PTs groups. A semi-structured interview was conducted to examine the AIGATs group's perspectives. The findings revealed statistically significant differences between the two groups' CogL test scores. Using AIGATs helped students lower their CogL test scores and enhanced their cognitive abilities in handling task complexity. This research provides insights for integrating AI into language education, helping educators create more efficient language instruction methods for English learners in the digital age.

Author Biographies

Nguyen Ngoc Ly, Ho Chi Minh City Open University, Ho Chi Minh City, Vietnam

Nguyen Ngoc Ly is a full-time lecturer at the Foreign Language Faculty of Ho Chi Minh City Open University. She earned her master’s degree at Ho Chi Minh City Open University, Vietnam. Her main research interests revolve around the areas of extensive listening & reading, technology in education, and teaching writing.

Nguyen Thi Phuoc Loc, Ho Chi Minh City Open University, Ho Chi Minh City, Vietnam

Nguyen Thi Phuoc Loc received her Master of Arts in Linguistics from Benedictine University, USA. She is currently a full-time lecturer at the Foreign Language Faculty of Ho Chi Minh City Open University. Her areas of research interest are learning variables, academic writing, and teaching methodology.

References

Bashori, M., van Hout, R., Strik, H., & Cucchiarini, C. (2021). Effects of ASR-based websites on EFL learners’ vocabulary, speaking anxiety, and language enjoyment. System, 99. https://doi.org/10.1016/j.system.2021.102496

Bashori, M., van Hout, R., Strik, H., & Cucchiarini, C. (2022). ‘Look, I can speak correctly’: Learning vocabulary and pronunciation through websites equipped with automatic speech recognition technology. Computer Assisted Language Learning, 1-29. https://doi.org/10.1080/09588221.2022.2080230

Becker, B. (2017). Artificial intelligence in education: what is it, where is it now, where is it going. In Ireland’s Yearbook of Education, 2018, pp. 42-46. Retrieved from http://educationmatters.ie/download-irelands-yearbook-education/

Benson, P. (2013). Teaching and researching: Autonomy in language learning. Routledge. https://doi.org/10.4324/9781315833767

Cai, Y. (2023). The application of automatic speech recognition technology in English as foreign language pronunciation learning. In Proceedings of the 2nd International Conference on Humanities, Wisdom Education and Service Management, pp. 356-360. https://doi.org/10.2991/978-2-38476-068-8_44

Cao, X., Yamashita, N., & Ishida, T. (2018). Effects of automated transcripts on non-native speakers' listening comprehension. IEICE Transactions on Information and Systems, 101(3), 730-739. https://doi.org/10.1587/transinf.2017EDP7255

Chan, W. S., Kruger, J.-L., & Doherty, S. (2019). Comparing the impact of automatically generated and corrected subtitles on cognitive load and learning in a first- and second-language educational context. Linguistica Antverpiensia, New Series: Themes in Translation Studies, 18, 237–272. https://doi.org/10.52034/lanstts.v18i0.506

Chang, C. C., Warden, C. A., Liang, C., & Chou, P. N. (2018). Performance, cognitive load, and behaviour of technology‐assisted English listening learning: From CALL to MALL. Journal of Computer Assisted Learning, 34(2), 105-114. https://doi.org/10.1111/jcal.12218

Chastain, K. (1971). The development of modern language skills: Theory to practice. Philadelphia: Curriculum Development Center.

Chastain, K. (1988). Developing second language skills (3rd ed.). U.S.A.: Harcourt Brace Jovanovich, Inc.

Creswell, J. W., & Creswell, J. D. (2017). Research design: Qualitative, quantitative, and mixed methods approaches. Sage Publications.

Creswell, J. W., & Guetterman, T. C. (2019). Educational research: Planning, conducting, and evaluating quantitative and qualitative research (6th ed.). Pearson.

de Jong, T. (2010). Cognitive load theory, educational research, and instructional design: Some food for thought. Instructional Science, 38(2), 105–134. https://doi.org/10.1007/s11251-009-9110-0

Debue, N., & van de Leemput, C. (2014). What does germane load mean? An empirical contribution to the cognitive load theory. Frontiers in Psychology, 5(1099), 1–12. https://doi.org/10.3389/fpsyg.2014.01099

Elimat, A. K., & AbuSeileek, A. F. (2014). Automatic speech recognition technology as an effective means for teaching pronunciation. JALT CALL Journal, 10(1), 21-47. Retrieved from https://files.eric.ed.gov/fulltext/EJ1107929.pdf https://doi.org/10.29140/jaltcall.v10n1.j166

Ferris, D. (1998). Students' views of academic aural/oral skills: A comparative needs analysis. TESOL Quarterly, 32(2), 289-318. https://doi.org/10.2307/3587585

Fraenkel, J. R., & Wallen, N. E. (2009). How to design and evaluate research in education. New York, NY: McGraw-Hill.

Gilakjani, A. P., & Ahmadi, M. R. (2011). A study of factors affecting EFL learners' English listening comprehension and the strategies for improvement. Journal of Language Teaching and Research, 2(5), 977-988. https://doi.org/10.4304/jltr.2.5.977-988

Gruetzemacher, R., & Whittlestone, J. (2019). Defining and unpacking transformative AI. arXiv:1912.00747. https://doi.org/10.48550/arXiv.1912.00747

Hamouda, A. (2013). An investigation of listening comprehension problems encountered by Saudi students in the EL listening classroom. International Journal of Academic Research in Progressive Education and Development, 2(2), 113-15. Retrieved from https://pdfs.semanticscholar.org/b811/984d6e30068a62a970b1f75b2e701e0b159e.pdf

Holmes, W., Bialik, M., & Fadel, C. (2019). Artificial intelligence in education: Promises and implications for teaching and learning. Boston: Center for Curriculum Redesign.

Holmes, W., & Tuomi, I. (2022). State of the art and practice in AI in education. European Journal of Education, 57(4), 542-570. https://doi.org/10.1111/ejed.12533

Huang, J., Saleh, S., & Liu, Y. (2021). A review on artificial intelligence in education. Academic Journal of Interdisciplinary Studies, 10(3), 206-217. https://doi.org/10.36941/ajis-2021-0077

Jiang, M. Y.-C., Jong, M. S.-Y., Lau, W. W.-F., Chai, C.-S., & Wu, N. (2021). Using automatic speech recognition technology to enhance EFL learners’ oral language complexity in a flipped classroom. Australasian Journal of Educational Technology, 37(2), 110–131. https://doi.org/10.14742/ajet.6798

Jiang, M. Y.-C., Jong, M. S.-Y., Wu, N., Shen, B., Chai, C.-S., Lau, W. W.-F., & Huang, B. (2022). Integrating automatic speech recognition technology into vocabulary learning in a flipped English class for Chinese college students. Frontiers in Psychology, 13, 902429. https://doi.org/10.3389/fpsyg.2022.902429

Medha (2022, May 25). What is AI transcription? Everything you need to know. Fireflies. https://fireflies.ai/blog/what-is-ai-transcription

Leppink, J., Paas, F., van Gog, T., van der Vleuten, C. P. M., & van Merriënboer, J. J. G. (2014). Effects of pairs of problems and examples on task performance and different types of cognitive load. Learning and Instruction, 30(2), 32–42. https://doi.org/10.1016/j.learninstruc.2013.12.001

Liu, M. (2023). Exploring the application of artificial intelligence in foreign language teaching: Challenges and future development. SHS Web of Conferences, 168. https://doi.org/10.1051/shsconf/202316803025

Luckin, R., Holmes, W., Griffiths, M. & Forcier, L. B. (2016). Intelligence unleashed: An argument for AI in education. Retrieved from https://www.researchgate.net/publication/299561597

Malakul, S., & Park, I. (2023). The effects of using an auto-subtitle system in educational videos to facilitate learning for secondary school students: Learning comprehension, cognitive load, and satisfaction. Smart Learning Environments, 10(4). https://doi.org/10.1186/s40561-023-00224-2

Mendelsohn, D. J. (1994). Learning to listen: A strategy-based approach for the second language learner. San Diego: Dominie Press.

Mhlanga, D. (2021). Artificial intelligence in the industry 4.0, and its impact on poverty, innovation, infrastructure development, and the sustainable development goals: Lessons from emerging economies? Sustainability, 13(11), 1-16. https://doi.org/10.3390/su13115788

Mirzaei, M. S., Akita, Y., & Kawahara, T. (2014). Partial and synchronized captioning: A new tool for second language listening development. In S. Jager, L. Bradley, E. J. Meima, & S. Thouësny (Eds), CALL Design: Principles and practice; Proceedings of the 2014 EUROCALL Conference, Groningen, The Netherlands (pp. 230-236). Dublin: Research-publishing.net. https://doi.org/10.14705/rpnet.2014.000223

Morley, J. (1972). Improving aural comprehension. University of Michigan Press. https://doi.org/10.3998/mpub.9114

Ngo, K. T. (2024). The use of ChatGPT for vocabulary acquisition: A literature review. International Journal of AI in Language Education, 1(2), 1-17. https://doi.org/10.54855/ijaile.24121

Osada, N. (2004). Listening comprehension research: A brief review of the last thirty years. Dialogue, 3, 53-66. Retrieved from https://talk-waseda.net/dialogue/no03_2004/2004dialogue03_k4.pdf

Paas, F., & van Merriënboer, J. J. G. (1994). Instructional control of cognitive load in the training of complex cognitive tasks. Educational Psychology Review, 6(4), 351–371. https://doi.org/10.1007/BF02213420

Paas, F., Tuovinen, J. E., Tabbers, H., & van Gerven, P. W. M. (2003). Cognitive load measurement as a means to advance cognitive load theory. Educational Psychologist, 38(1), 63–71. https://doi.org/10.1207/S15326985EP3801_8

Pallant, J. (2007). SPSS survival manual: A step by step guide to data analysis using SPSS for Windows (3rd ed.). Open University Press.

Pan, Y., Jiang, D., Picheny, M., & Qin, Y. (2009). Effects of real-time transcription on non-native speaker's comprehension in computer-mediated communications. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 2353-2356). https://doi.org/10.1145/1518701.1519061

Pan, Y., Jiang, D., Yao, L., Picheny, M., & Qin, Y. (2010). Effects of automated transcription quality on non-native speakers’ comprehension in real-time computer-mediated communication. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1725–1734). https://doi.org/10.1145/1753326.1753584

Rampersad, G. (2020). Robot will take your job: Innovation for an era of artificial intelligence. Journal of Business Research, 116, 68-74. https://doi.org/10.1016/j.jbusres.2020.05.019

Rost, M. (2011). Teaching and Researching Listening (2nd ed.). Pearson Education Limited.

Rubin, J. (1994). A review of second language listening comprehension research. The Modern Language Journal, 78(2), 199-221. https://doi.org/10.1111/j.1540-4781.1994.tb02034.x

Sanders, T. J. M., & Gernsbacher, M. A. (2004). Accessibility in text and discourse processing: A special issue of discourse processes. Routledge. https://doi.org/10.1207/s15326950dp3702_1

Sun, W. (2023). The impact of automatic speech recognition technology on second language pronunciation and speaking skills of EFL learners: A mixed methods investigation. Frontiers in Psychology, 14, 1210187. https://doi.org/10.3389/fpsyg.2023.1210187

Swanson, C. H. (1996). Who is listening in the classroom? A research paradigm. Paper presented at the Annual Convention of the International Listening Association, Sacramento, CA.

Sweller, J. (2010). Element interactivity and intrinsic, extraneous, and germane cognitive load. Educational Psychology Review, 22(2), 123–138. https://doi.org/10.1007/s10648-010-9128-5

Sweller, J. (2011). Cognitive load theory. In J. P. Mestre & B. H. Ross (Series Eds.), The Psychology of Learning and Motivation: Vol. 55. Cognition in education (pp. 37–76). San Diego, CA: Academic Press. https://doi.org/10.1016/B978-0-12-387691-1.00002-8

Sweller, J., & Sweller, S. (2006). Natural information processing systems. Evolutionary Psychology, 4(1), 434–458. https://doi.org/10.1177/147470490600400135

Sweller, J., van Merriënboer, J. J. G., & Paas, F. (1998). Cognitive architecture and instructional design. Educational Psychology Review, 10(3), 251–296. https://doi.org/10.1023/A:1022193728205

Ngo, T. T.-N., Chen, H. H.-J., & Lai, K. K.-W. (2024). The effectiveness of automatic speech recognition in ESL/EFL pronunciation: A meta-analysis. ReCALL, 36(1), 4–21. https://doi.org/10.1017/S0958344023000113

VITAC (2024, Jan 12). All about AI transcription: Benefits, use cases, and limitations. https://vitac.com/all-about-ai-transcription-benefits-use-cases-and-limitations/

Vo, T. H. C., & Cao, T. M. H. (2022). Investigating the effects of mass media on learning listening skills. AsiaCALL Online Journal, 13(5), 45-67. https://doi.org/10.54855/acoj.221354

Wagner, E. (2004). A construct validation study of the extended listening sections of the ECPE and MELAB. Spaan Fellow Working Papers in Second or Foreign Language Assessment, 2, 1-26. Retrieved from https://www.semanticscholar.org/paper/A-Construct-Validation-Study-of-the-Extended-of-the-Wagner/05048468d9e3a62ccb3d0538b682ab4256c27550

Wilson, J. J. (2008). How to teach listening. London: Pearson.

Wolvin, A., & Coakley, C. (1991). A survey of the status of listening training in some Fortune 500 corporations. Communication Education. https://doi.org/10.1080/03634529109378836

Yang, H.-Y. (2014). Does multimedia support individual differences? - EFL learners’ listening comprehension and cognitive load. Australasian Journal of Educational Technology, 30(6). https://doi.org/10.14742/ajet.639

Published

27-01-2025

How to Cite

Nguyen, N. L., & Nguyen, T. P. L. (2025). The Impact of Implementing AI-Generated Audio Transcriptions on English Majors’ Cognitive Load. AsiaCALL Online Journal, 16(1), 140–158. https://doi.org/10.54855/acoj.251617
