The Impact of Implementing AI-Generated Audio Transcriptions on English Majors’ Cognitive Load
DOI:
https://doi.org/10.54855/acoj.251617
Keywords:
artificial intelligence (AI), listening comprehension (Liscomp), cognitive load (CogL), AI-generated audio transcriptions (AIGATs), participants’ transcriptions (PTs)
Abstract
AI has become an everyday personal tutoring tool for English majors, particularly those looking for new ways to practice listening. AI-generated audio transcriptions (AIGATs) can improve learners' listening comprehension (Cao, Yamashita, & Ishida, 2018); however, there are concerns that transcriptions lacking thoroughness may negatively affect learners' cognition. This study investigates how AIGATs affect the cognitive load (CogL) of 86 English majors and how these students view the use of AIGATs. The participants were divided into two groups: one completed the listening practice sessions with AIGATs, and the other with their own transcriptions (PTs). Data were collected through CogL scales administered to the AIGATs and PTs groups. A semi-structured interview was then conducted to examine the AIGATs group’s perspectives. The findings revealed statistically significant differences between the two groups’ CogL test scores. Using AIGATs helped students lower their CogL test scores and strengthened their cognitive ability to handle task complexity. This research provides insights for integrating AI into language education, helping educators create more efficient language instruction methods for English learners in the digital age.
References
Bashori, M., van Hout, R., Strik, H., & Cucchiarini, C. (2021). Effects of ASR-based websites on EFL learners’ vocabulary, speaking anxiety, and language enjoyment. System, 99. https://doi.org/10.1016/j.system.2021.102496
Bashori, M., van Hout, R., Strik, H., & Cucchiarini, C. (2022). ‘Look, I can speak correctly’: Learning vocabulary and pronunciation through websites equipped with automatic speech recognition technology. Computer Assisted Language Learning, 1-29. https://doi.org/10.1080/09588221.2022.2080230
Becker, B. (2017). Artificial intelligence in education: what is it, where is it now, where is it going. In Ireland’s Yearbook of Education, 2018, pp. 42-46. Retrieved from http://educationmatters.ie/download-irelands-yearbook-education/
Benson, P. (2013). Teaching and researching: Autonomy in language learning. Routledge. https://doi.org/10.4324/9781315833767
Cai, Y. (2023). The application of automatic speech recognition technology in English as foreign language pronunciation learning. In Proceedings of the 2nd International Conference on Humanities, Wisdom Education and Service Management, pp. 356-360. https://doi.org/10.2991/978-2-38476-068-8_44
Cao, X., Yamashita, N., & Ishida, T. (2018). Effects of automated transcripts on non-native speakers' listening comprehension. IEICE Transactions on Information and Systems, 101(3), 730-739. https://doi.org/10.1587/transinf.2017EDP7255
Chan, W. S., Kruger, J.-L., & Doherty, S. (2019). Comparing the impact of automatically generated and corrected subtitles on cognitive load and learning in a first- and second-language educational context. Linguistica Antverpiensia, New Series: Themes in Translation Studies, 18, 237–272. https://doi.org/10.52034/lanstts.v18i0.506
Chang, C. C., Warden, C. A., Liang, C., & Chou, P. N. (2018). Performance, cognitive load, and behaviour of technology‐assisted English listening learning: From CALL to MALL. Journal of Computer Assisted Learning, 34(2), 105-114. https://doi.org/10.1111/jcal.12218
Chastain, K. (1971). The development of modern language skills: Theory to practice. Philadelphia: Curriculum Development Center.
Chastain, K. (1988). Developing second language skills (3rd ed.). USA: Harcourt Brace Jovanovich, Inc.
Creswell, J. W., & Creswell, J. D. (2017). Research design: Qualitative, quantitative, and mixed methods approaches. Sage Publications.
Creswell, J. W., & Guetterman, T. C. (2019). Educational research: Planning, Conducting, and Evaluating Quantitative and Qualitative Research (6th ed.). Pearson.
de Jong, T. (2010). Cognitive load theory, educational research, and instructional design: Some food for thought. Instructional Science, 38(2), 105–134. https://doi.org/10.1007/s11251-009-9110-0
Debue, N., & van de Leemput, C. (2014). What does germane load mean? An empirical contribution to the cognitive load theory. Frontiers in Psychology, 5(1099), 1–12. https://doi.org/10.3389/fpsyg.2014.01099
Elimat, A. K., & AbuSeileek, A. F. (2014). Automatic speech recognition technology as an effective means for teaching pronunciation. JALT Call Journal, 10(1), 21-47. Retrieved from https://files.eric.ed.gov/fulltext/EJ1107929.pdf https://doi.org/10.29140/jaltcall.v10n1.j166
Ferris, D. (1998). Students' views of academic aural/oral skills: A comparative needs analysis. TESOL Quarterly, 32(2), 289-318. https://doi.org/10.2307/3587585
Fraenkel, J. R., & Wallen, N. E. (2009). How to design and evaluate research in education. New York, NY: McGraw-Hill.
Gilakjani, A. P., & Ahmadi, M. R. (2011). A study of factors affecting EFL learners' English listening comprehension and the strategies for improvement. Journal of Language Teaching and Research, 2(5), 977-988. https://doi.org/10.4304/jltr.2.5.977-988
Gruetzemacher, R., & Whittlestone, J. (2019). Defining and unpacking transformative AI. arXiv:1912.00747. https://doi.org/10.48550/arXiv.1912.00747
Hamouda, A. (2013). An investigation of listening comprehension problems encountered by Saudi students in the EL listening classroom. International Journal of Academic Research in Progressive Education and Development, 2(2), 113-15. Retrieved from https://pdfs.semanticscholar.org/b811/984d6e30068a62a970b1f75b2e701e0b159e.pdf
Holmes, W., Bialik, M., & Fadel, C. (2019). Artificial intelligence in education: Promises and implications for Teaching and Learning. Boston: Centre for Curriculum Redesign.
Holmes, W., & Tuomi, I. (2022). State of the art and practice in AI in education. European Journal of Education, 57(4), 542-570. https://doi.org/10.1111/ejed.12533
Huang, J., Saled, S., & Liu, Y. (2021). A review on Artificial Intelligence in education. Academic Journal of Interdisciplinary Studies, 10(3), 206-217. https://doi.org/10.36941/ajis-2021-0077
Jiang, M. Y.-C., Jong, M. S.-Y., Lau, W. W.-F., Chai, C.-S., & Wu, N. (2021). Using automatic speech recognition technology to enhance EFL learners’ oral language complexity in a flipped classroom. Australasian Journal of Educational Technology, 37(2), 110–131. https://doi.org/10.14742/ajet.6798
Jiang, M. Y.-C., Jong, M. S.-Y., Wu, N., Shen, B., Chai, C.-S., Lau, W. W.-F., & Huang, B. (2022). Integrating automatic speech recognition technology into vocabulary learning in a flipped English class for Chinese college students. Frontiers in Psychology, 13, 902429. https://doi.org/10.3389/fpsyg.2022.902429
Medha (2022, May 25). What is AI transcription? Everything you need to know. Fireflies. https://fireflies.ai/blog/what-is-ai-transcription
Leppink, J., Paas, F., van Gog, T., van der Vleuten, C. P. M., & van Merriënboer, J. J. G. (2014). Effects of pairs of problems and examples on task performance and different types of cognitive load. Learning and Instruction, 30(2), 32–42. https://doi.org/10.1016/j.learninstruc.2013.12.001
Liu, M. (2023). Exploring the application of artificial intelligence in foreign language teaching: Challenges and future development. SHS Web of Conferences, 168. https://doi.org/10.1051/shsconf/202316803025
Luckin, R., Holmes, W., Griffiths, M. & Forcier, L. B. (2016). Intelligence unleashed: An argument for AI in education. Retrieved from https://www.researchgate.net/publication/299561597
Malakul, S., & Park, I. (2023). The effects of using an auto-subtitle system in educational videos to facilitate learning for secondary school students: Learning comprehension, cognitive load, and satisfaction. Smart Learning Environments, 10(4). https://doi.org/10.1186/s40561-023-00224-2
Mendelsohn, D. J. (1994). Learning to listen: A strategy-based approach for the second language learner. San Diego: Dominie Press.
Mhlanga, D. (2021). Artificial intelligence in the industry 4.0, and its impact on poverty, innovation, infrastructure development, and the sustainable development goals: Lessons from emerging economies? Sustainability, 13(11), 1-16. https://doi.org/10.3390/su13115788
Mirzaei, M. S., Akita, Y., & Kawahara, T. (2014). Partial and synchronized captioning: A new tool for second language listening development. In S. Jager, L. Bradley, E. J. Meima, & S. Thouësny (Eds), CALL Design: Principles and practice; Proceedings of the 2014 EUROCALL Conference, Groningen, The Netherlands (pp. 230-236). Dublin: Research-publishing.net. https://doi.org/10.14705/rpnet.2014.000223
Morley, J. (1972). Improving aural comprehension. University of Michigan Press. https://doi.org/10.3998/mpub.9114
Ngo, K. T. (2024). The use of ChatGPT for vocabulary acquisition: A literature review. International Journal of AI in Language Education, 1(2), 1-17. https://doi.org/10.54855/ijaile.24121
Osada, N. (2004). Listening comprehension research: A brief review of the last thirty years. Dialogue, 3, 53-66. Retrieved from https://talk-waseda.net/dialogue/no03_2004/2004dialogue03_k4.pdf
Paas, F., & van Merriënboer, J. J. G. (1994). Instructional control of cognitive load in the training of complex cognitive tasks. Educational Psychology Review, 6(4), 351–371. https://doi.org/10.1007/BF02213420
Paas, F., Tuovinen, J. E., Tabbers, H., & van Gerven, P. W. M. (2003). Cognitive load measurement as a means to advance cognitive load theory. Educational Psychologist, 38(1), 63–71. https://doi.org/10.1207/s15326985ep3801_8
Pallant, J. (2007). SPSS survival manual: A step by step guide to data analysis using SPSS for Windows (3rd ed.). Open University Press.
Pan, Y., Jiang, D., Picheny, M., & Qin, Y. (2009). Effects of real-time transcription on non-native speaker's comprehension in computer-mediated communications. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 2353-2356). https://doi.org/10.1145/1518701.1519061
Pan, Y., Jiang, D., Yao, L., Picheny, M., & Qin, Y. (2010). Effects of automated transcription quality on non-native speakers’ comprehension in real-time computer-mediated communication. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1725–1734). https://doi.org/10.1145/1753326.1753584
Rampersad, G. (2020). Robot will take your job: Innovation for an era of artificial intelligence. Journal of Business Research, 116, 68-74. https://doi.org/10.1016/j.jbusres.2020.05.019
Rost, M. (2011). Teaching and Researching Listening (2nd ed.). Pearson Education Limited.
Rubin, J. (1994). A review of second language listening comprehension research. The Modern Language Journal, 78(2), 199-221. https://doi.org/10.1111/j.1540-4781.1994.tb02034.x
Sanders, T. J. M., & Gernsbacher, M. A. (2004). Accessibility in text and discourse processing: A special issue of discourse processes. Routledge. https://doi.org/10.1207/s15326950dp3702_1
Sun, W. (2023). The impact of automatic speech recognition technology on second language pronunciation and speaking skills of EFL learners: a mixed methods investigation. Frontiers in Psychology, 14, 1210187. https://doi.org/10.3389/fpsyg.2023.1210187
Swanson, C. H. (1996). Who is listening in the classroom? A research paradigm. Paper presented at the Annual Convention of the International Listening Association, Sacramento, CA.
Sweller, J. (2010). Element interactivity and intrinsic, extraneous, and germane cognitive load. Educational Psychology Review, 22(2), 123–138. https://doi.org/10.1007/s10648-010-9128-5
Sweller, J. (2011). Cognitive load theory. In J. P. Mestre & B. H. Ross (Series Eds.), The Psychology of Learning and Motivation: Vol. 55. Cognition in education (pp. 37–76). San Diego, CA: Academic Press. https://doi.org/10.1016/B978-0-12-387691-1.00002-8
Sweller, J., & Sweller, S. (2006). Natural information processing systems. Evolutionary Psychology, 4(1), 434–458. https://doi.org/10.1177/147470490600400135
Sweller, J., van Merriënboer, J. J. G., & Paas, F. (1998). Cognitive architecture and instructional design. Educational Psychology Review, 10(3), 251–296. https://doi.org/10.1023/A:1022193728205
Thi-Nhu Ngo, T., Hao-Jan Chen, H., & Kuo-Wei Lai, K. (2024). The effectiveness of automatic speech recognition in ESL/EFL pronunciation: A meta-analysis. ReCALL, 36(1), 4–21. https://doi.org/10.1017/S0958344023000113
VITAC (2024, Jan 12). All about AI transcription: Benefits, use cases, and limitations. https://vitac.com/all-about-ai-transcription-benefits-use-cases-and-limitations/
Vo, T. H. C., & Cao, T. M. H. (2022). Investigating the effects of mass media on learning listening skills. AsiaCALL Online Journal, 13(5), 45-67. https://doi.org/10.54855/acoj.221354
Wagner, E. (2004). A construct validation study of the extended listening sections of the ECPE and MELAB. Spaan Fellow Working Papers in Second or Foreign Language Assessment, 2, 1-26. Retrieved from https://www.semanticscholar.org/paper/A-Construct-Validation-Study-of-the-Extended-of-the-Wagner/05048468d9e3a62ccb3d0538b682ab4256c27550
Wilson, J. J. (2008). How to teach listening. London: Pearson.
Wolvin, A., & Coakley, C. (1991). A survey of the status of listening training in some fortune 500 corporations. Communication Education, USA. https://doi.org/10.1080/03634529109378836
Yang, H.-Y. (2014). Does multimedia support individual differences? - EFL learners’ listening comprehension and cognitive load. Australasian Journal of Educational Technology, 30(6). https://doi.org/10.14742/ajet.639
License
Copyright (c) 2025 Nguyen Ngoc Ly, Nguyen Thi Phuoc Loc
This work is licensed under a Creative Commons Attribution 4.0 International License.
License
Authors retain copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository, in a journal or publish it in a book), with an acknowledgment of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process.
Copyright
The copyright of all articles published in the acoj remains with the Authors, i.e. Authors retain full ownership of their article. Permitted third-party reuse of the open access articles is defined by the applicable Creative Commons (CC) end-user license which is accepted by the Authors upon submission of their paper. All articles in the acoj are published under the CC BY-NC 4.0 license, meaning that end users can freely share an article (i.e. copy and redistribute the material in any medium or format) and adapt it (i.e. remix, transform and build upon the material) on the condition that proper attribution is given (i.e. appropriate credit, a link to the applicable license and an indication if any changes were made; all in such a way that does not suggest that the licensor endorses the user or the use) and the material is only used for non-commercial purposes.