Using Gemini for Formative Assessment in English Academic Writing - Critical Insights into The AI Tool’s Efficacy

DOI:

https://doi.org/10.54855/acoj.2516117

Keywords:

AI-powered tools, consistency, essay assessment, rubrics, band descriptors

Abstract

The emergence of Artificial Intelligence (AI) has triggered revolutionary transformations in language teaching and learning. In academic writing, educational practitioners often wonder which of the many recently released AI-powered tools can support their learners’ self-study by providing reliable and relevant feedback. This paper explores the effectiveness of Gemini, a large language model (LLM) developed by Google AI, in providing rubric-aligned commentary on student essays. The study employed a mixed-methods approach in which quantitative data were collected from academic writing samples and qualitative data were coded from Gemini-assisted feedback. A critical analysis of Gemini’s comments on twenty students’ essays against the IELTS Writing Task 2 band descriptors shows that its feedback tends to be more consistent for task achievement and for coherence and cohesion when the rubric or band descriptors are included in the prompt. Within each criterion in the rubric, the initial indicators tend to be examined more adequately. In contrast, paragraphing, spelling, and punctuation are indicators that are neither consistently nor sufficiently commented on. These findings lay a foundation for language educators to evaluate the efficacy of LLM-assisted learning tools in academic writing education, paving the way for their proper application in classroom instruction.

Author Biographies

Nguyen Dinh Luat, Faculty of Foreign Languages, Industrial University of Ho Chi Minh City, Vietnam

Nguyen Dinh Luat is a lecturer at the Industrial University of Ho Chi Minh City, Vietnam. He has been teaching English macro skills, pronunciation, linguistics, and translation to a diverse range of learners. His research interests include technology applications in language teaching, language skill development, linguistics, and language testing.

Le Pham Thien Thu, Faculty of Foreign Languages, Industrial University of Ho Chi Minh City, Vietnam

Le Pham Thien Thu is a lecturer at the Industrial University of Ho Chi Minh City, Vietnam. She has more than 20 years of experience teaching a range of levels and areas, focusing on teaching methodology, testing and assessment, and the four macro skills. Her research interests include technology applications in language teaching, language skills development, and teaching methodology.

Le Thi Thuy, Faculty of Foreign Languages, Industrial University of Ho Chi Minh City, Vietnam

Le Thi Thuy is a lecturer at the Industrial University of Ho Chi Minh City, Vietnam. She has developed an interest in teaching English skills, reading, and writing to students at the tertiary level. She is passionate about researching technology integration in language instruction and learner autonomy. 

Published

26-05-2025

How to Cite

Nguyen, D. L., Le, P. T. T., & Le, T. T. (2025). Using Gemini for Formative Assessment in English Academic Writing - Critical Insights into The AI Tool’s Efficacy. AsiaCALL Online Journal, 16(1), 328–343. https://doi.org/10.54855/acoj.2516117
