Assessment-related Tasks in Higher Education – Generative Artificial Intelligence Can Be a Practitioner’s Best Friend
Introduction
 
Since its release, ChatGPT, a Generative Artificial Intelligence (GenAI) tool, has had a significant impact on education (Baidoo-Anu & Ansah, 2023; Kim & Adlof, 2024). However, many practitioners still approach its widespread use cautiously. Despite this, advocates of GenAI continue to assert that the technology, when employed ethically, responsibly, and within an agreed framework, offers broad utility, can aid practitioners in a range of tasks, and can even improve productivity (Liu et al., 2023; Meniado, 2023).
 
 
This article shares experiences and insights accumulated from December 2022 to December 2023. During this period, my colleagues and I explored using GenAI in general and specifically experimented with XIPU AI, Xi’an Jiaotong-Liverpool University’s proprietary GenAI tool launched in August 2023. Our focus was on its application to assessment-related tasks in teaching English for Academic Purposes (EAP). The examples and recommendations provided are transferable to other contexts, as they constitute integral elements of a robust Higher Education (HE) assessment process. Key considerations include a well-structured approach to assessing student learning against learning outcomes (LOs), the provision of learning opportunities and feedback mechanisms, and the engagement of multiple stakeholders to uphold educational quality.
 
 
Assessment task design
 
ChatGPT revealed that traditional EAP assessment tasks such as essays, reports, and summaries were no longer reliable for assessing language-related LOs. GenAI tools, built on Large Language Models (LLMs), can complete these tasks to a passable standard (at least 40%), given their ability to produce believable output in correct English. Consequently, we faced early deliberations on whether these assessment task types should be discarded. Rather than dismissing them entirely, my colleagues and I chose to redesign and modify future assessment tasks to better align with the post-ChatGPT reality. We began subjecting assessment task drafts to GenAI to gauge how ‘GenAI-proof’ they were. This process enabled us to pinpoint flaws and refine various aspects, recognising that while GenAI pursues objectivity by adhering to formal rules of logic, it falls short when asked to offer genuine personal perspectives or critical insight.
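
Probing drafts in this way can also be scripted when there are many tasks to check. Below is a minimal sketch of such a check, assuming a hypothetical OpenAI-compatible endpoint; the base URL, model name, and key are placeholders rather than XIPU AI’s actual service details, and in practice we worked through the tool’s chat interface. The idea is simply to submit a draft brief and inspect whether the tool can produce a passable response:

    from openai import OpenAI

    # Placeholder endpoint and credentials -- XIPU AI's real programmatic
    # interface is not assumed here; any OpenAI-compatible service would do.
    client = OpenAI(base_url="https://llm.example.edu/v1", api_key="YOUR_KEY")

    def attempt_task(brief: str) -> str:
        """Ask the model to attempt a draft assessment brief so that markers
        can judge how 'GenAI-proof' the task is against the rubric."""
        response = client.chat.completions.create(
            model="example-model",  # placeholder model name
            messages=[
                {"role": "system",
                 "content": "You are a student completing an EAP assessment task."},
                {"role": "user", "content": brief},
            ],
        )
        return response.choices[0].message.content

    # Paste each draft brief in turn and mark the output as if it were a submission.
    print(attempt_task("Write a 500-word summary of the attached article ..."))

If the output would earn a pass against the rubric, the task needs redesigning along the lines discussed below.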
 
 
As a result, tasks requiring students to demonstrate higher-order thinking skills (HOTS), such as criticality, reflection, evaluation, inference, and ethical consideration, were deemed more suitable. Examples include reflective reports, critical analysis essays, case study evaluations, research projects, and the design of thinking challenges. These assignments entail consulting multiple sources, evaluating different perspectives, making critical inferences, and considering ethical implications. That naturally involves reworking parts of the curriculum, because students need time to develop these skills. Still, this approach pays dividends because it greatly reduces the risk of GenAI being used unethically by assessment takers. Another effective strategy identified at this stage for countering GenAI misuse was linking assessment tasks to specific module materials and imposing time constraints. This ensures that students cannot rely solely on readily available knowledge generated by GenAI but must engage with carefully selected materials and make connections within specified timeframes. Finally, for future consideration, we propose another approach to assessment task design: portfolio tasks. Breaking assessment tasks down into parts offers several benefits, such as a more manageable workload for learners, multiple opportunities for feedback leading to improvement, and occasions to compare samples of students’ work for authenticity.
 
 
Pre-standardisation
 
The goal of pre-standardisation is to ensure that all members of the module team contribute to creating valid, reliable, authentic, and fair assessment tasks and to resolving any remaining issues. These efforts also involve aligning LOs with the curriculum and selecting appropriate teaching and learning strategies. Unlike in previous practice, where only the Module Leader created pre-standardisation samples, GenAI played a pivotal role in this phase before Semester 1 of AY23-24, assisting in generating draft samples. Although the initial drafts were incomplete, they proved highly valuable in identifying and addressing issues, leading to essential adjustments in the assessment tasks. One key revelation during this process was recognising that assigning a 50% weighting to grammar and vocabulary in assessment tasks not confined to live, on-site events conflicted with the emerging post-ChatGPT reality. Given students’ access to, and our active encouragement of the use of, XIPU AI, allocating half the overall mark to grammar and vocabulary achievement was deemed inappropriate, since GenAI can potentially supply that accuracy for students. This matter is currently under consideration by the School of Languages (SoL), and new guidance on assessment and curriculum for AY24-25 is expected shortly. The experience emphasised the critical need to re-evaluate marking rubrics in light of current circumstances. Additionally, the discussion on creating teaching and learning materials, including samples, concluded with a mutual agreement to leverage GenAI for these tasks.
 
 
Moreover, it was determined that integrating XIPU AI into lessons would showcase the practical use of the tool to students, raise awareness of its capabilities and limitations, and allow us to collect insights and reflections for evaluating the tool’s use in class and informing future actions. After all, there is no easy way to prevent students from using GenAI to complete assessment tasks (Chaudhry et al., 2023; Weber-Wulff et al., 2023), so it becomes imperative for us to take the initiative in showing learners how to use the tool responsibly and effectively. XIPU AI is available to both students and staff, and we illustrated this to learners by asking them to engage with the tool for brainstorming ideas, seeking explanations of complex concepts, narrowing research focuses, generating search terms, and summarising texts. These are among the common tasks performed with GenAI in and out of the HE setting (Ansari et al., 2023; Moorhouse et al., 2023), and they cannot be ignored.
 
 
Feedback
 
Providing formative feedback is an essential aspect of effective teaching, spanning beyond EAP to all subjects. Although time-consuming for educators, this form of feedback holds immense value for learners, fostering incremental improvement in their work and encouraging active engagement in the learning process. Recognising the potential of GenAI to assist in this endeavour, the team experimented with it in Semester 2 of AY23-24. This exploration revealed that inputting students’ work into XIPU AI and asking for feedback yielded targeted and specific comments on elements such as organisation, grammar, and vocabulary. While multiple attempts were needed to obtain satisfactory results, the majority of the generated feedback on those assessment categories proved usable. The team acknowledged, however, that the GenAI tool was unable to comment on task completion. Nonetheless, it was collectively agreed that by concentrating their own scrutiny on task completion, the team could offer more detailed and meaningful feedback tailored to students’ performance. Ultimately, the team concluded that although the generated feedback needed customisation to include task-related comments and context-specific recommendations, using GenAI significantly reduced the time invested, allowing comprehensive and personalised feedback to be delivered to students.
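
For readers who wish to reproduce this workflow, the sketch below illustrates the kind of constrained prompt that produced usable comments for us, again assuming a hypothetical OpenAI-compatible endpoint rather than XIPU AI’s actual interface (which we accessed through its chat front end). Restricting the model to organisation, grammar, and vocabulary mirrors our finding that task completion must remain the tutor’s job:

    from openai import OpenAI

    # Placeholder endpoint, key, and model name -- not XIPU AI's real details.
    client = OpenAI(base_url="https://llm.example.edu/v1", api_key="YOUR_KEY")

    FEEDBACK_BRIEF = (
        "You are an EAP tutor giving formative feedback. Comment on the student "
        "text below under three headings only: organisation, grammar, and "
        "vocabulary. Give short, specific bullet points; do not rewrite the "
        "text and do not judge task completion."
    )

    def draft_feedback(student_text: str) -> str:
        """Generate first-draft formative comments for the tutor to verify,
        customise, and extend with task-completion feedback."""
        response = client.chat.completions.create(
            model="example-model",
            messages=[
                {"role": "system", "content": FEEDBACK_BRIEF},
                {"role": "user", "content": student_text},
            ],
        )
        return response.choices[0].message.content

As noted above, several attempts and prompt adjustments were often needed before the comments were specific enough to pass on to students.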
 
 
Conclusion
 
After a twelve-month period, my colleagues and I agreed that GenAI tools greatly assisted us with assessment-related matters, enhancing productivity and improving the standard of our output. We initially approached GenAI as an adversary, but our perspective evolved into a mutually beneficial relationship. Nevertheless, we all feel that while this partnership is valuable, it remains unpredictable, as ongoing technology updates are likely to clash with University systems or policies. As HE practitioners, we understand the necessity of integrating GenAI into our work, yet we cannot overlook the importance of staying informed and continuously adapting and reviewing our practices to maximise its benefits. GenAI is a useful ally, albeit one with its unique challenges.
 
 
References
 
Ansari, A. N., Ahmad, S., & Bhutta, S. M. (2023). Mapping the global evidence around the use of ChatGPT in higher education: A systematic scoping review. Education and Information Technologies, 1-41.
Baidoo-Anu, D., & Ansah, L. O. (2023). Education in the era of generative artificial intelligence (AI): Understanding the potential benefits of ChatGPT in promoting teaching and learning. Journal of AI, 7(1), 52-62.
Chaudhry, I. S., Sarwary, S. A. M., El Refae, G. A., & Chabchoub, H. (2023). Time to revisit existing student’s performance evaluation approach in higher education sector in a new era of ChatGPT: A case study. Cogent Education, 10(1), 2210461.
Kim, M., & Adlof, L. (2024). Adapting to the future: ChatGPT as a means for supporting constructivist learning environments. TechTrends, 68(1), 37-46.
Liu, M., Ren, Y., Nyagoga, L. M., Stonier, F., Wu, Z., & Yu, L. (2023). Future of education in the era of generative artificial intelligence: Consensus among Chinese scholars on applications of ChatGPT in schools. Future in Educational Research, 1(1), 72-101.
Meniado, J. C. (2023). The impact of ChatGPT on English language teaching, learning, and assessment: A rapid review of literature. Arab World English Journal, 14(4).
Moorhouse, B. L., Yeo, M. A., & Wan, Y. (2023). Generative AI tools and assessment: Guidelines of the world's top-ranking universities. Computers and Education Open, 5, 100151.
Weber-Wulff, D., Anohina-Naumeca, A., Bjelobaba, S., Foltýnek, T., Guerrero-Dib, J., Popoola, O., ... & Waddington, L. (2023). Testing of detection tools for AI-generated text. International Journal for Educational Integrity, 19(1), 26.
 
 

AUTHOR
Martina DORN
Senior Language Lecturer
English Language Center
School of Languages
XJTLU

DATE
03 March 2024
