Exploring the use of generative artificial intelligence in developing educational videos for language learning

Abstract

The increasing adoption of generative artificial intelligence (GenAI) in education points to its considerable potential for teaching and learning. GenAI has demonstrated advanced capabilities in generating human-like language and realistic human avatars. However, its use in designing and producing video-based content for academic purposes remains a new area. This article explores the use of D-ID, a tool for creating videos of animated digital humans, and XIPU AI, a platform based on advanced large language models, to develop educational videos for an English for Academic Purposes (EAP) course. The author shares experiences and recommendations for creating video-based teaching materials in collaboration with GenAI tools. The aim is to enhance our understanding of the potential for developing educational materials by integrating GenAI technologies.
 
Keywords: avatars, large language models, generative artificial intelligence, XIPU AI, educational videos, language teaching materials
 

Introduction

Generative artificial intelligence (GenAI), a new type of AI, has gained increasing attention in the language learning field. However, the application of GenAI in language education still requires further research (Law, 2024), and the use of GenAI tools in developing teaching materials has received less attention. This article explores the use of two GenAI tools, D-ID and XIPU AI, in developing video-based teaching materials. It first reviews current studies on the use of large language models (LLMs) such as ChatGPT and AI-created avatars in educational content. It then shares the experience of creating video scripts and avatars with GenAI tools for classroom teaching materials in an English for Academic Purposes (EAP) course. Finally, it concludes with suggestions for integrating AI tools into the production of educational videos.
 

Large language models (LLMs)

GenAI is defined as a new type of AI that 'can be used to create new content, such as words, images, music, code, or video' (Caltech, no date). Unlike traditional AI, which uses machine learning algorithms and past data to analyse or make predictions, GenAI draws on large language models (LLMs) and vast datasets of images, art and videos to generate new text, images, audio or video (Law, 2024). LLMs such as ChatGPT have proven strengths in processing and generating natural language to automate various tasks and provide ideas (Cai et al., 2023). For language teaching and learning, LLMs may produce summaries, create pedagogical materials, design assessments for different proficiency levels, and provide feedback on students' writing tasks. LLMs not only produce coherent and grammatically correct texts; a recent study further suggests that they demonstrate creative potential (Hubert, Awa and Zabelina, 2024). Despite their strength in producing natural language for a wide range of tasks, their application has drawbacks. Bias, false information and apparent mistakes, also called hallucinations, may occur when LLMs are applied (Cai et al., 2023). In addition, articles generated by AI were found to have accuracy and quality issues compared with human-written ones (Ariyaratne et al., 2023). Aware of the pitfalls and challenges of adopting GenAI, educators are nevertheless experimenting with the effective use of LLMs in learning and teaching (Hinman, 2023; Law, 2024).
 

AI-created realistic avatars

Realistic avatars are digital humans that replicate natural human appearance and facial expressions (Seymour, Riemer and Kay, 2018). AI-generated avatars and virtual media applications are widely seen in business video content; however, they are uncommon in teaching and learning and have received limited attention in education research (Leiker et al., 2023; Vallis et al., 2023). One concern about AI-generated avatars is the "uncanny valley", the theory that onlookers may feel an aversion towards a digital human that looks almost, but not quite, human (Seymour et al., 2021). Researchers have pointed out that people react differently towards avatars, so the uncanny valley effect varies between individuals (Seymour et al., 2021). Even though realistic digital humans may not yet seem flawless, some researchers suggest that, in all likelihood, 'the human-realistic avatar has crossed the uncanny valley' (Seymour et al., 2021, p.608).
 
Although its implications for teaching and learning are a relatively new area of study, synthetic media with AI-generated avatars has already been tried out in educational content. Some researchers have examined students' learning experiences with AI-generated avatars as lecturers. In a business ethics course, students held positive perceptions of, and in some cases preferences for, AI-generated characters as lecturers in the instructional videos (Vallis et al., 2023). Another study compared the effects of a synthetic video featuring a realistic AI-generated avatar with those of a traditional instructor-made video on students' learning performance and learning experiences in an online setting. The results showed that students made noticeable improvements in both conditions, and no significant differences were found between the two in either learning gains or students' perceptions. The researchers suggest that AI-generated synthetic videos could be a feasible alternative to traditional videos made by lecturers (Leiker et al., 2023). A further study investigated users' preferences, affinity and trustworthiness ratings for two levels of realism: a human-realistic avatar and a cartoon caricature. Participants who interacted with the avatars in discussions in a VR environment held a positive attitude, and participants who observed the avatars through 3D VR devices or on a 2D screen rated the human-realistic avatar higher, although they expressed preferences for both (Seymour et al., 2021). The researchers suggest that people may overcome the uncanny valley phenomenon with realistic digital humans (Seymour et al., 2021). Current studies have shown some positive results for using avatars; however, more research is needed to assess the effects and impact of digital humans in teaching and learning.
 

Educational videos

Educational videos are visual content designed for teaching and learning, and they have been widely used in higher education for various academic purposes (McNulty and Lazarevic, 2017). Video-based teaching and learning may increase learners' engagement and motivation and accommodate diverse preferences. There are multiple ways to make educational videos, such as using existing video clips or media resources. In one common approach, an instructor writes a script first and then records the content using software or other equipment. The process can be time-consuming, and learning the various recording and editing programs can be stressful for some teachers. Incorporating AI technologies such as synthetic media can simplify the process and make video creation more accessible to educational practitioners.
 

Examples: videos created with the assistance of D-ID and XIPU AI

D-ID (https://www.d-id.com/) is a platform that combines two generative AI tools for generating video and speech. It enables users to create AI-generated synthetic videos by turning a digital human into a talking AI presenter and converting text into natural spoken audio. The AI-generated avatars feature nuanced facial expressions, head movements, hand gestures, and lip-syncing that matches the audio. XIPU AI, which is built on an advanced large language model, GPT-4 Turbo, is used to support the creation of video scripts and the design of follow-up activities. The three examples illustrated below are based on a selected weekly topic, "Digital Transformation and Cybersecurity", and are aimed at developing classroom teaching materials for an EAP business curriculum.
 

Example 1

Teachers can often find helpful videos on platforms such as YouTube to introduce new concepts, ideas, and background information to their classes. However, the selected videos may not always match the learning goals precisely. Synthetic media offers a solution by allowing educators to make their own videos that rework selected content from existing ones. The first example showcases how AI technologies facilitate the creation of class materials from an existing video, with a focus on practising listening skills.
 

Activity: Listening skills

• The instructions provided for students are shown below:
Scenario: Imagine you are the CEO of a local company and are aware that undergoing a digital transformation may give your enterprise lasting competitive advantages in your market. To gather more information, you will watch a two-part video introducing digital transformation and complete two follow-up activities. (Please note: The videos were created with AI assistance. The presenter is an AI-generated speaker.)
 
• The steps of creating the video for the listening activities are described below:
 
Step 1: Prepare a video script 
The script was adapted from a YouTube video, "What is Digital Transformation" (2021) by Simplilearn. The author abridged and separated the original transcript into two parts for two follow-up listening activities: Part 1, Fill-in-the-blank and Part 2, True or False.
 
Step 2: Generate the videos using the D-ID platform
The two parts of the script were turned into two videos presented by an AI-generated avatar using D-ID, shown below. The platform provides options including a gallery of existing presenters, voice style (such as expressive, depreciating or calm), language, accent, gender, background, title and subtitles.
 
 
 
Step 3: Design follow-up activities using XIPU AI (GPT-4 Turbo)
Part 1, Fill-in-the-blank, focused on listening for details; hence, the original transcript was used directly to check students' understanding of specific details from the talk. For Part 2, the True or False comprehension questions, the author asked XIPU AI to summarise the main ideas of each paragraph of the transcript with the prompt "Summarise this passage: (the transcript) …."; the author then used the summaries to design the questions. When XIPU AI is asked directly with the prompt "Design a True or False quiz based on this passage: (the transcript) …," some responses might be unsuitable because they are overly simple or logically flawed, for example: "The acceleration of digital transformation is primarily due to technology evolving at a slow pace."
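
The two parts of Step 3 can be sketched in code. This is a minimal illustration rather than the author's actual procedure: the helper names make_gap_fill and summary_prompts, the choice of target words, and the sample sentence are all hypothetical, and the sketch only prepares text. It does not call XIPU AI or D-ID, and the prompts it builds would still be pasted into the chat interface by the teacher.

```python
import re


def make_gap_fill(text, targets):
    """Blank out the first occurrence of each target word.

    Returns the worksheet text and the answer key. Blanks are numbered
    in the order the targets are given (an assumption of this sketch).
    """
    worksheet = text
    answers = []
    for word in targets:
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        match = pattern.search(worksheet)
        if match is None:
            continue  # skip targets that do not appear in the text
        answers.append(match.group(0))
        blank = f"({len(answers)}) ______"
        worksheet = worksheet[:match.start()] + blank + worksheet[match.end():]
    return worksheet, answers


def summary_prompts(transcript):
    """Build one 'Summarise this passage' prompt per non-empty paragraph."""
    paragraphs = [p.strip() for p in transcript.split("\n\n") if p.strip()]
    return [f"Summarise this passage: {p}" for p in paragraphs]


if __name__ == "__main__":
    # Placeholder sentence, not taken from the original transcript.
    sample = "Digital transformation uses technology to improve business processes."
    sheet, key = make_gap_fill(sample, ["technology", "processes"])
    print(sheet)
    print(key)
```

Keeping the summarising step separate from question writing mirrors the workflow above: the model condenses each paragraph, and the teacher decides which statements make sound True or False items.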
 

Example 2 

EAP teachers often adopt role-play activities so that students can practise their speaking skills and prepare for real-life situations. However, role play can be daunting or embarrassing for some students. To support and motivate students in this activity, two example videos were made with two AI-generated avatars as student presenters. The visual content may increase students' engagement and extend their learning experience in the classroom.
 

Activity: Presentation (a role play)

• The instructions provided for students are as follows:
Scenario: As the CEO, you share your ideas with the board members after listening to the talk. Prepare and deliver a one-minute talk (in groups).
 
• The steps of creating the two example videos are described below:
 
Step 1: Prepare video scripts using XIPU AI (GPT-4 Turbo)
The author first created a prompt on XIPU AI that read: "Write a short script about one minute: You are the CEO of a local company and think your company should have a digital transformation. You share your ideas with the board members." However, XIPU AI did not generate a response. The author then tried another prompt: "The reasons for a company to have a digital transformation." XIPU AI listed six reasons. Next, the author entered a follow-up prompt: "Rewrite the above as a script." XIPU AI generated a script, which the author edited and adapted to produce the scripts for the two videos.
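
The prompting sequence in Step 1 can be read as a simple fallback pattern: try the direct prompt first and, if nothing comes back, break the request into smaller prompts sent one after another. The sketch below illustrates that pattern only. The function script_with_fallback and the ask parameter are hypothetical; ask stands in for whatever way a teacher sends a prompt to a chat model and reads the reply, and it is assumed to keep the conversation history so that "Rewrite the above as a script." can refer to the previous answer. No real XIPU AI interface is implied.

```python
from typing import Callable, List

# Prompts quoted from Step 1 above.
DIRECT_PROMPT = (
    "Write a short script about one minute: You are the CEO of a local "
    "company and think your company should have a digital transformation. "
    "You share your ideas with the board members."
)
FALLBACK_PROMPTS = [
    "The reasons for a company to have a digital transformation.",
    "Rewrite the above as a script.",
]


def script_with_fallback(
    ask: Callable[[str], str],
    direct_prompt: str,
    fallback_prompts: List[str],
) -> str:
    """Return a draft script, falling back to step-by-step prompts.

    `ask` is a hypothetical callable: it sends one prompt within a single
    ongoing conversation and returns the model's reply as text.
    """
    reply = ask(direct_prompt).strip()
    if reply:
        return reply
    # The direct prompt produced nothing: send the smaller prompts in
    # order and keep the reply to the last one as the draft script.
    for prompt in fallback_prompts:
        reply = ask(prompt).strip()
    return reply
```

Whatever draft is returned, it remains a starting point: as in Step 1, the teacher still edits and adapts the text before it is used as a video script.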
 
Step 2: Generate the videos using the D-ID platform
The two scripts were turned into two videos using D-ID, each with a talking AI-generated presenter. The two avatars are shown below: one male and one female student. A classroom background was chosen from the gallery, and each presenter's main topics were also displayed in the background.
 
 

Example 3

Scenario-based learning involves creating a realistic or real-life situation that extends beyond the classroom. It adopts active learning strategies in which students practise problem-solving and critical thinking skills by drawing on their background or newly learnt knowledge (Morgan, 2024). This example showcases an AI-generated avatar, created using D-ID, as the narrator of a story about a cyberattack. Students are required to analyse the situation presented and work cooperatively in groups to propose solutions to the problems.
 

Activity: Group discussion

• The instructions provided for students are as follows:
First, listen to Ann's story and take notes. Then, discuss the following questions in groups:
1. How do you think this happened?
2. Could Ann prevent it?
3. Would similar things happen to companies?
(Please note: The story and the discussion questions were adapted from "What is Cyber Security" (2022). The videos were created with AI assistance, and the presenter is an AI-generated speaker.)
 
• The steps of creating the video are described below:
Step 1: Generate the video using the D-ID platform
The script was adapted from the YouTube video "What is Cyber Security" (2022) by Simplilearn. The edited script was turned into a video using D-ID, presented by a female avatar (shown below) with a depreciating voice chosen to suit the story's content.
 
 
Step 2: Prepare the suggested answers for the discussion using XIPU AI (GPT-4 Turbo)
The author entered three prompts on XIPU AI one by one: "Read this story and analyse why this happened (paste the transcript here)", "Read this story and suggest how this could be prevented (paste the transcript here)", and then "Read this story. Could this happen to companies? (paste the transcript here)." The author edited the responses generated by XIPU AI for each question and used them as additional suggested answers for the discussion questions.
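
For teachers who reuse this activity with different stories, the three prompts in Step 2 can be kept as templates and filled in with each new transcript. The sketch below is one possible way to do this under stated assumptions: the name build_discussion_prompts and the pairing of each discussion question with a prompt are illustrative choices, the prompt wording is quoted from Step 2, and nothing here contacts XIPU AI.

```python
# Discussion questions from the student instructions, each paired with
# the prompt wording quoted in Step 2.
QUESTION_PROMPTS = [
    ("How do you think this happened?",
     "Read this story and analyse why this happened"),
    ("Could Ann prevent it?",
     "Read this story and suggest how this could be prevented"),
    ("Would similar things happen to companies?",
     "Read this story. Could this happen to companies?"),
]


def build_discussion_prompts(transcript):
    """Map each discussion question to a ready-to-paste prompt.

    The transcript replaces the "(paste the transcript here)" placeholder.
    """
    story = transcript.strip()
    return {
        question: f"{prompt} ({story})"
        for question, prompt in QUESTION_PROMPTS
    }


if __name__ == "__main__":
    # Placeholder text; the edited transcript would be supplied here.
    prompts = build_discussion_prompts("<edited story transcript>")
    for question, prompt in prompts.items():
        print(question)
        print("  ", prompt)
```

Entering the prompts one at a time, as the author did, keeps each response focused on a single question and makes the answers easier to edit afterwards.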
 

Discussion and Conclusion

This article has illustrated how two GenAI tools, the D-ID platform and XIPU AI, can be used to produce educational videos, drawing on their capabilities for creating videos with avatars and for generating ideas, texts and information for various tasks. Once a script has been prepared, a video can be made within minutes using synthetic media with AI avatars, which significantly improves the efficiency of video production. The various types of avatars can also be applied in multiple scenarios, such as portraying diverse real-life situations and creating a more engaging and active learning setting for students. In addition, the detailed responses generated by XIPU AI, which is based on LLMs, make it easier for teachers to develop materials and activities. While GenAI tools promote automation, efficiency and productivity, educators are advised to be cautious when integrating them into their teaching practice and the development of materials (Law, 2024). Teachers may need to experiment with different prompts to obtain responses from XIPU AI that meet their expectations and requirements. In addition, scrutinising AI-generated content for accuracy is recommended because of possible inaccuracies (Cai et al., 2023). The adoption of GenAI tools suggests a new paradigm for developing video-based educational materials that provide dynamic environments for language learning.
 
 
 
 
 
 

References

Ariyaratne, P. et al. (2023) 'A comparison of ChatGPT-generated articles with human-written articles', Skeletal Radiology, 52, pp. 1755–1758. doi: 10.1007/s00256-023-04340-5
Cai, Z.G. et al. (2023) 'Do large language models resemble humans in language use?' Available at: http://arxiv.org/abs/2303.08014 (Accessed: 7 April 2024).
Caltech (no date) What Is Generative AI? Available at: https://scienceexchange.caltech.edu/topics/artificial-intelligence-research/generative-ai (Accessed: 31 July 2024).
Hinman, S. (2023) 'ChatGPT in the classroom: enhancing teaching strategies with artificial intelligence assistance', Delta Kappa Gamma Bulletin, 90(1), pp. 42–44.
Hubert, K.F., Awa, K.N. and Zabelina, D.L. (2024) 'The current state of artificial intelligence generative language models is more creative than humans on divergent thinking tasks', Scientific Reports, 14, 3440. doi: 10.1038/s41598-024-53303-w
Law, L. (2024) 'Application of generative artificial intelligence (GenAI) in language teaching and learning: A scoping literature review', Computers and Education Open, 6, 100174. doi: 10.1016/j.caeo.2024.100174
Leiker, D. et al. (2023) 'Generative AI for learning: Investigating the potential of synthetic learning videos'. Available at: https://arxiv.org/pdf/2304.03784 (Accessed: 7 April 2024).
McNulty, A. and Lazarevic, B. (2017) 'Best practices in using video technology to promote second language acquisition', Teaching English with Technology, 12(3), pp. 49–61. Available at: https://files.eric.ed.gov/fulltext/EJ1144964.pdf (Accessed: 4 April 2024).
Morgan, A. (2024) 'Enhancing access in an online course using Universal Design for Learning (UDL) and Scenario-Based Learning (SBL)', TechTrends. doi: 10.1007/s11528-024-00981-y
Seymour, M., Riemer, K. and Kay, J. (2018) 'Actors, avatars and agents: potentials and implications of natural face technology for the creation of realistic visual presence', Journal of the Association for Information Systems, 19(10), pp. 953–981. doi: 10.17705/1jais.00515
Seymour, M. et al. (2021) 'Have we crossed the uncanny valley? Understanding affinity, trustworthiness, and preference for realistic digital humans in immersive environments', Journal of the Association for Information Systems, 22(3), pp. 591–617. doi: 10.17705/1jais.00674
Vallis, C. et al. (2023) 'Student perceptions of AI-generated avatars in teaching business ethics: we might not be impressed', Postdigital Science and Education, pp. 1–19. doi: 10.1007/s42438-023-00407-7
What Is Cyber Security (2022) YouTube video, added by Simplilearn [Online]. Available at: https://www.youtube.com/watch?v=inWWhr5tnEA (Accessed: 6 July 2024).
What is Digital Transformation (2021) YouTube video, added by Simplilearn [Online]. Available at: https://www.youtube.com/watch?v=508CR1fd8ws&t=81s (Accessed: 6 July 2024).

AUTHOR
Ashlee Tai
English Language Centre
School of Languages
XJTLU

DATE
11 September 2024
