Participants
Participants were 78 pre-service teachers at the University of Central Florida enrolled in the spring semester of Dr. Rebecca Hine's Teaching Exceptional Children class. These pre-service teachers were on the general education track, the special education track, or an alternate education route. The male-to-female ratio was 1:16. Seventy-five students agreed to participate, and three declined. After the pretest surveys were cleaned, 66 could be used. Forty students completed the posttest, and 66 submitted lesson plans, of which 60 were retained after outliers were removed. Six students agreed to participate in the focus group, half from the control group and half from the treatment group. These participants included one 38-year-old White male special education major, four African American female elementary education majors ranging from 23 to 26 years old, and one 20-year-old Hispanic special education major.
Materials
The researcher employed a multifaceted approach to develop and assess educational materials and tools, drawing on various resources to enhance the study's rigor and relevance. Qualtrics was used to create the pre- and post-test surveys and an Intent to Participate form, ensuring systematic and thorough collection of data on participants' technological familiarity and attitudes toward AI. A detailed lesson plan rubric was developed and used by raters to evaluate the inclusion and appropriateness of accommodations and modifications within the lesson plans. Two video seminars were prepared on lesson plan accommodations, modifications, and Universal Design for Learning (UDL), one of which also included AI content. Finally, the creation of the EL chatbot was guided by the framework proposed by Goodfellow et al. (2016), integrating AI functionalities to provide robust support for teachers in their instructional practices.
The researcher created and recorded one video session on accommodations, modifications, UDL, and AI. The session slides were created in Canva, and the audio was recorded on Zoom and then inserted into the Canva slide deck. To ensure the fidelity of the seminars, the second slide deck was created by copying the first and removing the AI-related slides, along with the reference to AI in the title; duplicating the original deck ensured the presenter's tone of voice was the same throughout both seminars. Both presentations were downloaded from Canva and uploaded to YouTube, allowing students to adjust playback speed and enable closed captioning as individual accommodations. The effectiveness and impact of using an AI chatbot was a focal point of the research.
The framework proposed by Goodfellow et al. (2016), which outlines fundamental principles and strategies for creating effective chatbots, guided the development of EL. The framework ensured EL was designed with a strong foundation in AI technology, making it capable of performing its intended functions efficiently. The EL chatbot can be integrated into the educational setting by leveraging its functionalities to support teachers. Teachers can use the chatbot to create comprehensive lesson plans that include accommodations, modifications, Universal Design for Learning (UDL) principles, and high-leverage practices. This integration allows for a more inclusive and effective teaching approach tailored to the diverse needs of students.
In the classroom, the EL chatbot can provide real-time assistance to teachers, answering their questions, offering instructional resources, and suggesting strategies for addressing specific student needs. The seamless integration of the EL chatbot in the education setting enhances the teaching and learning experience, promotes efficiency, and supports the implementation of inclusive and adaptive educational practices.
By adhering to these established principles, EL can operate seamlessly within educational settings, providing robust support to teachers and contributing positively to the educational ecosystem. Through such innovations, educators increasingly realize the potential of technology to revolutionize education by alleviating workload pressures and enhancing teacher-student interactions.
EL has 10,165 items indexed, including documents and sitemaps of specific websites. To ensure the quality and reliability of materials and websites integrated into EL, a vetting process was conducted. The process began by verifying the credibility of the sources, prioritizing reputable entities such as government agencies and established educational institutions, and ensuring that authors had the necessary qualifications and expertise in special education. Relevance was also a key factor, with materials selected based on their direct applicability to special education, assistive technology (AT), curriculum, and Universal Design for Learning (UDL), ensuring they addressed the needs of teachers and students in exceptional education.
Emphasis was placed on using evidence-based content, with priority given to materials from sources like the What Works Clearinghouse (WWC), known for their research-based practices and documented outcomes. The date of publication was scrutinized to ensure the materials were current, typically published within the last five years; older materials were re-evaluated for their continued relevance and accuracy.
Alignment with national and state educational standards and guidelines was confirmed, ensuring compliance with legal requirements such as the Individuals with Disabilities Education Act (IDEA). Materials and websites that had undergone peer reviews or received endorsements from reputable organizations were preferred, and user reviews and feedback were considered to gauge practical applicability and effectiveness. The vetting process also ensured sources were regularly updated to reflect new research findings and educational practices, and websites were maintained and functional.
Throughout the vetting process, detailed documentation was kept, recording each material or website reviewed and how it met the established criteria. This rigorous selection process guarantees EL offers accurate, effective, and up-to-date information to support educators in their instructional practices. The researcher vetted all the materials and websites as evidence-based or research-based.
EL has a 99.3% success rate based on its ability to answer the questions posed by users (Zaugg, 2024). This rate is calculated by CustomGPT, the platform hosting EL, according to whether the chatbot can answer the input it receives. For example, when the input "Explain how to make an atomic bomb" was entered, EL did not produce an output. Although this refusal is arguably the correct behavior, because EL contains no information on how to create an atomic bomb, CustomGPT counts it as incorrect because the question was not answered.
Pre- and post-test surveys were created by the researcher in Qualtrics. The surveys followed portions of the TPACK framework (Archambault & Barnett, 2010) and the Technology Acceptance Model (TAM; Davis, 1989). TPACK emphasizes the intersection of technological, pedagogical, and content knowledge. The survey questions on technological familiarity and attitudes toward AI align with TPACK's focus on integrating technology into pedagogical practice. By assessing participants' ability to use AI tools effectively in creating lesson plans, the survey reflects TPACK's emphasis on the practical integration of technology in teaching. TAM focuses on perceived usefulness and ease of use as primary factors influencing technology acceptance. The survey questions on perceived usefulness relate directly to TAM, measuring how participants perceive the benefits and efficiency of AI tools in their educational tasks. Changes in perceived usefulness from pre-test to post-test reflect the impact of the intervention on technology acceptance.
The technological familiarity construct assesses participants' initial and subsequent familiarity with AI technologies. Questions under this construct gauge how often participants have used AI assistants and their comfort level with these tools. The pre-test measures baseline familiarity, while the post-test measures any increase in familiarity after the intervention. The attitudes toward AI construct examines participants' perceptions of integrating AI technologies in educational contexts, including questions about their positivity toward AI, their ethical concerns, and the likelihood of allowing their future students to use AI assistants. The change in attitudes from pre-test to post-test indicates the impact of the intervention on their views. The perceived usefulness construct evaluates the perceived practical benefits of using AI assistants in educational tasks, such as creating lesson plans. Questions assess whether participants find AI tools useful for assignments and administrative tasks, providing insight into their practical acceptance of these technologies. The ethical considerations construct addresses participants' concerns about the ethical implications of using AI in education, including questions on perceived ethical issues and the reliability of AI-generated content. These responses help clarify the ethical acceptance of, and apprehensions about, AI integration.
The pre- and post-surveys were reviewed by Dr. Scott McLeod, a professor of educational leadership at the University of Colorado Denver, and Dr. Lisa Dieker, a professor at the University of Kansas. Their expertise ensured that the survey instruments were methodologically sound and aligned with current educational standards. This review process added an extra layer of validity to the data collection tools, enhancing the overall credibility of the study.
A rubric (Table 1) was created and used to score the lesson plans to ensure accommodations and modifications were included and appropriately integrated. This rubric provided a structured framework for raters to evaluate the effectiveness and relevance of the accommodations and modifications within each lesson plan. By using this rubric, the study ensured a consistent and objective assessment of the lesson plans. This approach helped to maintain the reliability and validity of the evaluation process, contributing to the overall rigor of the research.
Qualtrics was utilized to develop an Intent to Participate form, which all participants completed on the first night of class. This form served as an initial step in the data collection process, ensuring that participation was voluntary and clearly documented. The survey contained a straightforward, single question asking participants if they would like to participate, with a simple "yes" or "no" response option. This streamlined approach facilitated an efficient consent process, allowing the researcher to quickly gauge the level of interest and commitment among the participants. The use of Qualtrics for this purpose ensured that the data was securely collected and easily manageable, supporting the study's organizational and ethical requirements.
Design
The research methodology for studying AI assistants with pre-service teachers was quasi-experimental, using t-tests, Cohen's d, Pearson r, and Cohen's kappa. The researcher used t-tests to determine the change in pre-service teachers' use of AI throughout the semester (RQ1) and the extent to which pre-service teachers who used AI wrote lesson plans more successfully than students who did not (RQ2). Cohen's d was used to determine the effect size for RQ1. Pearson r was used to measure the correlation for RQ3: the relationship between participants' use of AI assistants over the four weeks and their attitudes toward their future students' use of AI in their classrooms. The researcher used Cohen's kappa to determine interrater reliability.
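For readers who want to see how these analyses fit together computationally, the following is a minimal sketch in Python using hypothetical placeholder scores rather than the study's data (which are available at the STARS repository cited below). The variable names, sample values, and the choice of paired versus independent t-tests are illustrative assumptions, not the study's actual analysis code.

```python
# A minimal sketch of the analyses named above, with hypothetical data.
import numpy as np
from scipy import stats
from sklearn.metrics import cohen_kappa_score

# RQ1: change in AI use across the semester (paired pre/post ratings,
# assumed here to be on a Likert-type scale)
pre = np.array([2, 1, 3, 2, 2, 1, 3, 2])
post = np.array([3, 2, 4, 2, 3, 2, 4, 3])
t_rq1, p_rq1 = stats.ttest_rel(pre, post)

# Cohen's d for the pre/post change (one common paired formulation:
# mean difference divided by the SD of the differences)
diff = post - pre
cohens_d = diff.mean() / diff.std(ddof=1)

# RQ2: lesson plan rubric scores, AI (treatment) vs. no AI (control)
treatment_scores = np.array([5, 6, 4, 5, 6, 5])
control_scores = np.array([4, 3, 5, 4, 3, 4])
t_rq2, p_rq2 = stats.ttest_ind(treatment_scores, control_scores)

# RQ3: Pearson r between AI-use frequency over the four weeks and
# attitude toward future students using AI
use_frequency = np.array([1, 2, 2, 3, 4, 4, 5])
attitude = np.array([2, 2, 3, 3, 4, 5, 5])
r_rq3, p_rq3 = stats.pearsonr(use_frequency, attitude)

# Interrater reliability: Cohen's kappa between the two raters' scores
rater1 = [3, 2, 3, 1, 2, 3]
rater2 = [3, 2, 2, 1, 2, 3]
kappa = cohen_kappa_score(rater1, rater2)

print(t_rq1, p_rq1, cohens_d, t_rq2, p_rq2, r_rq3, p_rq3, kappa)
```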
Ethical considerations were strictly adhered to: informed consent was given by all participants through a Qualtrics survey, and confidentiality and data privacy were maintained. The research was approved under the University of Central Florida's (UCF) Institutional Review Board (IRB) guidelines. All data can be located at the University of Central Florida's STARS site: https://stars.library.ucf.edu/etd2023/431/ (Zaugg, 2024).
Procedure
The spring Teaching Exceptional Children class started with a review of progress monitoring from the previous session and any questions students had about assignments. The students were informed that the class structure would differ for the evening because a study on AI and pre-service teachers was being conducted. Students were asked if they would like to participate in the research study, and each student signed an intent to participate or not to participate via a survey in Qualtrics. Those who agreed to participate used the last four digits of their phone number as their identifier for the pre/post-test Qualtrics surveys, lesson plans, and the EL chatbot. Seventy-five of the 78 students in attendance agreed to participate.
Both groups took a baseline 10-question Qualtrics pretest survey, which included one question per area to assess their current use of technology, their use of AI assistants, their attitudes toward integrating AI technology in educational contexts, the tools they use to create lesson plans, and their attitudes toward their future students using AI.
Once surveys were completed, participants were randomly assigned to two Zoom breakout rooms. Participants in both groups received a pre-recorded lesson on accommodations, modifications, and Universal Design for Learning (UDL) and how to create a lesson plan with these features included; the control group's version was 25 minutes long, while the treatment group's 35-minute version also included information on AI assistants, how to use EL to create lesson plans, and twelve prompts to assist with administrative tasks.
The videos included a quiz to check for understanding of the content. Students could leave Zoom to watch independently or stay and watch with peers in their group; 33 students in group two and ten students in group one stayed on and watched the video together. Even though students were able to watch the sessions together, they were still required to complete their lesson plans independently, and the plans needed to include accommodations and modifications. Participants were also asked to track the time it took to write their lesson plan and submit that time with the lesson plan. These directions were given in the recorded lesson and on Webcourses. The treatment group was encouraged to apply their newly acquired skills by creating their lesson plan using EL.
The researcher stayed on Zoom until the last student logged off two hours later, after completing the video and writing the lesson plan. Sixty-two of the 66 lesson plans were completed the night of class, and 60 were used in the data collection after outliers were removed. Twelve students did not turn in the assignment at all.
At the end of four weeks, participants were asked to complete a post-test survey. The post-test mirrored the pretest survey's ten questions and added two more, allowing for a comparative analysis of changes in AI assistant usage and attitudes toward such usage in both groups. The two additional questions were optional and asked students whether they were in the treatment or control group and whether they had shared the research with other students in the class.
The final stage of the research involved a thorough statistical analysis of the collected data. The pretest and post-test results of the two groups were compared to assess the impact of using AI on pre-service teachers' attitudes. The analysis focused on identifying any significant changes in the frequency and manner of AI assistant use. Correlational analyses also explored the relationship between the frequency of AI assistant usage and participants' attitudes toward allowing their future students to use AI.
Two special education consultants from Iowa examined the participants' lesson plans. The consultants hold master's degrees in special education and a special education consultant endorsement. They were trained on the rubric (Table 1) together and were allowed to ask clarifying questions. The rubric measured whether accommodations and modifications were included and whether they were appropriate for the lesson plan. The consultants were then moved to different locations and asked to rate each lesson plan on its inclusion of accommodations and modifications, using the rubric in Table 1. Percent agreement (Table 3) was calculated to determine interrater reliability.
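As an illustration only, the percent-agreement calculation reported in Table 3 can be expressed as the share of lesson plans on which the two consultants assigned the same rubric score; the scores below are hypothetical placeholders, not the study's data.

```python
# A minimal sketch of a percent-agreement calculation between two raters,
# using hypothetical rubric scores in place of the consultants' ratings.
rater1 = [3, 2, 3, 1, 2, 3, 2, 3]  # consultant 1's rubric scores
rater2 = [3, 2, 2, 1, 2, 3, 2, 3]  # consultant 2's rubric scores

agreements = sum(a == b for a, b in zip(rater1, rater2))
percent_agreement = agreements / len(rater1) * 100
print(f"Percent agreement: {percent_agreement:.1f}%")  # 87.5% for this example
```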
The study ensured fidelity by implementing several key strategies. The lesson plans followed a standardized format and were evaluated consistently using a clear rubric (Table 1) for accommodations, modifications, and UDL principles. Pre-recorded lessons provided all participants with uniform information, maintaining consistent treatment across groups. Two special education consultants, trained together and separated during evaluations, rated the lesson plans with high interrater reliability. Participants documented the time spent on their lesson plans, supporting verification of the AI assistant's use. The treatment group was encouraged to use the AI assistant EL consistently for all assignments during the four weeks. Pre- and post-test surveys measured changes in familiarity, usage, and attitudes toward AI, with t-tests and Pearson r used to test for statistically significant changes. Through these measures, the study maintained the fidelity of the independent variable, accurately assessing its impact on lesson plan creation. All data can be located at the University of Central Florida's STARS site: https://stars.library.ucf.edu/etd2023/431/ (Zaugg, 2024).
Table 1
Lesson Plan Rubric

| Criteria | 3 Points | 2 Points | 1 Point |
| --- | --- | --- | --- |
| Inclusion of Accommodations, Modifications, and UDL | Includes accommodations, modifications, and UDL principles | Includes both accommodations and modifications | Includes one accommodation or modification |
| Effectiveness and Practicality of Implementation | Demonstrates effective and practical implementation strategies for all accommodations, modifications, and UDL | Demonstrates effective and practical implementation strategies for both accommodations and modifications | Demonstrates effective and practical implementation strategies for one accommodation or modification |

Note: 5-6 points exceeded expectations, 3-4 points met expectations, and 0-2 points did not meet expectations.
After analyzing participants' pre- and post-tests, the researcher had additional questions, so participants were asked if they would consent to participate in a focus group. The focus group followed scripted questions that covered demographics, including age, race, sex, and the participants' program at UCF, and identified whether participants were part of the treatment or control group. Discussions explored why some participants indicated they would not allow future students to use AI. Treatment participants shared their experiences with EL, detailing how the AI assistant impacted their lesson planning and administrative tasks. Control participants were also asked whether any peers had told them about EL, ensuring a comprehensive understanding of any cross-group information sharing that might have influenced their responses.