Study design
The TWM-E cluster randomized controlled trial (RCT) was registered with the Australian and New Zealand Clinical Trials Registry (ACTRN12618001008213). Government schools were randomly selected within a 60-km radius of the University of Newcastle (i.e., the Hunter, Central Coast, and Newcastle regions). Written consent was obtained from school principals, teachers, and parents. Data collection occurred between April and September 2018. The design, implementation and reporting of the TWM-E study complied with the Consolidated Standards of Reporting Trials guidelines for cluster RCTs 36. Detailed study methods are reported elsewhere 37.
Randomisation
Schools were the unit of randomization. After receiving written consent, participating schools were matched by size and demographic characteristics based on the schools’ Index of Community Socio-educational Advantage 38, using a measure of relative advantage/disadvantage based on geographic area in Australia. Schools were randomized into experimental and waitlist control conditions after the baseline assessments using a computer-based algorithm by an independent researcher.
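The matched allocation described above can be illustrated with a minimal sketch. It assumes pairwise matching on a single ICSEA value and a coin flip within each pair. The school names, ICSEA values, seed, and the handling of an unpaired school are hypothetical and are not taken from the trial's actual algorithm.

```python
import random

def randomize_matched_schools(schools, seed=2018):
    """Sort schools by ICSEA, pair neighbours, and randomly split each pair."""
    rng = random.Random(seed)  # fixed seed so the allocation can be audited
    ordered = sorted(schools, key=lambda s: s["icsea"])
    allocation = {}
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i]["name"], ordered[i + 1]["name"]]
        rng.shuffle(pair)  # random order within the matched pair
        allocation[pair[0]] = "TWM-E"
        allocation[pair[1]] = "waitlist control"
    if len(ordered) % 2 == 1:
        # Assumption: an unpaired school is allocated by a single coin flip.
        allocation[ordered[-1]["name"]] = rng.choice(["TWM-E", "waitlist control"])
    return allocation

# Hypothetical schools and ICSEA values, for illustration only.
icsea_values = [950, 1010, 980, 1040, 930, 1000, 1020, 960, 990]
schools = [{"name": f"School {c}", "icsea": v} for c, v in zip("ABCDEFGHI", icsea_values)]
allocation = randomize_matched_schools(schools)
```

Each matched pair contributes one school to each condition, so the two arms stay balanced on the matching variable regardless of the random draw.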
Participants
A total of 283 Grade 3 and 4 primary school students (mean age = 9.81 years, SD = 0.68) and their teachers (N = 12), who were willing to deliver physically active lessons, were recruited from 9 primary schools (each school contributed one class, apart from one control school and one intervention school, which contributed two classes each). Ethics approval was obtained from the University of Newcastle, New South Wales (NSW), Australia (No: H-2017-0240) and the NSW Department of Education (SERAP No: 2017368). The flow of participants through the study is shown in Fig. 1.
Power Calculation
A power analysis using procedures appropriate for an RCT study design 39 40 was conducted to determine the sample size required to detect changes in the primary outcome of accelerometer-determined physical activity. Calculations assumed a baseline to post-test correlation of r = 0.30 and were based on 80% power and an alpha level of 0.05. Based on the physical activity effects reported (i.e., SD change = 200 counts per minute) after six weeks of the “Thinking While Moving in Maths” (also known as EASY Minds) pilot study and an intra-class correlation coefficient (ICC) of 0.15, a study sample of N = 200 with 8 clusters (i.e., schools) of 25 students would provide adequate power to detect a between-group difference of 200 counts per minute across the school day 19 40. We initially intended to use Actigraph accelerometers in the study, but because these were unavailable, we used Axivity monitors instead. Hence, the power calculation based on counts per minute was no longer directly applicable.
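To show how clustering affects the sample described above, the following sketch applies the standard design effect formula, 1 + (m − 1) × ICC, to the stated cluster size and ICC. It is only an illustration of that general formula. It is not the procedure from the cited references and does not reproduce the full power calculation, which also depended on the baseline to post-test correlation.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)

# Values stated in the text: 8 clusters of 25 students, ICC = 0.15.
deff = design_effect(cluster_size=25, icc=0.15)         # approximately 4.6
n_eff = effective_sample_size(200, 25, 0.15)            # approximately 43.5
print(f"design effect = {deff:.2f}, effective n = {n_eff:.1f}")
```

Under these assumptions, 200 clustered students carry roughly the information of 43 to 44 independent students, which is why the ICC has such a large influence on the required number of schools.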
Intervention
The TWM-E program supported classroom teachers to adapt their English lessons to include movement-based learning components and to deliver these lessons over a 6-week period (3 × 40 min lessons per week). The recommended lesson content was drawn from the NSW K-6 English syllabus 41. Participating teachers received a 1-day professional learning workshop, equipment and resources for the activity components of the lessons (e.g., chalk, buckets, balls, whiteboards, drill ladders, skipping ropes, lettered bean bags, and lettered flexi-domes; value $400 AU), and mentoring from the research team throughout the project (including 3 face-to-face school visits and observations).
The professional learning workshop provided the rationale for physical activity integration, presented the results of the feasibility trial, and offered practical examples of physical activity integration (i.e., online videos), access to English curriculum expertise and peer-supported planning sessions 37. In the final component of the workshop, teachers created their own action plans, setting out a timeline, examples of activities, and potential barriers and solutions. The professional learning workshop was registered with the NSW Education Standards Authority (NESA) and teachers were credited with five hours of professional learning towards their accreditation 42. Its content was developed according to the training model of teachers’ continuing professional development 43.
During the intervention, classroom teachers were responsible for the planning and delivery of all movement-based lessons. They were supported through weekly emails that answered questions and suggested solutions to issues as they arose. The research team also provided feedback and advice based on face-to-face observations of the active English lessons (i.e., 40 minutes). English lessons in both intervention and control groups occurred during the usual timetable slot (e.g., 9:00–11:00 am). The control group followed their usual practice (i.e., normal curricular lessons) for the duration of the study period. Schools in the waitlist control condition received the professional learning workshop after the post-intervention assessments in September 2018.
Measures
Baseline assessments took place in April-June 2018 and the post-intervention assessments in September 2018. All study measures were conducted in the schools by trained research assistants who were blinded to the group allocations at baseline. The same research assistants were used for both time points (baseline and post-intervention). However, it was not possible to blind assessors to treatment allocation at follow-up as the physically active lessons occurred during regular lesson time when data collection took place. Consenting students completed the assessments under exam-like conditions following a verbal explanation from a research assistant. Demographic information (i.e., age, sex, language spoken at home) was collected via a student questionnaire at baseline.
Primary outcome: Physical activity during the school day was measured using tri-axial wrist-worn AX3 accelerometers (Axivity, York, UK). Wrist-worn Axivity monitors have been found to have high equivalence and agreement with GENEActiv and Actigraph GT9X monitors for acceleration and for sedentary, light and moderate-to-vigorous intensity physical activity in adults 44. Accelerometers were worn for five consecutive school days (i.e., Monday to Friday) from 9:00 am to 3:00 pm. Data were downloaded in raw format using the OmGui software and processed in R (http://cran.r-project.org/) using the GGIR package 45. Non-wear time was classified within a 60 min time window if, for at least two of the three axes, the standard deviation was less than 13 mg and the value range was less than 50 mg 46. Data were reduced by calculating the average gravity-based acceleration (g) per 1-s epoch, with daily time spent in moderate-to-vigorous physical activity (MVPA) determined as the sum of epochs averaging above 201 mg 47. The average minutes spent in MVPA per day and the average daily wear time were computed using data from each participant’s valid days. Valid days were defined as days with more than five school hours of wear time 48, and at least 3 valid days were required 49.
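The data reduction rules above (1-s epochs, a 201 mg MVPA threshold, more than five hours of wear per valid day, and at least 3 valid days) can be sketched as follows. This is a simplified illustration and not the GGIR implementation. It assumes non-wear epochs have already been removed, and the function name, input layout, and synthetic data are hypothetical.

```python
MVPA_THRESHOLD_MG = 201   # cut-point stated in the text
MIN_VALID_HOURS = 5       # a valid day needs more than five hours of wear
MIN_VALID_DAYS = 3        # minimum number of valid days per participant

def summarize_participant(days):
    """days: list of school days; each day is a list of mean acceleration
    values (mg), one per 1-s epoch of wear time (non-wear already removed)."""
    valid = [d for d in days if len(d) / 3600 > MIN_VALID_HOURS]
    if len(valid) < MIN_VALID_DAYS:
        return None  # participant excluded from the analysis
    # Each epoch above the threshold counts as one second of MVPA.
    mvpa_min = [sum(1 for a in d if a > MVPA_THRESHOLD_MG) / 60 for d in valid]
    wear_min = [len(d) / 60 for d in valid]
    return {
        "valid_days": len(valid),
        "mean_mvpa_min": sum(mvpa_min) / len(valid),
        "mean_wear_min": sum(wear_min) / len(valid),
    }

# Synthetic data: three 6-hour days with 30 min above threshold, one 4-hour day.
full_day = [300.0] * 1800 + [50.0] * (6 * 3600 - 1800)
short_day = [50.0] * (4 * 3600)
summary = summarize_participant([full_day, full_day, short_day, full_day])
print(summary)
```

In this synthetic example the 4-hour day is discarded, leaving three valid days that average 30 minutes of MVPA and 360 minutes of wear time.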
Secondary outcomes: On-task behavior during English lessons was estimated as a percentage of time using a momentary time sampling procedure adapted by Riley and colleagues 50 from the “Behaviour Observation of Students in Schools” and the “Applied Behaviour Analysis for Teachers” 51 52. On-task behavior is categorized as “active engagement” (i.e., the time a child is actively engaged in an academic activity such as reading, writing, or performing the designated task) or “passive engagement” (e.g., sitting quietly and absorbing the information without being actively engaged in the activity). Off-task behavior is defined as behavior that is not associated with the task, and is classified as off-task motor (e.g., walking around the class), off-task verbal (e.g., chatting), or off-task passive (e.g., looking around the class) 19 53.
Using a random number-producing algorithm, 12 students per class (6 males, 6 females) were randomly selected from the alphabetical class roll. Observations were conducted at both time points (baseline and post-test) by two trained research assistants. Observations were made at the end of 15-sec intervals for 30 min in the allotted English time slot (i.e., 9:00–11:00 am), with each student’s behavior coded as on-task (actively engaged or passively engaged) or off-task (off-task verbal, off-task motor or off-task passive) at that moment. At the end of the following 15-sec interval, the next student’s behavior was coded. Observers listened to an audio file via headphones, which informed them when to observe and record by circling the appropriate code (i.e., actively engaged, passively engaged or off-task) on an observation sheet. This process was repeated until each of the six students was observed 20 times.
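The rotation described above can be checked with a short sketch: 30 minutes of 15-sec intervals gives 120 observation points, which is 20 observations for each of six students when they are observed in turn. The student identifiers and behavior code labels below are hypothetical, and the percentage calculation is an assumed, straightforward way of turning the coded intervals into a percentage of time on-task.

```python
from collections import Counter

ON_TASK = {"active", "passive"}  # hypothetical labels for the on-task codes

def observation_schedule(student_ids, session_min=30, interval_s=15):
    """Return which student is observed at the end of each interval,
    cycling through the students in a fixed order."""
    n_intervals = session_min * 60 // interval_s
    return [student_ids[i % len(student_ids)] for i in range(n_intervals)]

def percent_on_task(codes):
    """Percentage of coded observations that fall in an on-task category."""
    counts = Counter(codes)
    on_task = sum(counts[c] for c in ON_TASK)
    return 100 * on_task / len(codes)

schedule = observation_schedule(["S1", "S2", "S3", "S4", "S5", "S6"])
per_student = Counter(schedule)

# One student's 20 hypothetical codes: 12 active, 3 passive, 5 off-task verbal.
example_codes = ["active"] * 12 + ["passive"] * 3 + ["off_verbal"] * 5
print(len(schedule), per_student["S1"], percent_on_task(example_codes))
```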
During the study period, students were aware of the presence of the research team in the class, but did not know the purpose of their visit. Observers stood at the back of the classroom in order to minimise their influence on student attentiveness. We did not establish interrater reliability for this study. Instead, we sought to assess the maximum number of students in each class. However, our research team has previously established an intraclass correlation coefficient of 0.84 for the same on-task behavior assessments (Mavilidi et al., under review).
Literacy attainment was measured using the standardized “Progressive Achievement Test”, following the Australian Council for Educational Research recommendations 54. Children were assessed on written spelling (30 items) and grammar and punctuation (35 items). The test was administered by the regular classroom teacher and children were given a maximum of 30 minutes for each assessment.
Executive functioning was measured using validated tests from the National Institutes of Health Toolbox for ages 7–17 years 55 56, delivered on tablet devices. The flanker task examines inhibitory control ability. Participants are asked to indicate whether the central arrow of a multi-arrow display is pointing left or right, using the index finger of the left or right hand. The flanking arrows are either congruent (i.e., pointing in the same direction as the central arrow, →→→→→) or incongruent (i.e., pointing in the opposite direction to the central arrow, →→←→→). Children completed four practice and twenty test trials, with the test lasting approximately 3–5 minutes. Responses are typically more accurate and faster for congruent than for incongruent trials 57 58.
The dimensional change card sort test examines set-shifting ability (i.e., the ability to switch between different sorting rules). Participants have to sort pictures according to one of two dimensions (e.g., shape and colour), and use explicit cues (the words ‘shape’ or ‘colour’) to shift between sorting rules on successive trials. Children completed three practice trials and test duration was approximately 4–6 min. Both tests were scored based on children’s accuracy and reaction time. When accuracy was 80% or lower, the final score was equal to the accuracy score. When accuracy was higher than 80%, the accuracy and reaction time scores were combined 59. Higher scores indicate better performance.
Process Evaluation
Feasibility of, adherence to, and satisfaction with the TWM-E program were assessed through:
- Post professional learning workshop questionnaire: Teachers responded on a 5-point Likert scale, ranging from 1 (strongly disagree) to 5 (strongly agree), regarding their perception of the skills acquired from the training, the satisfaction and quality of the training, and their confidence to deliver movement-based English lessons.
- Fidelity (session quality): in Weeks 2, 4, and 6, active English lessons were observed by the research team and assessed on developing English concepts (3 items; e.g., “Movements aided and promoted learning”), physical activity levels (3 items; e.g., “Equipment used promoted physical activity”), and students’ engagement (3 items; e.g., “Students were engaged by the activities taught”) using a 5-point Likert scale ranging from 1 (Not at all true) to 5 (Very true).
- Post-program student satisfaction: students responded regarding their perceptions of physically active English lessons using a 9-item questionnaire, with a 5-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree).
Statistical Analyses
Statistical analyses were conducted using IBM SPSS (version 24) and the alpha level was set at 0.05. The outcomes were analyzed using linear mixed models, which (i) are consistent with the intention-to-treat principle, (ii) are robust to the biases of missing data and (iii) provide an appropriate balance of Type 1 and Type 2 errors 60 61. Considering the hierarchical structure of the data (i.e., students nested within classes and schools), multilevel modelling was used to analyse all outcomes 62 63. More specifically, the models were adjusted for clustering at the class level. In the current study, school-level clustering was negligible after accounting for clustering at the class level, which is consistent with previous research 64. The results focus on the group-by-time effects, i.e., the interaction between Group (TWM-E, control) and Time (baseline, post-test).
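One way to write the group-by-time model described above is given below. It is a sketch under stated assumptions rather than the exact specification fitted in SPSS. The notation, the coding of Group and Time as 0/1 indicators, and the student-level random intercept for the repeated measures are all assumptions introduced for illustration.

```latex
y_{ijt} = \beta_0
        + \beta_1\,\mathrm{Group}_{j}
        + \beta_2\,\mathrm{Time}_{t}
        + \beta_3\,(\mathrm{Group}_{j} \times \mathrm{Time}_{t})
        + u_{j} + v_{ij} + \varepsilon_{ijt},
\qquad
u_{j} \sim \mathcal{N}(0, \sigma_u^2),\;
v_{ij} \sim \mathcal{N}(0, \sigma_v^2),\;
\varepsilon_{ijt} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Here $y_{ijt}$ is the outcome for student $i$ in class $j$ at time $t$, $u_j$ is the class-level random intercept that adjusts for clustering, and $\beta_3$ is the group-by-time effect: the difference between TWM-E and control in the change from baseline to post-test.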
Overview
A summary of the demographic characteristics is presented in Table 1. The majority of the participants were from an Australian cultural background (90.5%) and spoke English at home (96.5%).
Table 1
Summary of demographic characteristics.
Characteristics | Control (n = 162) | TWM-E (n = 121) | Total (n = 283) |
Age (years), mean (SD) | 9.80 (0.68) | 9.81 (0.68) | 9.81 (0.68) |
Sex, n (%) | | | |
Male | 83 (51.2) | 63 (52.1) | 146 (51.6) |
Female | 79 (48.8) | 58 (47.9) | 137 (48.4) |
Cultural background, n (%) | | | |
Australian | 139 (94.6) | 89 (84.8) | 228 (90.5) |
European | 5 (3.4) | 1 (1.0) | 6 (2.4) |
Asian | 1 (0.7) | 3 (2.9) | 4 (1.6) |
Other | 2 (1.4) | 12 (11.5) | 14 (5.6) |
Language spoken at home, n (%) | | | |
English | 145 (97.3) | 100 (95.2) | 245 (96.5) |
Other | 4 (2.7) | 5 (4.8) | 9 (3.5) |
Aboriginal or Torres Strait Islander, n (%) | | | |
No | 132 (89.2) | 88 (87.1) | 220 (88.4) |
Yes | 16 (10.8) | 13 (12.9) | 29 (11.6) |