1. General procedures
1.1 Subjects
A total of 21 young adult volunteers (mean age ± standard deviation, 21.4 ± 1.4 years) participated in this study. All subjects underwent complete ophthalmologic examinations, including determination of ocular dominance using the hole-in-the-card test, best-corrected visual acuity at distance (5.0 m), near point of convergence, stereoscopic acuity at 40 cm (Titmus Stereotest; Stereo Optical Co., Inc., Chicago, IL, USA), heterophoria by the alternating cover test at near (33 cm) and at distance (5.0 m), and fundus examinations. Stereoacuity was converted to the logarithm of arc seconds (log arcsec).
Table 1 presents the characteristics of the subjects. The mean ± standard deviation refractive error (spherical equivalent) was −3.23 ± 3.00 D in the dominant eye and −3.08 ± 2.80 D in the nondominant eye. The best-corrected visual acuity was 0.0 logMAR units or better in all subjects. The average heterophoria was −6.3 ± 5.9 prism diopters (PD) at distance and −10.9 ± 8.8 PD at near. The mean stereoacuity of the healthy volunteers was 1.62 ± 0.05 log arcsec (range, 40–60 arcsec).
After we explained the nature of the study and its possible complications, all subjects provided informed consent. This investigation adhered to the tenets of the World Medical Association Declaration of Helsinki. The Institutional Review Board of Teikyo University approved the experimental protocol and consent procedures (approval No. 19–224-2).
1.2 Apparatus
In this study, we used the VOG-SSD system developed by Hirota et al.17 We recorded eye movements during target tracking using a VOG device (EMR-9; NAC Image Technology Inc., Tokyo, Japan). The VOG device determined eye positions at a sampling rate of 240 Hz by detecting the corneal reflection and pupil center created by reflected near-infrared light. The measurement error (interquartile range) was 0.2°–0.5° at a distance of 1.0 m. The scene camera recorded the real scene (resolution, 640 × 480 pixels; angle of view, ±31° from the center of the scene camera) at a sampling rate of 29.97 Hz. The gaze positions were merged with the real scene with a delay of ≤52 ms.
Before the eye movement test, all subjects underwent a calibration test to align their gaze positions with the scene camera images, performed under binocular conditions with fully corrective spectacles. During calibration, all subjects were asked to fixate nine red cross targets (visual angle, 0.1°) on a white calibration plate. The nine targets were set at the following coordinates (horizontal, vertical): center (0.0°, 0.0°), left (−20.0°, 0.0°), right (20.0°, 0.0°), upper left (−20.0°, 20.0°), upper right (20.0°, 20.0°), lower left (−20.0°, −20.0°), lower right (20.0°, −20.0°), upper (0.0°, 20.0°), and lower (0.0°, −20.0°). The center of the calibration plate was defined as 0°; the right and upper halves of the plate were defined as the positive sides, and the left and lower halves as the negative sides.
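For reference, the nine-point layout can be encoded compactly. The following sketch simply records the coordinates listed above under the stated sign convention; the dictionary name and keys are illustrative, not part of the vendor's calibration software.

```python
# Nine calibration targets as (horizontal°, vertical°); rightward and
# upward are positive, per the sign convention described above.
CALIBRATION_POINTS = {
    "center":      (0.0, 0.0),
    "left":        (-20.0, 0.0),
    "right":       (20.0, 0.0),
    "upper_left":  (-20.0, 20.0),
    "upper_right": (20.0, 20.0),
    "lower_left":  (-20.0, -20.0),
    "lower_right": (20.0, -20.0),
    "upper":       (0.0, 20.0),
    "lower":       (0.0, -20.0),
}
```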
The SSD model used for object detection was the same as that in Hirota et al.,17,18 which detected the rabbit-like character target with an average precision (at a 75% intersection-over-union threshold) of 99.7% ± 0.6%.
We used Python 3.8.5 for Windows 10 (Microsoft, Redmond, WA, USA) with the following libraries: Matplotlib 3.3.2, NumPy 1.18.5, OpenCV 3.3.1, pandas 1.1.3, PyTorch 1.6.0, scikit-learn 0.23.2, and seaborn 0.11.0.
1.3 Nine-direction eye movement testing
The target was a rabbit-like character that the SSD had already been trained on in Hirota et al.17 The target size was 10 × 10 cm, subtending a visual angle of 5.7° at 1.0 m. The target was moved manually by an examiner, in random order, to nine directions (center, left, right, upper left, upper right, lower left, lower right, upper, and lower) within ±15°.
All subjects were seated in a well-lit room (600 lx) wearing fully corrective spectacles. Each subject's head was stabilized with a chin rest and forehead rest. During the eye movement test, the subjects were asked to fixate on the nose of the target, the visual angle of which was 0.1° at 1.0 m.
1.4 Filtering for both eye positions
We excluded VOG data when the change in pupil diameter was >2 mm/frame due to blinking.20 Missing values (0.4% ± 0.7% of samples across all subjects) were replaced with linearly interpolated values calculated by an algorithm written in Python 3.8.5. The horizontal and vertical eye movements were analyzed, and smooth pursuit eye movements (SPEMs) and saccadic eye movements were identified using a velocity-threshold identification (I-VT) filter.21 The I-VT filter classifies eye movements on the basis of the velocity of the directional shifts of the eye; a saccadic eye movement was defined as a segment in which the median velocity of three consecutive windows exceeded 100°/s. The eye position data at 240 Hz were then synchronized with the target data at 29.97 Hz.
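A minimal sketch of this filtering pipeline is given below, using NumPy and pandas as listed in section 1.2. The blink criterion, interpolation, and rolling-median I-VT rule follow the description above, but all function and variable names are illustrative rather than the authors' actual code.

```python
import numpy as np
import pandas as pd

FS = 240.0                  # VOG sampling rate (Hz)
SACCADE_THRESHOLD = 100.0   # deg/s, per the I-VT criterion above

def filter_eye_positions(position_deg, pupil_mm):
    """Blink rejection and linear interpolation (illustrative sketch)."""
    pos = pd.Series(position_deg, dtype=float)
    # Exclude frames where the pupil diameter changes by >2 mm/frame (blinks).
    blink = pd.Series(pupil_mm).diff().abs() > 2.0
    pos[blink] = np.nan
    # Replace missing values with linearly interpolated ones.
    return pos.interpolate(method="linear", limit_direction="both").to_numpy()

def classify_ivt(position_deg, fs=FS, threshold=SACCADE_THRESHOLD):
    """Label each sample as saccade or pursuit from velocity (I-VT filter)."""
    velocity = np.abs(np.gradient(position_deg) * fs)   # deg/s
    # Median velocity over three consecutive windows (rolling median of 3).
    med = pd.Series(velocity).rolling(3, center=True, min_periods=1).median()
    return np.where(med > threshold, "saccade", "pursuit")
```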
2. Experiment 1
Eye movement testing involves moving the target in eight directions: left, right, upper left, upper right, lower left, lower right, upper, and lower. In the clinic, an algorithm is needed that can identify the direction in which the examiner manually moves the target without a trigger input. In experiment 1, we compared the classification accuracy for each direction of target presentation between the peak fitting–based detection algorithm and a conventional threshold-based detection algorithm.
2.1 Procedures
In clinical practice, the origin of the scene camera (horizontal, 0.0°; vertical, 0.0°) and the position where the examiner initially presents the target do not necessarily coincide (Fig. 1A, B). The medians of the horizontal and vertical target locations were therefore calculated and defined as the relative origin. The target location and both eye positions were corrected by subtracting the relative origin (Fig. 1C).
The target location calculated using the SSD was identified in more than 99% of frames and was more stable than the eye positions, which were affected by blinks and tears. Thus, each direction was identified using the target location as a cue.
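A minimal sketch of the relative-origin correction is shown below, assuming (n, 2) arrays of (horizontal, vertical) locations in degrees; all identifiers are illustrative.

```python
import numpy as np

def correct_to_relative_origin(target_xy, left_eye_xy, right_eye_xy):
    """Shift target and eye positions so the relative origin is (0, 0).

    target_xy and *_eye_xy are (n, 2) arrays of (horizontal, vertical)
    locations in degrees; argument names are illustrative.
    """
    # Relative origin: medians of the horizontal and vertical target locations.
    relative_origin = np.nanmedian(target_xy, axis=0)
    return (target_xy - relative_origin,
            left_eye_xy - relative_origin,
            right_eye_xy - relative_origin)
```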
2.2 Algorithm of automatic detection for testing the directions of eye movements
2.2.1 Peak fitting–based detection
The target location was converted to a position vector, and the maximum and minimum peaks were detected over 3.0-s windows (Fig. 2A, B). We separated the data between two minimum peaks enclosing one maximum peak. The separated data were decomposed from the position vector into horizontal and vertical components (Fig. 2C, D). After excluding 1 s from both ends of the separated data, the medians of the horizontal and vertical target locations were calculated (Fig. 2E, F).
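The segmentation step might be sketched as follows. Note two assumptions: scipy.signal.find_peaks is used for peak detection even though SciPy is not listed among the paper's libraries, and the 3.0-s criterion is interpreted as the minimum peak-to-peak spacing. All identifiers are illustrative.

```python
import numpy as np
from scipy.signal import find_peaks  # assumption: SciPy available

FS_TARGET = 29.97  # scene-camera / target sampling rate (Hz)

def segment_by_peaks(target_xy, fs=FS_TARGET, window_s=3.0, trim_s=1.0):
    """Split the recording into one segment per direction (illustrative).

    target_xy: (n, 2) array of origin-corrected (horizontal, vertical)
    target locations in degrees. Returns a list of
    (median_horizontal, median_vertical) tuples, one per segment.
    """
    # Position vector: distance of the target from the relative origin.
    magnitude = np.hypot(target_xy[:, 0], target_xy[:, 1])
    spacing = int(window_s * fs)
    max_peaks, _ = find_peaks(magnitude, distance=spacing)
    min_peaks, _ = find_peaks(-magnitude, distance=spacing)
    trim = int(trim_s * fs)
    medians = []
    # Keep spans between consecutive minimum peaks that contain exactly
    # one maximum peak; trim 1 s from both ends before taking medians.
    for start, end in zip(min_peaks[:-1], min_peaks[1:]):
        has_max = np.any((max_peaks > start) & (max_peaks < end))
        if has_max and end - start > 2 * trim:
            segment = target_xy[start + trim:end - trim]
            medians.append(tuple(np.nanmedian(segment, axis=0)))
    return medians
```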
The median horizontal and vertical locations of the eight separated data sets were ranked from maximum to minimum; the three largest horizontal medians were grouped as right, the three smallest as left, the three largest vertical medians as upper, and the three smallest as lower (Fig. 3A). The upper left, upper right, lower left, and lower right directions were identified by combining membership in the horizontal and vertical groups (Fig. 3B). The remaining data in each group were identified as left, right, upper, and lower.
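A sketch of this ranking-and-grouping step is given below, continuing from the segmentation sketch above and under the interpretation just stated (top three horizontal medians form the right group, and so on); identifiers are illustrative.

```python
import numpy as np

def classify_directions(medians):
    """Assign the eight direction labels from ranked medians (sketch).

    medians: list of eight (horizontal, vertical) median target
    locations in degrees, one per separated segment.
    """
    h = np.array([m[0] for m in medians])
    v = np.array([m[1] for m in medians])
    # Top three horizontal medians -> right group; bottom three -> left.
    right_group = set(np.argsort(h)[-3:])
    left_group = set(np.argsort(h)[:3])
    # Top three vertical medians -> upper group; bottom three -> lower.
    upper_group = set(np.argsort(v)[-3:])
    lower_group = set(np.argsort(v)[:3])
    labels = {}
    for i in range(len(medians)):
        # Diagonals: intersection of a horizontal and a vertical group.
        if i in right_group and i in upper_group:
            labels[i] = "upper right"
        elif i in right_group and i in lower_group:
            labels[i] = "lower right"
        elif i in left_group and i in upper_group:
            labels[i] = "upper left"
        elif i in left_group and i in lower_group:
            labels[i] = "lower left"
        # Remaining members of each group: the pure directions.
        elif i in right_group:
            labels[i] = "right"
        elif i in left_group:
            labels[i] = "left"
        elif i in upper_group:
            labels[i] = "upper"
        else:
            labels[i] = "lower"
    return labels
```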
2.2.2 Threshold-based detection
Threshold-based detection is a simple approach to identifying the category. In this study, the target data were separated into left (horizontal location ≤ −2.0° and −2.0° ≤ vertical location ≤ +2.0°), right (+2.0° ≤ horizontal location and −2.0° ≤ vertical location ≤ +2.0°), upper left (horizontal location ≤ −2.0° and +2.0° ≤ vertical location), upper right (+2.0° ≤ horizontal location and +2.0° ≤ vertical location), lower left (horizontal location ≤ −2.0° and vertical location ≤ −2.0°), lower right (+2.0° ≤ horizontal location and vertical location ≤ −2.0°), upper (−2.0° ≤ horizontal location ≤ +2.0° and +2.0° ≤ vertical location), and lower (−2.0° ≤ horizontal location ≤ +2.0° and vertical location ≤ −2.0°). The cutoff value was defined as the minimum of the mean − 2.0 standard deviations of the target locations across all directions, calculated from the data of subjects 1 to 5.
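For comparison, the threshold rule reduces to a few comparisons per segment. The following sketch applies the ±2.0° cutoff described above to one pair of median locations; identifiers are illustrative.

```python
def classify_threshold(h, v, cutoff=2.0):
    """Threshold-based direction label for one segment (illustrative).

    h, v: median horizontal and vertical target locations in degrees;
    cutoff: the ±2.0° criterion described above.
    """
    horizontal = "right" if h >= cutoff else "left" if h <= -cutoff else ""
    vertical = "upper" if v >= cutoff else "lower" if v <= -cutoff else ""
    if horizontal and vertical:
        return f"{vertical} {horizontal}"   # e.g., "upper left"
    return horizontal or vertical or "center"
```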
2.3 Statistical analysis
We evaluated the classification accuracy for each direction between the peak fitting–based and threshold-based detection algorithms using Fisher's exact test.
SPSS version 26 (IBM Corp., Armonk, NY, USA) was used to determine the significance of the differences, and a P value of <0.05 was considered to be statistically significant.
2.4 Results
The accuracy of the classification in each direction was significantly higher using the peak fitting–based detection (correct, 100.0%; incorrect, 0.0%) than using the threshold-based detection (correct, 47.8%; incorrect, 52.2%; P < 0.001; Table 2).
The findings of experiment 1 suggested that the peak fitting–based detection algorithm was suitable for evaluating eye movement testing.
3. Experiment 2
In experiment 2, we investigated an algorithm for automatically calculating latency and gain, which are evaluation indices of eye movements, using the data classified by the peak fitting–based detection algorithm.
3.1 Calculation of latency and gain
The horizontal and vertical target locations and both eye positions in all directions were converted to position vectors. The raw data were fitted with a cubic function to detect each peak time (Fig. 4A, B). Each peak time was then applied to the raw data (Fig. 4C). The latencies of both eyes were defined as the differences between the peak times of the eyes and that of the target location.
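A minimal sketch of the latency computation is shown below, assuming each trace has already been reduced to a position-vector time series. The cubic fit and peak-time difference follow the description above; the function names and the fallback for a fit without an interior peak are illustrative.

```python
import numpy as np

def peak_time(t, position, degree=3):
    """Fit a cubic to the position trace and return its peak time (sketch)."""
    poly = np.poly1d(np.polyfit(t, position, degree))
    # Candidate peaks: real roots of the derivative within the time range.
    crit = poly.deriv().r
    crit = crit[np.isreal(crit)].real
    crit = crit[(crit >= t.min()) & (crit <= t.max())]
    # Take the critical point with the largest fitted value; fall back to
    # the raw maximum if the cubic has no interior critical point.
    return crit[np.argmax(poly(crit))] if crit.size else t[np.argmax(position)]

def latency_ms(t, eye_position, target_position):
    """Latency: eye peak time minus target peak time, in milliseconds."""
    return 1000.0 * (peak_time(t, eye_position) - peak_time(t, target_position))
```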
The target location and both eye positions at the peak time were defined as the maximum values. We identified the points corresponding to the 25th and 75th percentiles of the maximum values in the centrifugal direction (Fig. 5). We then fitted a linear regression line to the target location and to both eye positions between the 25th and 75th percentile points of the maximum values. The gains of both eyes were defined as the ratios of the slopes of the regression lines of the eyes to the slope of the regression line of the target between the 25th and 75th percentile points.
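A sketch of the gain computation is given below. It interprets the 25th and 75th percentile points of the maximum value as the samples lying between 25% and 75% of the peak amplitude during the centrifugal (outgoing) phase; this interpretation, and all identifiers, are assumptions for illustration.

```python
import numpy as np

def slope_25_75(t, position, peak_idx):
    """Regression slope of the centrifugal phase between 25% and 75%
    of the peak amplitude (illustrative sketch)."""
    # Centrifugal (outgoing) phase: from movement onset up to the peak.
    phase_t = t[:peak_idx + 1]
    phase_pos = position[:peak_idx + 1]
    peak_value = position[peak_idx]
    mask = (phase_pos >= 0.25 * peak_value) & (phase_pos <= 0.75 * peak_value)
    # Least-squares line over the 25%-75% span; slope is the first coefficient.
    slope, _ = np.polyfit(phase_t[mask], phase_pos[mask], 1)
    return slope

def gain(t, eye_position, target_position, eye_peak_idx, target_peak_idx):
    """Gain: ratio of the eye's regression slope to the target's."""
    return (slope_25_75(t, eye_position, eye_peak_idx) /
            slope_25_75(t, target_position, target_peak_idx))
```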
3.2 Statistical analysis
We determined the differences in latencies and gains among directions within each eye using the Scheffé test. We determined the differences in latencies and gains between the two eyes in each direction using the Wilcoxon signed-rank test. The Bonferroni method was used to adjust the P values.
To determine the significance of the differences, we used SPSS version 26 (IBM Corp., Armonk, NY, USA), and a P value of <0.05 was considered to be statistically significant.
3.3 Results
The latencies did not differ significantly among directions within either eye (left eye, P > 0.150; right eye, P > 0.68; Fig. 6A, B; Table 3). The latencies in all directions also did not differ significantly between the left (138.04 ± 89.36 ms across all directions) and right (144.75 ± 97.78 ms across all directions) eyes (P > 0.552; Fig. 6C; Table 3).
The gains did not differ significantly among directions within either eye (left eye, P > 0.75; right eye, P > 0.50; Fig. 7A, B; Table 3). The gains in all directions also did not differ significantly between the left (0.943 ± 0.149 across all directions) and right (0.935 ± 0.133 across all directions) eyes (P > 0.99; Fig. 7C; Table 4).
The findings of experiment 2 suggest that eye movements can be evaluated from data whose target direction has been identified by the peak fitting–based detection algorithm.