We cover facial PAD and iris PAD in two subsections. First we review recent literature on facial PAD using eye movement, and then we review iris PAD using eye movement and explore the relationship between the two topics.
3.1. PAD for facial recognition using eye movement
In [16], pupil movement is used to determine whether an eye is real or artificial. Landmarks of the face and points around the eye are used to measure the distance between the eye and the pupil. The variance, minimum and maximum of these distances are extracted and then used to detect PAs. This liveness detection method protects facial biometric systems. A 3D mask, a 2D photo and a photo displayed on a tablet were each used as PAs. Eighty subjects participated in the data collection. The false negative rate (FNR) and false positive rate (FPR) were 10% and 8%, respectively. The authors concluded that 3D mask attacks are more difficult to detect than the other attacks.
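The variance/min/max feature idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, sample distances and the variance comparison are assumptions for demonstration.

```python
from statistics import pvariance

def pad_features(eye_pupil_distances):
    """Summarize per-frame eye-to-pupil landmark distances into the
    variance / min / max features used to detect PAs."""
    return {
        "variance": pvariance(eye_pupil_distances),
        "min": min(eye_pupil_distances),
        "max": max(eye_pupil_distances),
    }

# A static 2D photo produces almost no pupil movement, so its distance
# variance is near zero compared with a live eye.
live = [4.8, 5.6, 4.1, 6.0, 4.5]   # hypothetical live-eye distances (px)
photo = [5.0, 5.0, 5.1, 5.0, 5.0]  # hypothetical printed-photo distances

assert pad_features(live)["variance"] > pad_features(photo)["variance"]
```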
Iris movement can also serve as the basis of a secure PAD method. In [17] such a method, called IriTrack, is proposed: a poly-line is generated randomly and drawn on screen, and the user must follow it with their eyes, much like the commonplace unlock-pattern systems on mobile devices except that the pattern is traced by gaze rather than touch. Three challenges were studied in [17]: first, the noise introduced into eye movements by user-device interaction; second, differences in image transformation when images are captured; and third, designing patterns whose complexity reduces the number of successful attempts by attackers. The cosine distance is used to measure the similarity between the enrolled pattern and the presented pattern. The dataset contained 18 subjects with 40 presentations each, giving 720 real presentations. Processing on a PC with 16 GB RAM and an Intel dual-core i7-6600U CPU took 3.845 seconds. Tracking eye angles is not possible on a small screen, and because the eye movements are very small, it remains difficult even on a large screen. Since each pattern is generated randomly, the method effectively prevents video replay attacks, because every attempt is different.
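The cosine-similarity matching step can be sketched as follows, with the displayed poly-line and the gaze trace flattened to coordinate vectors. The sample coordinates and the acceptance threshold are illustrative assumptions, not values from [17].

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length coordinate vectors:
    dot(a, b) / (||a|| * ||b||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

pattern = [0, 0, 100, 0, 100, 100]   # displayed poly-line vertices (x, y, ...)
followed = [2, 1, 97, 3, 103, 98]    # hypothetical gaze samples at those vertices

# A gaze trace that closely follows the random poly-line scores near 1.
assert cosine_similarity(pattern, followed) > 0.95
```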
Eye-glasses can affect PAD. For that reason, the authors in [18] proposed a method based on gaze information, with 2D masks, 3D masks and projected photos used as attack scenarios. In the proposed method, a visual stimulus appears on a display while facial images are captured. Chehra Version 3.0 [19] is used to extract 59 facial landmark points, and the variance of the pupil center is measured as a gaze-based feature, on the assumption that this variance differs between a genuine attempt and an imposter attempt; the variances are therefore used to classify fake and genuine attempts. Eighty users participated in data collection. The experiments showed that wearing tinted glasses does not have a large impact on detection performance: the true positive rates (TPR) for the 2D mask, projected photo and 3D mask attacks over 15 sets of collocated points were 43%, 51% and 90% with tinted glasses, versus 43%, 57% and 93% without them.
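The pupil-center variance feature can be sketched as below. The sample coordinates are hypothetical, and summing the per-axis variances is one simple way to aggregate them; [18] does not specify this exact aggregation.

```python
from statistics import pvariance

def pupil_center_variance(centers):
    """Total variance of pupil-center positions (x, y) recorded while
    a moving visual stimulus is displayed."""
    xs = [x for x, _ in centers]
    ys = [y for _, y in centers]
    return pvariance(xs) + pvariance(ys)

# A live eye follows the stimulus, so its pupil center moves across the
# image; a mask or projected photo stays essentially fixed.
genuine = [(10, 5), (40, 6), (70, 30), (42, 55), (12, 52)]
mask = [(30, 30), (31, 30), (30, 31), (30, 30), (31, 31)]

assert pupil_center_variance(genuine) > pupil_center_variance(mask)
```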
Facial recognition was protected against presentation attacks by pupil tracking in [20]. The authors used a Haar-Cascade Classifier [21] to detect the eye region and the Kanade-Lucas-Tomasi (KLT) approach [22] to track the eye, then used the EyeMap method to extract the position of the pupil. Evaluated on the Yale Face Database B, the proposed method achieved a 98% success ratio.
3.2. PAD for iris recognition using eye movement
The authors in [2] argued that existing liveness detection systems are neither automatic nor accurate, and proposed a method based on pupil dynamics. The eye is stimulated with light of a pre-defined intensity profile while a sequence of images is captured. For each image in the sequence, the characteristic dimensions of the hypothetical pupil are calculated by image processing methods. Over the sequence, the system determines a function describing how these dimensions change within the measurement period; from these changes and a selected mathematical model, the liveness parameters of the eye are estimated. The calculated liveness parameters are then compared with a statistical template in a classification process.
In [3] a method for eye liveness detection based on pupil dynamics was proposed. The author showed that the pupil reacts to visible light and that its size varies, and that mimicking pupil dynamics with an artificial object is difficult. No contact lenses or paper printouts were used for evaluation; instead, the author tried to distinguish spontaneous pupil oscillations from the normal pupil reaction to a positive surge of visible light. For a better estimate of performance error, the dataset was divided into two separate subsets, one used to train and one to evaluate the method. The method was evaluated on 26 independent subjects, all of whom were not stressed and had not ingested any substance that could modify pupil reaction.
In [4] iris liveness detection based on gaze estimation is proposed. The authors collected eye movement and iris images from 100 subjects. By creating a hole in place of the pupil on printed images, they discovered that it was possible to spoof an iris and eye-movement biometric system. To handle this problem, they developed a print-attack detection methodology using gaze estimation based on Pupil Center Corneal Reflection (PCCR). Their system comprised a camera and a light source; the BMT_20 Iris Recognition System was used to collect the real iris images, and an HP LaserJet 4350dtn grayscale printer was used to produce the iris print-attack images. In Spoofing Attack Scenario I (SAS-I), the eye-tracking system was attacked using the prepared iris printouts both during the calibration procedure and during the main gaze-recording phase (the stimulus presentation stage), so gaze signals captured during such an attack carry distortions from both sources. In Spoofing Attack Scenario II (SAS-II), the system was attacked only during the stimulus presentation stage, whereas the calibration procedure was performed with a valid (live) eye. This scenario corresponds to the case where the attacker is able to bypass the calibration procedure (e.g., the system is pre-calibrated) and performs the spoofing attack directly during the main process. For SAS-I and SAS-II, equal error rates (EER) of 6.6 and 5.9, respectively, were reported. The system is slow, and recognition details were not reported.
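The core PCCR quantity is the vector from the corneal reflection (glint) of the light source to the pupil center; gaze is then obtained by mapping this vector to screen coordinates via a calibration. The sketch below shows only the vector computation; the coordinates are hypothetical and the calibration mapping is omitted.

```python
def pccr_vector(pupil_center, glint_center):
    """Pupil Center Corneal Reflection (PCCR): the pupil-minus-glint
    vector that a calibrated mapping turns into a gaze point."""
    return (pupil_center[0] - glint_center[0],
            pupil_center[1] - glint_center[1])

# A printout with a hole for the pupil distorts this vector: the glint
# comes from flat paper rather than a curved cornea, so the pupil-glint
# geometry no longer varies consistently with gaze.
vec = pccr_vector((120, 80), (112, 75))
assert vec == (8, 5)
```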
The authors in [23] presented a PAD method for visible-spectrum iris recognition, based on the belief that subtle phase information discriminates well between fake and real biometrics. The captured iris video is preprocessed to consist of 30 frames between blink intervals, and each frame is decomposed using the Fourier Transform to obtain its phase and magnitude components. Eulerian Video Magnification (EVM), which can magnify temporal variations in a video by decomposing it and applying a temporal filter, is then used to emphasize small variations in the phase component: the decomposed phase information is spatially filtered with Laplacian pyramids and temporally filtered with a Butterworth lowpass filter to magnify the phase variations in each frame. A replayed video presents different phase information and considerable noise, so the enhanced phase variation is used to decide whether the subject is a normal (live) presentation. All results for the proposed PAD scheme were reported in terms of the Attack Presentation Classification Error Rate (APCER) and the Normal Presentation Classification Error Rate (NPCER) [24]. APCER is defined as the proportion of attack presentations incorrectly classified as normal presentations in a specific scenario, while NPCER is defined as the proportion of normal presentations incorrectly classified as attack presentations [24]. The authors also reported the Average Classification Error Rate (ACER), defined as the average of APCER and NPCER.
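These three error rates follow directly from their definitions [24] and can be computed as below; the decision lists are hypothetical examples, not results from [23].

```python
def apcer(attack_decisions):
    """Proportion of attack presentations misclassified as normal."""
    return sum(d == "normal" for d in attack_decisions) / len(attack_decisions)

def npcer(normal_decisions):
    """Proportion of normal presentations misclassified as attacks."""
    return sum(d == "attack" for d in normal_decisions) / len(normal_decisions)

def acer(attack_decisions, normal_decisions):
    """Average of APCER and NPCER."""
    return (apcer(attack_decisions) + npcer(normal_decisions)) / 2

# Hypothetical decisions from a PAD system:
attacks = ["attack"] * 18 + ["normal"] * 2   # 2 of 20 attacks missed
normals = ["normal"] * 19 + ["attack"] * 1   # 1 of 20 genuine users rejected

assert apcer(attacks) == 0.10                       # 10% APCER
assert npcer(normals) == 0.05                       # 5% NPCER
assert abs(acer(attacks, normals) - 0.075) < 1e-9   # 7.5% ACER
```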
ACER was computed on the testing database with decisions made at frames 6 through 11, using a threshold of Th = 0.7. Results were reported from frame 6 onward, as frames 1 to 5 were used to make the first decision at frame 6. The best and most reliable frame for making a decision was frame 11, which provided an ACER of 0% for all cases. Table 1 illustrates existing methods for PAD using GTPD methods.
Table 1. An overview of algorithms for GTPD-based PAD methods
| Reference | Attacks | Techniques | Algorithm and methods |
| Asad Ali et al. [18] | Printed photo, 2D mask, 3D mask | Using a visual stimulus to track eyes | Facial landmark extraction: Chehra Version 3.0 [19]; pupil extraction: a method based on the variance, min and max between landmarks |
| Meng Shen et al. [17] | Printed photo, replayed video, 2D/3D mask | Iris movement | Attack detection: probability-based random pattern generation method; face detection: Haar classifiers |
| Killioglu et al. [20] | No attack | Eye area estimation and tracking | Eye area estimation: Haar-Cascade Classifier; eye tracking: Kanade-Lucas-Tomasi (KLT); pupil extraction: EyeMap method |
| Czajka et al. [2, 3, 25] | No attack | Pupil dynamics | Pupil extraction: Kohn and Clynes model [26]; live/fake classification: Support Vector Machine (SVM) |
| Rigas et al. [4] | Print attacks | Gaze estimation | Gaze estimation: Pupil Center Corneal Reflection (PCCR); liveness detection: new version of the Complex Eye Movement (CEM) method |
| Raja et al. [23] | Artefact video | Video presentation attack detection | PAD: modified Eulerian Video Magnification (EVM) |