Iris Recognition and Liveness Detection Competitions

General Overview

Iris recognition was first commercialized in 1999, when LG released the IrisAccess 2200, the first iris scanner intended for broad use. However, iris anti-spoofing was barely addressed at the time. In 2005, the first iris recognition challenge, dubbed the Iris Challenge Evaluation, took place; it focused on recognition accuracy while leaving aside the threat of spoofing, which was poorly explored at the time. The next iris-related challenge was the ICB competition in 2013, which featured a wide range of environments, illumination conditions, and noise sources that degrade iris recognition accuracy.

The same year, the LivDet-Iris 2013 competition was held. Organized by three institutions, including Clarkson University, its goal was to find solutions able to detect iris printouts and textured contact lenses, the spoofing attack tools most commonly used in practice. The challenge demonstrated that the submitted systems could detect only 88.7% of printed iris photos and 92.73% of textured lenses.

Due to its success, LivDet-Iris was held three more times, in 2015, 2017, and 2020. A distinctive feature of the challenge is its division into two parts, testing software and hardware solutions separately. The need for efficient iris anti-spoofing rose significantly during the COVID-19 pandemic, as iris recognition is contactless.

Iris Recognition Competitions

There are currently 11 primary challenges dedicated to iris recognition and anti-spoofing, with LivDet-Iris being the most significant.

ICE

ICE, or Iris Challenge Evaluation, was a competition hosted by NIST in 2005-2006 and the first challenge dedicated to iris recognition. It focused largely on the correlation between left and right irises as a way to achieve more accurate match and non-match similarity scores.

The ICE dataset contained 2,953 iris photos at 480x640 resolution, with the iris diameter exceeding 200 pixels. Notably, all images had to pass the LG EOU 2200’s quality check before they could be submitted to the dataset.

The challenge consisted of two stages:

• Experiment 1 focusing on the right iris.
• Experiment 2 focusing on the left iris.

Among the participants were Cambridge University, the Chinese Academy of Sciences, Iritech, and others. Similarity scores were evaluated with Receiver Operating Characteristic (ROC) curves on a verification task.

NICE

NICE, or Noisy Iris Challenge Evaluation, was hosted in 2008-2009. Its central goal was to develop solutions capable of handling the noise that typically pollutes iris images: specular reflections, physical/digital occlusions, camera distortion, etc.

The challenge dataset of 500 photos drew on ocular images from UBIRIS.v2 — a larger database of images shot in uncontrolled environments with the accompanying visual noise. NICE focused on small-target segmentation.

The following error rate was used in the challenge:

$\displaystyle{ E_{j}={1 \over nwh}\sum_{i=1}^{n}\sum_{r=1}^{h}\sum_{c=1}^{w}P_{i}(r,c)\otimes G_{i}(r,c) }$

Interpretation:

• $\displaystyle{ n }$ — number of test images,
• $\displaystyle{ w }$ and $\displaystyle{ h }$ — image width and height,
• $\displaystyle{ P_{i}(r,c) }$ — pixel value at row r and column c of the i-th segmentation mask,
• $\displaystyle{ G_{i}(r,c) }$ — corresponding ground-truth pixel value,
• $\displaystyle{ \otimes }$ — exclusive-or (XOR) operator.
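The error rate above is simply the fraction of pixels where the predicted mask disagrees with the ground truth, averaged over all test images. A minimal NumPy sketch (function name and toy masks are illustrative, not from the challenge):

```python
import numpy as np

def nice_error_rate(pred_masks, gt_masks):
    """NICE-style segmentation error: fraction of pixels where the
    predicted binary mask disagrees (XOR) with the ground truth,
    averaged over all n test images of size h x w."""
    pred = np.asarray(pred_masks, dtype=bool)  # shape (n, h, w)
    gt = np.asarray(gt_masks, dtype=bool)
    n, h, w = pred.shape
    return np.sum(pred ^ gt) / (n * w * h)

# Two 2x2 masks differing in 1 of 8 pixels -> error = 0.125
pred = [[[1, 0], [0, 1]], [[1, 1], [0, 0]]]
gt   = [[[1, 0], [0, 0]], [[1, 1], [0, 0]]]
print(nice_error_rate(pred, gt))  # 0.125
```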

The UBIRIS.v2 database is still available on demand.

IREX

IREX, or Iris Exchange, is an initiative launched by NIST to evaluate the performance of iris recognition solutions in compliance with two standards: ISO/IEC 19794-6 and ANSI/NIST ITL 1-2007 Type 17. Ten evaluations have been held to date, with IREX 10 being the ongoing one.

IREX VI drew some criticism from E. Ortiz and K. Bowyer, who argued that iris recognition may be handicapped by the aging process. However, the IREX organizers dismissed this criticism, stating that it rested on misleading research data, as the reported iris size shrinkage over three years "could have been realized in minutes via the same manipulation of the ambient illumination".

ICIR

ICIR was a challenge hosted at ICB in 2013. It featured two primary datasets: CASIA-Iris-Thousand for training and IR-TestV1 for testing, together offering 2,000 iris sample classes collected from 1,000 volunteers.

The standard performance metrics used to assess the submitted algorithms included False Non-match Rate (FNMR), False Match Rate (FMR), Equal Error Rate (EER), and the Receiver Operating Characteristic (ROC). Submissions included a DUT algorithm designed for iris localization, a DUT variant with segmentation based on the circular Hough transform, and others.
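The EER is the operating point where the false non-match and false match rates coincide. A small sketch of how it can be estimated from raw comparison scores (the threshold sweep and toy scores are illustrative, assuming higher score means better match):

```python
import numpy as np

def eer(genuine_scores, impostor_scores):
    """Estimate the Equal Error Rate: sweep a decision threshold and
    find where FNMR (genuine pairs rejected) is closest to FMR
    (impostor pairs accepted)."""
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    best_gap, best_eer = 2.0, None
    for t in thresholds:
        fnmr = np.mean(genuine_scores < t)   # genuines scoring below t
        fmr = np.mean(impostor_scores >= t)  # impostors scoring at/above t
        if abs(fnmr - fmr) < best_gap:
            best_gap, best_eer = abs(fnmr - fmr), (fnmr + fmr) / 2
    return best_eer

genuine = np.array([0.9, 0.8, 0.5, 0.7])   # same-iris comparison scores
impostor = np.array([0.6, 0.3, 0.4, 0.2])  # different-iris scores
print(eer(genuine, impostor))  # 0.25
```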

MICHE

Mobile Iris Challenge Evaluation (MICHE) focused on iris recognition that could be orchestrated without specialized gear. For that purpose, a dataset of 3,732 iris images captured with mobile phones was collected. The contest consisted of two stages:

• MICHE I dedicated to iris segmentation.
• MICHE II testing the iris recognition.

The test results were evaluated with two metrics: Recognition Rate (RR) and Area Under the Curve (AUC). A number of algorithms were proposed; one, for instance, employed Daugman’s rubber-sheet normalization and Hamming distance to match the iris and the periocular region separately.
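In Daugman-style matching, the normalized iris texture is encoded as a binary code plus a noise mask, and two codes are compared by their fractional Hamming distance over the bits valid in both masks. A minimal sketch (toy 8-bit codes; real iris codes run to thousands of bits):

```python
import numpy as np

def hamming_distance(code_a, code_b, mask_a, mask_b):
    """Fractional Hamming distance between two binary iris codes,
    counting only bits valid in both noise masks.
    0.0 = identical codes; ~0.5 = statistically unrelated irises."""
    valid = mask_a & mask_b              # bits usable in both codes
    disagreeing = (code_a ^ code_b) & valid
    return np.sum(disagreeing) / np.sum(valid)

a = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=bool)
b = np.array([1, 0, 0, 1, 0, 1, 1, 0], dtype=bool)
mask = np.ones(8, dtype=bool)            # all bits valid
print(hamming_distance(a, b, mask, mask))  # 0.25 (2 of 8 bits differ)
```

A decision threshold (commonly around 0.32-0.36 in the literature) then separates genuine from impostor comparisons.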

MIR

The Competition on Mobile Iris Recognition (MIR) was hosted by BTAS. Its goal was to enhance security for mobile gadgets via iris recognition supported by Near-Infrared (NIR) illumination. Again, two datasets were provided: MIR-Train and MIR-Test, with grayscale images captured indoors under NIR illumination.

Together the datasets offer 16,500 photos at 1968x1024 resolution. The primary difficulty comes from occlusions such as eyeglasses, lighting variations, differing distances, defocus, etc. The classic False Match Rate (FMR) and False Non-Match Rate (FNMR) metrics were used along with the Equal Error Rate (EER) and Decidability Index (DI). The winning solution demonstrated FNMR4 = 2.24%, EER = 1.41%, and DI = 3.33.

VISOB

VISOB was a 2016 contest on mobile ocular biometric recognition. For that purpose, a dedicated dataset of 158,136 images was created with three smartphones: iPhone 5S, Oppo N1, and Samsung Note 4. Volunteers were instructed to capture their eye regions under three lighting conditions — office light, daylight, and dim light — at distances varying from 8 to 12 inches.

The best solution employed periocular feature extraction with Maximum Response (MR) filters drawn from a set of 38 filters, plus a deep neural network trained with regularized stacked autoencoders. Noise removal was performed with a Gaussian filter, histogram filtering, and image resizing.

Iris Liveness Detection Competitions

Three challenges focus on iris liveness detection.

LivDet Iris

The biggest competition in the area, LivDet-Iris started in 2013. The latest edition complied with the ISO/IEC 30107-3 standard guidelines and accordingly used the Attack Presentation Classification Error Rate (APCER) and Bona Fide Presentation Classification Error Rate (BPCER). Additionally, a weighted average of APCER and the Average Classification Error Rate (ACER) were used.
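These ISO/IEC 30107-3 rates are straightforward to compute from labeled test outcomes. A sketch, where ACER is taken as the plain average of APCER and BPCER (function name and toy labels are illustrative):

```python
def pad_error_rates(y_true, y_pred):
    """ISO/IEC 30107-3 style PAD error rates.
    y_true: 1 = presentation attack, 0 = bona fide sample.
    y_pred: 1 = classified as attack, 0 = classified as bona fide."""
    attacks = [p for t, p in zip(y_true, y_pred) if t == 1]
    bonafide = [p for t, p in zip(y_true, y_pred) if t == 0]
    # APCER: attack presentations wrongly accepted as bona fide
    apcer = sum(1 for p in attacks if p == 0) / len(attacks)
    # BPCER: bona fide presentations wrongly rejected as attacks
    bpcer = sum(1 for p in bonafide if p == 1) / len(bonafide)
    acer = (apcer + bpcer) / 2  # unweighted average of the two rates
    return apcer, bpcer, acer

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
print(pad_error_rates(y_true, y_pred))  # (0.25, 0.5, 0.375)
```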

It had no training dataset, as participants were allowed to use any training data they could find. For the test phase, a dataset of 12,432 images was assembled: 5,331 live and 7,101 fake. The Presentation Attack Instruments (PAIs) featured:

• Printed eyes.
• Textured lenses.
• Prosthetic, toy and fake eyes.
• Eye images replayed from Kindle e-Ink.

The winning solution scored a 29.78% ACER. It was based on a multi-label CNN trained specifically to spot textured contact lenses (SMobileNet) and printed images (FMobileNet), with a multi-output classifier providing the final liveness decision.

MobILive

The MobILive contest was held in 2013-2014 to foster effective liveness detection for mobile applications. It used evaluation metrics such as False Acceptance Rate (FAR), False Rejection Rate (FRR), and Mean Error Rate (MER). In addition, error counts were estimated with False Real (FR), True Fake (TF), False Fake (FF), and True Real (TR).

APCER and the Normal Presentation Classification Error Rate (NPCER) were also employed to comply with the ISO/IEC 30107 Presentation Attack Detection standard. One of the best solutions relied on local descriptors such as the Local Binary Pattern (LBP), which capture the statistical behavior of small patches of the image.
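The basic LBP encodes each pixel by thresholding its 8 neighbors against the pixel's own value, producing an 8-bit code; histograms of these codes over image patches then serve as texture features for liveness classification. A minimal sketch of the 3x3 operator (bit ordering and toy patch are illustrative):

```python
import numpy as np

def lbp_3x3(image):
    """Basic 3x3 Local Binary Pattern: each interior pixel becomes an
    8-bit code, one bit per neighbor that is >= the center value."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # neighbor offsets, clockwise from the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            center = img[r, c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if img[r + dr, c + dc] >= center:
                    code |= 1 << bit
            out[r - 1, c - 1] = code
    return out

patch = [[10, 20, 30],
         [40, 25, 15],
         [ 5, 50, 60]]
print(lbp_3x3(patch))  # [[180]]
```

Spoof artifacts such as print dot patterns or lens textures shift the distribution of these codes, which is what the descriptor-based solutions exploit.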

Cross-Eyed

The Cross-Spectral Iris/Periocular Competition, or Cross-Eyed, was a 2016 contest held in conjunction with the 8th IEEE International Conference on Biometrics. It employed metrics such as the Generalized False Accept Rate (GFAR) and Generalized False Reject Rate (GFRR), covering both the iris and periocular areas.

Some encouraging results were obtained via transfer learning borrowed from face recognition and adapted to periocular images. As for iris recognition, a promising method employs a CNN with a bank of pairwise filters to measure the similarity between a pair of photos.