Liveness Detection: Definition and Overview

Liveness detection is a method of detecting biometric attacks by verifying whether a biometric sample captured by a system comes from a living person.

Definition & General Overview

Liveness detection is the ability of a biometric system to distinguish falsified biometric traits presented to its sensors from genuine ones. It is arguably the central component of biometric security, as it prevents fraudulent enrollment attempted with synthetic traits. (Such attempts are also referred to as Presentation Attacks, or PAs.)

The concept of liveness in biometrics was first introduced by Prof. Dorothy E. Denning in her article "Why I Love Biometrics. It's 'liveness,' not secrecy, that counts". Denning's main argument is that keeping biometric traits secret is rather ineffective. Instead, a system should rely on a validation process capable of analyzing biometric traits and their liveness signals.

A 3D printed head used as a PA instrument to unlock an Android phone

In liveness taxonomy, biometric traits are divided into physiological and behavioral. The former group includes inseparable physiological parameters of the human body: fingerprints, iris patterns, facial contour, vocal characteristics, cardiovascular activity, and so on. The latter group covers psychologically determined traits: gait, handwritten signature, keystroke dynamics, and others.

Liveness detection focuses on both groups. Hence, various modalities — from gait analysis to retinal scanning — have been introduced to minimize spoofing attack probability.

Main Approaches & Techniques of Liveness Detection

Liveness detection is usually implemented as part of biometric spoofing detection. As a subsystem, it serves to strengthen biometric security. Liveness detection methods generally fall into two classes: hardware-based and software-based.

Hardware-based

Hardware-based liveness detection requires additional equipment to be added to a system. Examples include fingerprint scanners, specialized EKG hardware for electrocardiogram monitoring, a stereo camera capable of constructing a three-dimensional face map, an ophthalmoscope for retinal scanning, and others.

However, extra equipment brings higher financial costs, as well as deployment and maintenance issues. For example, while fundus cameras can provide highly accurate retinal scanning, even budget models cost several thousand dollars.

Holga 120 3D stereo camera for analog three-dimensional photography

Invasiveness is another critical issue with such solutions. They may demand a higher level of cooperation from users, take longer to complete, increase customer friction, and require physical contact with a sensor.

Software-based

Software-based solutions are somewhat easier to implement. They are typically algorithms built with deep learning and trained and tested on datasets dedicated to a specific biometric trait: fingerprints, voice, face, iris, etc.

Such approaches focus on extracting anatomical features and analyzing their physical properties: vocal signal power distribution analysis, anisotropic diffusion analysis, 2-D Gabor wavelet iris pattern patch-wise phase quantization, and others.

Apart from being simple to deploy, these approaches are also effective in uncooperative and crowded environments with high throughput: airports, border checkpoints, train stations, shopping malls, and so forth.

Main Types of Liveness Detection

Liveness detection modalities are widely deployed in various fields: telehealth, e-commerce, public surveillance, law enforcement, banking, and so on. The most widely used liveness detection types are:

Facial Liveness Detection

Facial liveness detection is by far the most popular modality, owing to its use in both public surveillance and mobile technology. Reportedly, only 6 countries had no surveillance cameras in 2021, while China operates 540 million cameras in total, or 372.8 cameras per 1,000 people. In addition, 7 in 10 governments use face recognition technology. As for mobile devices, it is estimated that 90% of all smartphones will use face recognition by 2024.

The map of facial recognition technology usage in the world

Facial liveness detection is generally divided into two categories: active and passive.

Active

Active liveness detection is an interactive, challenge-based method. It requires the user to perform specific actions such as nodding, blinking, or turning their head, in order to ensure that the face presented belongs to a real, live person rather than a photo or video. This approach has two drawbacks. First, it is susceptible to presentation attacks (PAs), especially those that use more advanced technologies such as deepfakes. Second, it can be inconvenient or cumbersome, as it requires the user's active participation, which may disrupt the user experience or lead to dissatisfaction. Despite these drawbacks, it provides a level of security that can be useful in settings where the risk of impersonation or identity theft is high.
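The challenge-based flow described above can be sketched in a few lines. This is a minimal illustration, not a production design: the action labels, the function names make_challenge and verify_response, and the 10-second timeout are all assumptions made for the example, and the detection of the actions themselves (by a blink or head-pose detector) is left out. Picking the actions at random is meant to make a pre-recorded video less useful; it does not by itself address the deepfake weakness noted above.

```python
import secrets

# Hypothetical action labels. A real system would receive these from a
# blink / head-pose detector running on the camera feed.
ACTIONS = ("nod", "blink", "turn_left", "turn_right")


def make_challenge(length=3):
    """Pick a random sequence of actions for the user to perform."""
    return [secrets.choice(ACTIONS) for _ in range(length)]


def verify_response(challenge, observed, elapsed_s, timeout_s=10.0):
    """Accept only if the observed actions match the challenge, in order,
    and were completed within the time limit (an assumed 10 seconds)."""
    if elapsed_s > timeout_s:
        return False
    return list(observed) == list(challenge)


# Usage with a fixed challenge so the outcome is predictable.
challenge = ["blink", "nod", "turn_left"]
ok = verify_response(challenge, ["blink", "nod", "turn_left"], elapsed_s=4.2)
late = verify_response(challenge, ["blink", "nod", "turn_left"], elapsed_s=12.0)
```

Here ok is True and late is False, since the second attempt exceeds the time limit.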

Passive

In contrast to active detection, passive liveness detection operates discreetly in the background, eliminating the need for user cooperation or interaction. This method uses a combination of deep learning algorithms and specialized hardware to determine the liveness of the face presented to the system. It could involve the analysis of facial features under different lighting conditions or the use of a telescopic camera for remote iris recognition. Passive liveness detection is often considered more user-friendly and less intrusive, as it does not demand any specific actions from the user. It suits a wide range of applications, including anti-spoofing for IoT.

Block diagram of a basic facial liveness detection system

A basic facial liveness detection system typically includes the following stages:

Acquisition. The face of the person seeking access is captured with a camera (sensor).
Preprocessing. The captured image is processed with edge detection, scaling, smoothing, and other tools to remove artifacts and improve quality.
Classification. Key image and/or facial features are extracted and processed, usually by an AI-based algorithm.
Verdict. The system accepts or rejects the sample.
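
The four stages above can be sketched as a chain of functions. Everything in this example is a placeholder chosen for illustration: the function names, the tiny hard-coded image, the use of pixel variance as a stand-in texture feature, and the 0.05 threshold are assumptions, not part of any real system. A real classifier would be a trained model; only the overall shape of the pipeline is the point here.

```python
from dataclasses import dataclass
from statistics import pvariance


@dataclass
class Verdict:
    accepted: bool
    score: float


def acquire():
    """Stage 1 (acquisition): stand-in for a camera capture.
    Returns a tiny grayscale image as rows of 0-255 integers."""
    return [[10, 200, 30], [220, 15, 240], [5, 210, 25]]


def preprocess(image):
    """Stage 2 (preprocessing): scale pixel values to the range [0, 1]."""
    return [[px / 255.0 for px in row] for row in image]


def classify(image):
    """Stage 3 (classification): placeholder score.
    Pixel variance is used purely as a toy texture feature; a real
    system would run a trained model here."""
    pixels = [px for row in image for px in row]
    return pvariance(pixels)


def decide(score, threshold=0.05):
    """Stage 4 (verdict): accept when the score reaches the threshold."""
    return Verdict(accepted=score >= threshold, score=score)


def run_pipeline(image=None):
    """Run all four stages; capture an image if none is supplied."""
    image = acquire() if image is None else image
    return decide(classify(preprocess(image)))


# Usage: the textured toy image passes, a perfectly flat one does not.
textured = run_pipeline()
flat = run_pipeline([[128, 128, 128]] * 3)
```

Keeping each stage as a separate function makes it easy to swap the placeholder classifier for a real one without touching acquisition or the final decision.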

Skin texture, facial contour, and depth (if available), as well as other features, are analyzed with the help of a Support Vector Machine (SVM), Artificial Neural Networks (ANN), linear discriminant analysis, difference degree calculation, Conditional Random Fields (CRFs), etc.

Voice Liveness Detection

Voice liveness detection is widely used in online banking, the Internet of Things (IoT), audio deepfake detection, and secure telephony. The three main attack types in this scenario are speech synthesis based on deep learning, replay attacks, and voice conversion.

Voice spoofing attacks are mostly aimed at Automatic Speaker Verification (ASV) systems, although human targets are also at risk. The main vulnerability of an ASV system is that it can fail to detect an impostor: a fake spectrogram can be almost identical to the genuine one, so the system won't notice the spoof during the feature extraction phase.

A number of methods have been proposed to prevent voice spoofing. Among them are analysis of cumulative power patterns, pop noise analysis, calculation of detection scores from bitmaps and spectral peaks, and so on.
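
As a rough illustration of what a cumulative power feature might look like, the sketch below computes the power spectrum of a short signal and the running share of total power from the lowest frequency bin upward. It is an assumption-laden toy rather than any published method: the function names, the naive DFT, and the choice of cutoff bin are all made up for the example, and how such a feature would separate genuine from replayed speech depends on data that is not shown here.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive DFT power for bins 0..N/2 (fine for short example signals)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def cumulative_power(spectrum):
    """Running share of total power, from the lowest bin upward."""
    total = sum(spectrum)
    if total == 0:
        return [0.0] * len(spectrum)
    running, shares = 0.0, []
    for power in spectrum:
        running += power
        shares.append(running / total)
    return shares


def low_band_share(samples, cutoff_bin):
    """Fraction of signal power at or below cutoff_bin (hypothetical feature)."""
    return cumulative_power(power_spectrum(samples))[cutoff_bin]


# Usage: two synthetic tones of 64 samples each, one low and one high.
n = 64
low_tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
high_tone = [math.sin(2 * math.pi * 20 * t / n) for t in range(n)]
low_share = low_band_share(low_tone, cutoff_bin=8)    # close to 1
high_share = low_band_share(high_tone, cutoff_bin=8)  # close to 0
```

A detector built on this idea would feed such shares, or the whole cumulative curve, into a classifier rather than compare them against a fixed value.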

Bitmaps extracted from genuine and replay attack audios

Fingerprint Liveness Detection

Fingerprint liveness is perhaps the only modality that can compete in popularity with facial liveness. At the same time, its proliferation makes it an attractive target for Presentation Attacks. Fingerprints can be artificially recreated with a range of easy-to-find materials: gelatin, latex, matte paper, silicone, etc. The severed or dead finger scenario is also considered a threat to a degree.

Creation of a fake fingerprint with a printed circuit board (PCB)

Numerous methods have been proposed to detect fingerprint liveness: vitality characteristics analysis, Optical Coherence Tomography (OCT), analysis of the dynamic behavior of a live human finger, finger skin elasticity checks, and others.

Architecture of a software-based fingerprint liveness detection solution

Iris Liveness Detection

The human iris can also be counterfeited with a printed or digital photo, a video, a textured lens, or even a prosthetic eye. The cadaver eye scenario should not be dismissed either. Both static and dynamic Presentation Attack Detection (PAD) methods have been proposed for this case. They include multispectral imaging, electrooculography, photo artifact detection, etc.

A high-quality prosthetic eye with realistic iris patterns and coloring

Live human iris (left) and a printed iris (right)

Multimodal Liveness Detection

Multimodal liveness detection combines two or more biometric modalities, such as fingerprint and facial recognition. The key benefits of such a system are higher accuracy and better "immunity" to technical failure: if one of the modalities fails, its counterpart lets the system continue working. Besides, spoofing more than one modality requires more resources and effort from malicious actors, while also reducing their chances of success.
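
One common way to combine modalities is score-level fusion, sketched below as a weighted average that skips any modality whose sensor returned nothing. This is a minimal sketch under stated assumptions: the function name fuse_scores, the score range of 0 to 1, the equal weights, and the 0.5 threshold are all illustrative choices, and a real deployment might well apply a stricter threshold when it is running on a single modality.

```python
def fuse_scores(scores, weights, threshold=0.5):
    """Weighted average over the modalities that returned a score.

    scores  -- dict of modality name -> liveness score in [0, 1],
               or None when that sensor failed
    weights -- dict of modality name -> relative weight
    Returns (accepted, fused_score); raises if no modality is available.
    """
    available = {m: s for m, s in scores.items() if s is not None}
    if not available:
        raise ValueError("no modality produced a score")
    total_weight = sum(weights[m] for m in available)
    fused = sum(weights[m] * s for m, s in available.items()) / total_weight
    return fused >= threshold, fused


# Usage with hypothetical scores and equal weights.
weights = {"face": 0.5, "fingerprint": 0.5}

# Both modalities agree the sample is live.
both_ok = fuse_scores({"face": 0.9, "fingerprint": 0.7}, weights)

# The fingerprint sensor failed, so the system falls back to face alone.
fallback = fuse_scores({"face": 0.9, "fingerprint": None}, weights)

# A convincing face spoof is pulled below the threshold by the fingerprint.
spoof = fuse_scores({"face": 0.9, "fingerprint": 0.05}, weights)
```

In this example both_ok and fallback are accepted, while spoof is rejected because the low fingerprint score drags the fused score under the threshold.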