Active Facial Liveness Detection

From Antispoofing Wiki

Active facial liveness detection is the counterpart of passive detection, though many experts now see it as an obsolete security measure.

Brief Overview

Face recognition systems have been field-tested since at least 2001, when law enforcement agencies trialed the technology at the Super Bowl (with unsatisfactory results). The issue of liveness was first addressed by Dorothy E. Denning in her article It's "liveness", not secrecy, that counts. The idea was that a system should be able to check whether the accessor is actually a living person and not a lifeless substitute: a photo, mask, or sculpture. Active facial liveness detection predates passive detection. In essence, it is challenge-based: the person is prompted to perform a certain task such as smiling, blinking, turning the head, or following an on-screen object with the eyes.

However, malicious actors soon learned to overcome these barriers. A rich repertoire of crime tools was developed, ranging from eye-holed masks to sophisticated face-swap animations. Moreover, the active method is known to add friction to the verification process, comparable to entering a password or solving a CAPTCHA puzzle. This friction leads to "customer frustration" and can result in verification abandonment. As a result, a "successor system" based on passive liveness detection was designed. While passive detection appears more flexible and user-friendly, it too remains vulnerable to spoofing attacks that mimic the liveness factor.

Liveness Indicators

It is logically assumed that a fake face presents characteristics that are unnatural to a human face. Based on this assumption, experts highlight the following liveness indicators.

Texture analysis

In theory, an artificial face will present unnatural texture patterns. Texture analysis aims to extract features from the presented face to assess their "naturalness". The analysis falls into three categories:

  • Parameters. This involves Image Quality Assessment (IQA), which serves to spot errors in the input visual data. The parameters that undergo analysis include Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), Normalized Absolute Error (NAE), Total Edge Difference (TED), and others.

  • Dynamic texture. This method is based on detecting spatio-temporal Local Binary Patterns (LBP). It inspects facial structure using LBP computed over Three Orthogonal Planes (LBP-TOP).

  • Static texture analysis. This method depends on analyzing grayscale and color textures, as well as indexing the values for the planes.
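The IQA parameters listed above are simple pixel-level comparisons between a reference image and the image under test. A minimal sketch of MSE, PSNR, and NAE (the other listed metrics follow the same pattern):

```python
import numpy as np

def mse(ref, test):
    """Mean Square Error: average squared pixel difference."""
    ref, test = np.asarray(ref, float), np.asarray(test, float)
    return ((ref - test) ** 2).mean()

def psnr(ref, test, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB (higher means closer to the reference)."""
    m = mse(ref, test)
    return np.inf if m == 0 else 10 * np.log10(peak ** 2 / m)

def nae(ref, test):
    """Normalized Absolute Error: total absolute difference over total magnitude."""
    ref, test = np.asarray(ref, float), np.asarray(test, float)
    return np.abs(ref - test).sum() / np.abs(ref).sum()
```

A liveness system would compare such scores against thresholds learned from genuine captures; the thresholds themselves are deployment-specific.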

Currently, texture analysis is considered by experts to have great potential in liveness detection.

Motion analysis

It is estimated that flat or planar objects, such as a printed cutout, move differently than a 3D human face. Generally speaking, motion analysis implies calculating information about the moving points of an image in the scene.

There are three main elements in this technique:

  • Focus distance-based. It includes the Depth of Field (DOF) parameter, which is used for detecting the distance between the closest and the farthest objects in the image. Thus, the camera focus and illumination can testify that the image belongs to a bona fide user.
  • Optical flow-based. This method calculates the optical flow from an image frame sequence. Each frame is converted into vector data, which allows estimating pixel directions and motion velocity between consecutive frames.
  • Scenic clues-based. In this approach, three scenic clues are studied. The non-rigid motion clue pays attention to blinking, lip movement, wrinkling, etc. Face-background consistency implies that the motion of face and background varies in consistency: high for fake images and low for genuine ones. The imaging banding effect indicates that fake images will display certain defects. Scenic clues-based analysis employs tools such as the GMM-based motion detection method, wavelet decomposition, and image alignment based on low-rank matrix decomposition.
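To illustrate the optical-flow idea, the sketch below estimates per-block motion vectors by naive block matching; this is an illustrative simplification (production systems use dense estimators such as Lucas-Kanade or Farnebäck), and all names here are hypothetical:

```python
import numpy as np

def block_flow(prev, curr, block=8, search=4):
    """For each block in `prev`, find the displacement (within +/-search px)
    that minimises the sum of squared differences in `curr`."""
    h, w = prev.shape
    flow = {}
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            ref = prev[y:y + block, x:x + block].astype(float)
            best, best_dv = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy and yy + block <= h and 0 <= xx and xx + block <= w:
                        cand = curr[yy:yy + block, xx:xx + block].astype(float)
                        ssd = ((ref - cand) ** 2).sum()
                        if ssd < best:
                            best, best_dv = ssd, (dy, dx)
            flow[(y, x)] = best_dv
    return flow
```

A planar spoof tends to produce one rigid displacement field across all blocks, whereas a live 3D face yields spatially varying vectors.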

Life sign indicators

This approach echoes scenic clues-based analysis. For instance, it treats blinking as a categorical life sign. The technique analyzes a sequence of input images to first detect the eyes and then calculate eye-region variations. As a result, the system can accurately tell whether the person is real. Another major life sign is the mouth state. Conditional Random Fields (CRFs) and various discriminative models are used to determine whether the images are fake or genuine by studying lip movement.
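One common concrete measure of eye-region variation, not prescribed by the source but widely used, is the eye aspect ratio (EAR) computed from six eye landmarks; thresholding the per-frame EAR sequence flags blinks:

```python
import numpy as np

def eye_aspect_ratio(eye):
    """EAR from six landmarks p1..p6 ordered around the eye:
    (|p2-p6| + |p3-p5|) / (2 * |p1-p4|). It drops sharply when the eye closes."""
    p = np.asarray(eye, dtype=float)
    v1 = np.linalg.norm(p[1] - p[5])   # vertical distance, upper/lower lid
    v2 = np.linalg.norm(p[2] - p[4])   # second vertical distance
    h = np.linalg.norm(p[0] - p[3])    # horizontal eye width
    return (v1 + v2) / (2.0 * h)

def count_blinks(ear_sequence, threshold=0.2):
    """Count open-to-closed transitions in a sequence of EAR values."""
    blinks, closed = 0, False
    for ear in ear_sequence:
        if ear < threshold and not closed:
            blinks, closed = blinks + 1, True
        elif ear >= threshold:
            closed = False
    return blinks
```

The landmark layout and the 0.2 threshold are assumptions for illustration; real systems calibrate both per camera and per subject.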

3D properties

As a 3D object, a human face is supposed to have a certain depth as well as curvature. This technique acquires visual data to detect whether the object lacks surface variation, a property that a real face will display.

The following formula is used in this method:

[math]\displaystyle{ C=\frac{(p-b)\cdot v}{d^2} }[/math]


  • C — value of the curvature.
  • p — the point at which the curvature is approximated.
  • b — barycenter of the Cartesian coordinates of the points within Ωr (spherical neighborhood).
  • v — eigenvector corresponding to the smallest eigenvalue of the decomposition.
  • d — the mean distance of all points within Ωr.

Typically, the mean curvature retrieved from a human face is bigger than that calculated from a fake image. This can serve as a benchmark to identify fake faces.
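The formula can be sketched directly from the definitions above. One detail the source leaves open is the reference point for the mean distance d; the sketch assumes distances are measured from the barycenter:

```python
import numpy as np

def curvature_value(points, p):
    """Approximate curvature C = ((p - b) . v) / d^2 at point p,
    where `points` are the samples in the spherical neighborhood Omega_r."""
    points = np.asarray(points, float)
    p = np.asarray(p, float)
    b = points.mean(axis=0)                           # barycenter of Omega_r
    d = np.linalg.norm(points - b, axis=1).mean()     # mean distance (assumed from b)
    # Eigenvector of the smallest covariance eigenvalue ~ local surface normal.
    _, vecs = np.linalg.eigh(np.cov((points - b).T))
    v = vecs[:, 0]
    if np.dot(v, p - b) < 0:                          # orient toward p for a stable sign
        v = -v
    return np.dot(p - b, v) / d ** 2

# A flat patch (photo-like surface) yields near-zero C; a spherical cap does not.
grid = np.array([[x, y, 0.0] for x in range(-2, 3) for y in range(-2, 3)])
flat_c = curvature_value(grid, np.array([0.0, 0.0, 0.0]))

theta = np.linspace(0.1, 0.5, 5)
phi = np.linspace(0, 2 * np.pi, 8, endpoint=False)
cap = np.array([[np.sin(t) * np.cos(f), np.sin(t) * np.sin(f), np.cos(t)]
                for t in theta for f in phi])
cap_c = curvature_value(cap, np.array([0.0, 0.0, 1.0]))
```

This matches the stated benchmark: the curved cap produces a markedly larger C than the flat patch.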

Content-based Analysis

Content-based analysis is a promising technique that employs upper-body (UB) and spoofing-medium (SM) detectors. These in turn rely on Histogram of Oriented Gradients (HOG) descriptors and linear support vector machines (SVMs). The idea behind this approach is that a person can spot fake media using the context and scenery presented. Thus, the algorithm analyzes the scenic cues of a video, such as face/shoulders/torso alignment. Trained on a dataset of deepfake videos, it showed impressive results: a 3.3-6.8% error rate.
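A minimal sketch of the HOG descriptor that such detectors consume: per-cell orientation histograms of gradient magnitude. This simplification omits the overlapping-block normalization of full HOG, and the SVM on top:

```python
import numpy as np

def hog_descriptor(img, cell=8, bins=9):
    """Unsigned-orientation HOG: accumulate gradient magnitude into
    `bins` orientation histograms per cell x cell patch, then L2-normalize."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180        # unsigned orientation [0, 180)
    h, w = img.shape
    H, W = h // cell, w // cell
    hist = np.zeros((H, W, bins))
    for i in range(H):
        for j in range(W):
            m = mag[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            a = ang[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            idx = (a / (180 / bins)).astype(int) % bins
            for b in range(bins):
                hist[i, j, b] = m[idx == b].sum()
    desc = hist.ravel()
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc
```

In the content-based pipeline, descriptors like this, extracted over upper-body and spoofing-medium windows, are what the linear SVMs classify.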

Morphological Operations-based Analysis

This method applies morphological opening and closing operations to the subject's profile silhouette. Profile shapes are used to create vector data, and the technique also ranks faces by Euclidean distance, among other steps.
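The opening and closing operations on a binary silhouette can be sketched in plain numpy; a 3x3 structuring element is assumed here for illustration:

```python
import numpy as np

def erode(mask):
    """Binary erosion: a pixel survives only if its full 3x3
    neighborhood is set (zero padding at the border)."""
    p = np.pad(mask.astype(bool), 1, constant_values=False)
    h, w = mask.shape
    out = np.ones((h, w), dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out &= p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return out

def dilate(mask):
    """Binary dilation: a pixel is set if any 3x3 neighbor is set."""
    p = np.pad(mask.astype(bool), 1, constant_values=False)
    h, w = mask.shape
    out = np.zeros((h, w), dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return out

def opening(mask):
    """Erosion then dilation: removes specks smaller than the element."""
    return dilate(erode(mask))

def closing(mask):
    """Dilation then erosion: fills holes smaller than the element."""
    return erode(dilate(mask))
```

Applied to a silhouette, opening suppresses stray noise pixels while closing fills small gaps, yielding a cleaner shape for the vectorization step.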

Other Indicators

Other known liveness indicators mostly involve combinations of the above-mentioned methods. More literature on the topic is listed in the references below.

Disadvantages of Active Facial Liveness Detection

As discussed previously, active facial liveness detection has a number of weaknesses including attack vulnerabilities, increased customer friction, additional expenses, and so on. Researchers highlight the following major disadvantages of the active detection:

Poor customer experience

Statistics show that at least 18% of clients abandon their carts due to active liveness detection, which lengthens the checkout procedure. Another example indicates that 40% of customers abandon the onboarding process in retail banking for the same reason. Used commercially, active liveness detection can significantly increase wait times and, in turn, undermine overall customer satisfaction.

Complicated process

The active detection approach requires additional software to be installed on the user's device. It also needs more bandwidth to send the captured images to servers for further analysis. The system may therefore pose challenges in areas where Internet connectivity is poor or costly.


Vulnerability to spoofing

Con artists have adapted to spoof active detection using various techniques. They use Presentation Attack Instruments (PAIs) that vary in quality and ingenuity: from printed photos to elaborate silicone masks and deepfake generators.

Lack of standardization

Active detection has no generally accepted industry standards. This implies that every company or institution faces its own challenges in verifying a person and requires unique countermeasures.

Attempts to Bypass Active Facial Liveness

Impostors can bypass a challenge-based system with various tools. These include:

  • An app like Face Swap Live.
  • 2D and 3D masks with empty eyeholes.
  • A deepfake with the target’s face performing a required task.

As long as a PAI allows the attacker to blink, rotate the head, smile, or move the eyes, there is a chance that an attack against the active system will succeed.

An interesting experiment was held at the University of North Carolina. It showed that it is possible to create a synthetic 3D face model from a collection of photos of any given person. This involves transferring the facial texture, gaze correction, animating facial expressions, and so on.


  1. History of Face Recognition and Facial Recognition software
  2. Dorothy E. Denning - Wikipedia
  3. It's "liveness", not secrecy, that counts
  4. How active liveness differs from passive (Vikram Sareen)
  5. Dorothy E. Denning, author of the liveness concept
  6. Face Liveness and Spoof Face Detection and Role of Different classifiers — An Image Processing Perspective
  7. Definitions of the three orthogonal planes
  8. 1: Human head and its body fixed frame. A: Three orthogonal planes are defined: sagittal, coronal and horizontal. B: Head can be rotated and translated in three orthogonal directions.
  9. Insight on face liveness detection: A systematic literature review
  10. Depth of Field
  11. Face liveness detection by exploring multiple scenic clues
  12. Liveness detection based on 3D face shape analysis
  13. An Overview Of Face Liveness Detection
  14. 30+ Shopping Cart Abandonment Statistics and Strategies for Recouping Lost Sales
  15. The debate over active or passive liveness detection and frictionless biometrics
  16. Facial Liveness Detection: An Essential Biometric Layer To Improve Security And The User Experience
  17. Face Swap Live
  18. A believable silicone mask with eyeholes
  19. Creating a 3D image from a dataset of 2D photos