Deepfake Detection Competitions
Deepfakes: Harmful Potential & Detection Competitions
Deepfakes are a type of falsified media, audio or visual, produced with deep learning. They employ techniques such as face swapping to mimic a target person’s appearance or voice. The technology can be traced back to 1997, when the first digital face manipulation tool, Video Rewrite, was presented by the Interval Research Corporation. In 2017, deepfake videos emerged as a mainstream threat when the tools for producing them became widely available to ordinary users.
Currently, deepfakes are considered a serious threat enabled by AI and machine learning technologies. Crime Science reports that deepfakes are capable of causing devastating societal harm, from political slander to petty theft carried out through realistic impersonation.
As a countermeasure, deepfake detection competitions were proposed to engage anti-spoofing professionals and amateurs in battling this growing threat. Deepfake Detection Challenge 2020 is considered to be the biggest global event targeting deepfakes, with a $1 million prize pool, numerous applications and an extensive deepfake dataset featuring 3,500 paid actors.
Its goal is to find an effective, easy-to-deploy, and highly accurate tool that will detect falsified media quickly, while also providing a full-scale evidence report.
The emergence of deepfakes and their associated harms has spurred regional and international alarm. For instance, the European Parliament published a study, Tackling Deepfakes in European Policy. The document lists the risk categories brought on by the technology, including bullying, extortion, identity theft, and election manipulation.
Deepfake Detection: Competitions, Goals & Results
To spread awareness of the harm caused by deepfakes and to find creative countermeasures against them, deepfake competitions have been organized in the USA, China, Singapore, and European countries. For example, during the Irish Young Scientist & Technology Exhibition (BTYSTE), a school student proposed a novel deepfake detection method that flagged the fake Queen’s speech with "94.3% confidence".
A number of leading challenges and competitions set successful deepfake detection as their goal.
FaceForensics++
FaceForensics++ is a large independent deepfake dataset, which allows training various deepfake detectors. At the moment, it contains 1,000 video sequences with 509,914 images used as bona fide visual data.
As for digital face manipulation, it employs four methods:
- FaceSwap. The method superimposes face regions from a source video onto the destination video by using facial landmark detection, 3D templates, blendshaping, and color correction.
- Face2Face. It involves facial reenactment by transferring the facial expressions from a source video to the destination video. This is done using video input streams and manual keyframe selection.
- Deepfake. Any media, static or live, that was doctored using a deepfake app such as Face Swapper, DeepAR, FaceSwap, FakeApp, etc.
- Neural Textures. It relies on tracked facial geometry and uses a photometric reconstruction loss coupled with an adversarial loss, among other techniques.
The FaceForensics++ challenge invites authors and researchers to test their detection methods against its extensive database. The results need to be submitted to the evaluation server. All FaceForensics++ test data and evaluation results are publicly available.
DFDC
The Deepfake Detection Challenge, or DFDC, is by far the biggest deepfake detection contest. Hosted by Kaggle in partnership with AWS, Facebook, and The Partnership on AI, it attracted the highest number of participants (2,256 teams).
The competition lasted from March to June 2020, offering a $1 million prize pool. The winning team or researcher was offered half of the prize sum.
DFDC has a massive dataset of 128,154 consented videos. It is divided into four parts:
- Training set. It is available for download outside the event.
- Public Validation set. It comprises 400 select videos.
- Public Test set. It backs the public leaderboard on which submitted code is ranked.
- Private Test set. It contains organic videos for the final testing.
The competition had its own application requirements for code, as well as an evaluation algorithm.
[math]\displaystyle{ LogLoss=-\frac{1}{n}\sum_{i=1}^n\bigl[y_i\log\bigl(\hat{y}_i\bigr)+\bigl(1-y_i\bigr)\log\bigl(1-\hat{y}_i\bigr)\bigr] }[/math]
Translation:
- n: The number of videos being predicted.
- ŷi: The predicted probability of video i being fake.
- yi: 1 if video i is fake, 0 if real.
- log(): The natural (base e) logarithm.
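The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the official DFDC scoring code: the function name `log_loss` and the clipping constant `eps` (used here to avoid taking the logarithm of zero) are assumptions made for this example.

```python
import math

def log_loss(labels, predictions, eps=1e-15):
    """Mean log loss for binary labels (1 = fake, 0 = real).

    `eps` is an assumed clipping constant that keeps predictions
    strictly inside (0, 1) so that log() is always defined.
    """
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and equal in length")
    total = 0.0
    for y, p in zip(labels, predictions):
        p = min(max(p, eps), 1.0 - eps)  # clip the predicted probability
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)

# Illustrative values only: two fake videos and two real ones.
labels = [1, 1, 0, 0]
confident = [0.9, 0.8, 0.2, 0.1]
uncertain = [0.5, 0.5, 0.5, 0.5]
print(log_loss(labels, confident))
print(log_loss(labels, uncertain))
```

A model that always predicts 0.5 scores ln 2 (about 0.693), so a useful detector needs to score below that value.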
Deeper Forensics Challenge 2020
Deeper Forensics Challenge 2020 focuses on "real-world face forgery detection". The competition is built on DeeperForensics-1.0, a massive dataset of 60,000 videos comprising 17.6 million frames.
The challenge lasted for 9 weeks, 8 of which were dedicated to the development phase. The final week was reserved for testing and evaluation. Like DFDC, the Deeper Forensics challenge applied the binary cross-entropy loss (BCELoss) to evaluate the competing solutions:
[math]\displaystyle{ BCELoss=-\frac{1}{N}\sum_{i=1}^N[y_i\cdot\log\bigl(p\bigl(y_i\bigr)\bigr)+\bigl(1-y_i\bigr)\cdot\log\bigl(1-p\bigl(y_i\bigr)\bigr)] }[/math]
Translation:
- N: The number of videos in the hidden test set.
- yi: 1 if video i is fake, 0 if it is real.
- p(yi): The predicted probability that video i is fake.
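As a quick sanity check of the formula, consider an illustrative case (the numbers are assumptions, not challenge data) with N = 2: one fake video (y1 = 1) predicted as fake with probability 0.9, and one real video (y2 = 0) predicted as fake with probability 0.2. Assuming the natural logarithm:
[math]\displaystyle{ BCELoss=-\frac{1}{2}\bigl[\log\bigl(0.9\bigr)+\log\bigl(1-0.2\bigr)\bigr]\approx-\frac{1}{2}\bigl[-0.1054-0.2231\bigr]\approx 0.164 }[/math]
The closer the predictions are to the true labels, the closer the loss gets to 0.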
A total of 25 teams took part in the contest. The winning team’s detection strategy includes three steps:
- Face Extraction. Fifteen frames are extracted from each video with VideoCapture. MTCNN then detects the face region frame by frame.
- Classification. Determines the probability of a face being fake. Three EfficientNet models (B0, B1, and B2) play a major part.
- Output. The face scores for the extracted frames are averaged, giving the predicted probability that the video is falsified.
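The three steps above can be outlined in plain Python. This is a hedged sketch rather than the winning team's code: the function names are hypothetical, evenly spaced frame sampling is an assumption (the source only states that 15 frames are extracted), and the models are stand-in callables that return a fake probability for a face crop.

```python
from statistics import mean

def sample_frame_indices(total_frames, num_samples=15):
    """Pick up to `num_samples` evenly spaced frame indices (assumed strategy)."""
    if total_frames <= 0:
        return []
    count = min(num_samples, total_frames)
    step = total_frames / count
    return [int(i * step) for i in range(count)]

def video_fake_probability(face_crops, models):
    """Average the per-face scores over all models, then over all faces."""
    face_scores = [mean(model(face) for model in models) for face in face_crops]
    if not face_scores:
        return 0.5  # assumed fallback when no face was detected
    return mean(face_scores)

# Stand-ins for three trained classifiers; real models would take image tensors.
models = [lambda face: 0.2, lambda face: 0.4, lambda face: 0.6]
faces = ["face_from_frame_0", "face_from_frame_20"]  # placeholders for face crops
print(sample_frame_indices(300))
print(video_fake_probability(faces, models))
```

In a real pipeline the indices would be passed to a video reader, and a face detector would produce the crops before classification.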
DFGC 2021
Deepfake Game Competition 2021 is a two-track contest: it includes Deepfake Creation and Deepfake Detection tracks. In the Creation track, contestants were tasked with creating 1,000 face-swap images. In the Detection track, the task was to build detection solutions based on the Celeb-DF dataset.
This dataset contains 5,639 high-quality deepfake videos. While the created deepfakes were assessed with several similarity metrics, the detection solutions were evaluated with the area under the receiver operating characteristic curve (AUC ROC).
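The AUC ROC metric can be read as the probability that a randomly chosen fake sample receives a higher score than a randomly chosen real one. The sketch below computes it by direct pairwise comparison. It is an illustration under that interpretation, not the organizers' evaluation code, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC ROC via pairwise comparison of fake (1) and real (0) scores.

    Ties between a fake and a real score count as half a win.
    """
    fake = [s for y, s in zip(labels, scores) if y == 1]
    real = [s for y, s in zip(labels, scores) if y == 0]
    if not fake or not real:
        raise ValueError("both classes must be present")
    wins = 0.0
    for f in fake:
        for r in real:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5
    return wins / (len(fake) * len(real))

# Illustrative scores only: one fake sample is ranked below a real one.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```

The pairwise form is quadratic in the number of samples, so it is suitable for explanation and small checks rather than large test sets.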
Winners in the Creation track applied FaceShifter and adversarial noise to make their deepfakes harder to detect. In the Detection track, the winning solution used EfficientNet-B3 and MTCNN.
OpenMFC
Open Media Forensics Challenge is hosted by the National Institute of Standards and Technology (NIST).
Its primary tasks are:
- IMDL. Image Manipulation Detection and Localization to detect any possible doctoring of the test image.
- VMD. Video Manipulation Detection to discover video manipulations.
- IGMDL. Image GAN Manipulation Detection and Localization for detecting Generative Adversarial Network intervention.
- VGMD. Video GAN Manipulation Detection applies the same approach to videos.
It is based on a system of leaderboards, which are updated as soon as a new submission is made. This makes comparison of the existing methods easier.
ForgeryNet
ForgeryNet is based on the eponymous dataset with 2.9 million images and 221,247 videos. The dataset currently covers 7 image-level and 8 video-level forgery approaches, which makes it a great source for training AI-based solutions.
Evaluation relies on the AUC ROC, which is built from True Positive Rate (TPR) and False Positive Rate (FPR) values. This requires two classes: Positive for falsified media and Negative for bona fide images and videos.
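With falsified media as the Positive class, TPR and FPR at a given decision threshold can be computed as in the sketch below. The function name `tpr_fpr` and the sample values are assumptions for illustration. Sweeping the threshold from high to low yields the points of the ROC curve.

```python
def tpr_fpr(labels, scores, threshold):
    """Return (TPR, FPR) when scores >= threshold are flagged as fake.

    labels: 1 for falsified media (Positive), 0 for bona fide media (Negative).
    """
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        flagged = s >= threshold
        if y == 1 and flagged:
            tp += 1
        elif y == 1:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Illustrative data: two fake and two bona fide samples.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
for threshold in (0.95, 0.5, 0.0):
    print(threshold, tpr_fpr(labels, scores, threshold))
```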
The winning deepfake detection solution included the following elements:
- DFQ. Dynamic Feature Queue guarantees a more comprehensive metric space for metric learning, which ensures that model training remains stable.
- Global Patchwise Consistency. It is employed as a cue to detect fake media without face detection.
- Video Track Processing. Provides rapid forgery verification, as well as temporal localization by sampling the video in question.
The solution is based on ResNeSt-50. However, face detection is omitted to save resources and time: instead the whole frame or image is analyzed.
FAQ
Are There Any Deepfake Detection Experiments or Competitions?
Researchers and companies regularly host experiments and contests dedicated to deepfake detection.
A number of experiments and competitions are dedicated to deepfake detection to boost awareness of the issue and its impact on biometric security. For instance, one group of experiments focused on how human performance compares to that of machine algorithms.
The results of these experiments revealed that tools based on machine learning show high accuracy at distinguishing fake faces from real ones. The experiments were held by institutions like Stanford, University of Lincoln, MIT, and others.
Global deepfake detection challenges have shown that Convolutional Neural Networks (CNNs) and EfficientNet in particular provide generally high accuracy. CNN-based detection can be applied in various fields: from IoT to law enforcement.
What is DFDC?
DFDC is a large-scale deepfake detection challenge.
Deepfake Detection Challenge (DFDC) is one of the most prominent deepfake detection contests. With a prize pool of $1 million, it saw 2,256 participating teams. The event was organized and sponsored by Kaggle, Facebook, and other major stakeholders.
DFDC's dataset had 128,154 videos. Contestants were tasked with developing an antispoofing tool that could successfully detect deepfakes and differentiate them from authentic media. The winning solution showed 65.18% accuracy when applied to the "black box" dataset. It was MTCNN-based and included the albumentations library, an EfficientNet-B7 encoder, and other components.
What are the Main Deepfake Detection Competitions?
There are a number of key deepfake detection contests.
A large number of challenges have been proposed over time to raise awareness and discover the most effective tools to detect fabricated media, from audio to video. Deepfake Detection Challenge (DFDC) is one of the most recognizable events, with a $1 million prize pool and more than 2,000 participating teams. The winning solution combined MTCNN with an EfficientNet encoder.
Other notable events include FaceForensics++, DeeperForensics Challenge 2020, Deepfake Game Competition 2021, Open Media Forensics Challenge, ForgeryNet, etc. Interestingly, every winning deepfake detection tool in the listed contests was based on a Convolutional Neural Network.
Is Deepfake Detection Successful with Neural Networks?
Convolutional neural networks have proven to be an effective method of detecting deepfake media.
Antispoofing researchers and deepfake detection challenges show that neural network tools are highly effective.
The winning solution in the DFDC contest used a Multi-task Cascaded Convolutional Network (MTCNN) together with an EfficientNet-B7 encoder. Other challenges also saw successful performance from EfficientNet models. This CNN architecture steadily shows impressive results: EfficientNet-B7, for example, is reported to reach 84.3% top-1 accuracy on ImageNet.
CNNs demonstrate a high potential in detecting falsified media. Owing to their three-part architecture of convolutional, pooling, and fully connected layers, they are well suited to processing various kinds of data, including videos and human speech.
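To make the three-part architecture concrete, the sketch below runs a single forward pass through one convolutional layer, one pooling layer, and one fully connected layer in plain Python. It is a toy illustration with made-up weights, not a trained detector or any contest solution, and all function names are hypothetical.

```python
import math

def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation) of a grayscale image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)] for r in range(out_h)]

def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]

def dense_sigmoid(features, weights, bias):
    """Fully connected layer with a sigmoid output (probability of 'fake')."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def tiny_cnn(image, kernel, weights, bias):
    """Convolution -> ReLU -> pooling -> flatten -> fully connected."""
    pooled = max_pool(relu(conv2d(image, kernel)))
    flat = [v for row in pooled for v in row]
    return dense_sigmoid(flat, weights, bias)

# Made-up 5x5 "image", 2x2 kernel, and dense weights for the 4 pooled features.
image = [[float((r + c) % 3) for c in range(5)] for r in range(5)]
kernel = [[1.0, -1.0], [-1.0, 1.0]]
print(tiny_cnn(image, kernel, weights=[0.5, -0.5, 0.25, 0.1], bias=0.0))
```

Real detectors such as the EfficientNet models stack many such layers and learn the kernels and weights from labeled data.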
References
- Video Rewrite: Driving Visual Speech with Audio
- ‘Deepfakes’ ranked as most serious AI crime threat
- Deepfake detection contest winner still guesses wrong a third of the time
- Celeb-DF: A New Dataset for DeepFake Forensics
- Tackling deepfakes in European policy
- Young Scientist: Cork student wins with programme to detect ‘deepfakes’
- Channel 4’s ‘deepfake’ Queen’s speech sparks hundreds of complaints to Ofcom
- FaceForensics++: Learning to Detect Manipulated Facial Images
- How To Install FakeApp
- Photometric reconstruction loss
- DeepFake Detection on paperswithcode.com
- The Partnership on AI Steering Committee on AI and Media Integrity
- Sample from the DFDC deepfake dataset
- Deepfake Detection Challenge
- Handbook of Digital Face Manipulation and Detection From DeepFakes to Morphing Attacks
- DeeperForensics-1.0
- Simple Neural Network with BCELoss for Binary classification for a custom Dataset
- Celeb-DF: A Large-scale Challenging Dataset for DeepFake Forensics
- AUC ROC
- DFGC 2021: A DeepFake Game Competition
- Open Media Forensics Challenge
- National Institute of Standards and Technology
- Deepfake data used in OpenMFC
- ForgeryNet - Face Forgery Analysis Challenge 2021: Methods and Results
- ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
- ResNeSt-50