Deepfakes: Brief History
The first technology similar to facial deepfakes appeared in 1997, when the Video Rewrite tool was presented. It was based on automatic phoneme labeling, which made it possible to match existing footage to a new soundtrack. The tool was successfully applied to alter a few fragments of a speech by John F. Kennedy.
Prior to that, Artificial Neural Networks (ANNs), envisioned in 1943 by McCulloch and Pitts, had a resurgence in the 1980s, when metal-oxide-semiconductor (MOS) and complementary MOS (CMOS) technologies provided more computational power. In addition, a number of research facilities ventured into developing ANNs after a series of works on the topic, including John Hopfield’s paper, were published.
In 2006, deep learning was established as a primary method for training Artificial Intelligence (AI). In 2014, the Generative Adversarial Network (GAN) was introduced by Ian Goodfellow et al. The proposed architecture could automatically generate images, video, and audio. Shortly afterwards, various GAN implementations became open-source. In 2017, a Reddit user nicknamed ‘deepfakes’ used a GAN-based face-swapping tool to create fake pornographic material.
Subsequently, a wide range of applications and websites made deepfake technology accessible to ordinary users. Among them are MyHeritage, DeepFaceLab, SteosVoice, and others. While most of them focus on entertainment or practical applications, such as old footage restoration, many can also be used for fraudulent and harmful purposes.
Deepfake Websites: Technical Details
Highly realistic deepfake images are typically produced with a GAN. The core elements of a GAN system include:
- Noise signals.
- Image generator, which creates images from random noise.
- Discriminator, which decides whether an image is authentic or artificial.
Since the generator and the discriminator are set to rival each other, the process is generative and adversarial at the same time. Image generation is preceded by a training phase, during which a large number of real samples are fed to the network.
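The adversarial loop described above can be illustrated with a deliberately tiny example. The sketch below is not an image GAN: it assumes one-dimensional "samples", a linear generator G(z) = a*z + b, and a logistic discriminator D(x) = sigmoid(w*x + c), with gradients written out by hand. All function names, parameter names, and the target distribution are hypothetical choices made for illustration only.

```python
import math
import random

def sigmoid(t):
    # Clamp the input so math.exp cannot overflow.
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))

def d_loss(w, c, real, fake):
    """Discriminator loss: -mean log D(real) - mean log(1 - D(fake))."""
    real_term = sum(math.log(sigmoid(w * x + c)) for x in real) / len(real)
    fake_term = sum(math.log(1.0 - sigmoid(w * x + c)) for x in fake) / len(fake)
    return -(real_term + fake_term)

def g_loss(a, b, w, c, noise):
    """Non-saturating generator loss: -mean log D(G(z))."""
    return -sum(math.log(sigmoid(w * (a * z + b) + c)) for z in noise) / len(noise)

def d_step(w, c, real, fake, lr):
    """One gradient-descent step on the discriminator loss."""
    gw = gc = 0.0
    for x in real:  # gradient of -log D(x)
        d = sigmoid(w * x + c)
        gw -= (1.0 - d) * x / len(real)
        gc -= (1.0 - d) / len(real)
    for x in fake:  # gradient of -log(1 - D(x))
        d = sigmoid(w * x + c)
        gw += d * x / len(fake)
        gc += d / len(fake)
    return w - lr * gw, c - lr * gc

def g_step(a, b, w, c, noise, lr):
    """One gradient-descent step on the generator loss."""
    ga = gb = 0.0
    for z in noise:  # gradient of -log D(a*z + b)
        d = sigmoid(w * (a * z + b) + c)
        ga -= (1.0 - d) * w * z / len(noise)
        gb -= (1.0 - d) * w / len(noise)
    return a - lr * ga, b - lr * gb

def train(steps=500, batch=32, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters
    w, c = 0.0, 0.0  # discriminator parameters
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]   # stand-in for real samples
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]  # noise signal
        fake = [a * z + b for z in noise]                    # generator output
        w, c = d_step(w, c, real, fake, lr)  # discriminator learns to separate
        a, b = g_step(a, b, w, c, noise, lr)  # generator learns to fool it
    return a, b, w, c
```

Calling `train()` alternates the two updates, which is the rivalry the text refers to. Real image GANs replace both linear models with deep convolutional networks and rely on automatic differentiation, and their training is considerably less well-behaved than this toy setting suggests.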
This is just a basic framework. A GAN architecture may differ and comprise extra elements, depending on the variant: Deep Convolutional GAN (DCGAN), Cycle-GAN, Information Maximizing Generative Adversarial Network (InfoGAN), and others. As a result, an advanced GAN-based architecture can successfully imitate liveness in images or videos.
To make an image realistic, various components are applied: bilinear up- and downsampling operations, tuned hyperparameters, mixing regularization (which uses two random latent codes and gives fine control over the created images), adaptive instance normalization (AdaIN), etc. Image quality is analyzed with metrics such as Fréchet inception distance (FID) and Precision and Recall (P&R).
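Of the components listed above, adaptive instance normalization is compact enough to sketch. The idea is to normalize each feature-map channel to zero mean and unit variance, then rescale and shift it with a per-channel scale and bias derived from the style. The function below is a minimal pure-Python illustration under assumed conventions: feature maps are nested lists, and the names `adain`, `scales`, and `biases` are hypothetical, not taken from any particular library.

```python
import math

def adain(features, scales, biases, eps=1e-8):
    """Adaptive instance normalization over a list of channels.

    features: list of channels, each a 2D list (height x width) of floats.
    scales, biases: one style-derived scale and bias per channel.
    """
    out = []
    for channel, scale, bias in zip(features, scales, biases):
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = math.sqrt(var + eps)  # eps avoids division by zero on flat channels
        # Normalize the channel, then apply the style's scale and bias.
        out.append([[scale * (v - mean) / std + bias for v in row] for row in channel])
    return out
```

After the call, each output channel has a mean approximately equal to its bias and a standard deviation approximately equal to its scale, which is how the style can override the statistics of the incoming features.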
Stochastic details (such as facial hair, freckles, eyeglasses, wrinkles) can be realized through spatially varying pseudorandom number generation. For example, StyleGAN achieves this effect by adding per-pixel noise after each convolution.
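The noise injection mentioned above can be sketched as follows. The example assumes a simplified scheme in which a single Gaussian noise map is drawn per layer and added to every channel, multiplied by a per-channel strength; in a trained network those strengths would be learned. The function name and argument names are hypothetical and used for illustration only.

```python
import random

def add_per_pixel_noise(features, strengths, rng):
    """Add one shared noise map to every channel, scaled per channel.

    features: list of channels, each a 2D list (height x width) of floats.
    strengths: one scaling factor per channel (assumed to be learned elsewhere).
    rng: a random.Random instance, so results can be reproduced.
    """
    height, width = len(features[0]), len(features[0][0])
    # A single-channel noise image with one independent value per pixel.
    noise = [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]
    return [
        [[channel[i][j] + strength * noise[i][j] for j in range(width)]
         for i in range(height)]
        for channel, strength in zip(features, strengths)
    ]
```

A strength of zero leaves a channel unchanged, while larger strengths let the noise influence fine detail more strongly. Because the noise differs per pixel, it affects local texture rather than the overall identity of the face.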
Deepfake Photo Generators
Currently, a large number of web platforms can produce highly realistic fake images. They are used for various purposes, from education to creating fictional characters and assembling deepfake training datasets.
This Person Does Not Exist (TPDE)
This Person Does Not Exist was launched by Phillip Wang, who created the website for demonstration purposes. His initial goal was to attract a group of experts to research AI-related issues. However, the website was made available to the general public to raise awareness of the dangers posed by deepfake media.
The face generator is powered by Nvidia’s StyleGAN and is reportedly visited by 15,845 unique users daily. It has also spawned a number of clones dedicated to cats, rentals, job applicant resumes, vocabulary, and so on.
Generated Photos is a commercial website that lets users generate and customize faces of nonexistent people. Its minimalistic toolkit enables users to tweak facial expression, skin tone, hair color and length, gender, head positioning, etc.
Masque.ai is a service with a limited set of options: it allows selecting age, gender, and race. It is powered by StyleGAN2. Unlike TPDE, Masque.ai protects its licensable images with a watermark.
Founded in 2020, Ganvatar synthesizes images that can be used as game assets, as realistic avatars for social media, for modeling longitudinal medical imagery, or as ‘super-resolution’ pictures. Three parameters (age, gender, emotion) can be personalized in the application.
Deepfake Photo Contests
Deepfake photo contests have been launched in recent years to discover promising solutions for detecting fake imagery.
Which Face Is Real?
This competition was established in 2019 by authors Jevin West and Carl Bergstrom. It aimed to help regular people detect falsified imagery. Among other features, WFIR’s website provides a guideline on how to detect GAN-generated photos. Forgery indicators include image blobs, bizarre backgrounds, facial asymmetry, crooked eyeglasses, smudged-looking hair, etc.
Human or AI?
Human or AI was a gamified challenge that invited users to tell real photos from fabricated ones. It comprised two datasets released by Nvidia with authentic and fake pictures. The website is currently inactive.
YouTube's anti-spoofing community has also focused on the issue of deepfakes, and several channels are dedicated to exploring this technology.
Ctrl Shift Face
Ctrl Shift Face is a channel dedicated to digital facial manipulations, breakdowns of deepfake visual effects, and ‘prank’ videos in which the lead actors of a movie are replaced with other actors. The channel demonstrates how high-quality CGI effects can be achieved with relatively simple open-source code.
EZRyderX47 is a channel launched by a Canadian self-taught deepfake creator. The channel focuses on ‘movie remixes’ where original actors undergo facial editing to achieve likeness to a role.
BabyZone is another channel that, along with gameplay walkthroughs, uploads remixes of movies and video game trailers and cutscenes. The videos are created with DeepFaceLab 2.0, which is used for training and character modeling.
Derpfakes, or James Southgate, is a professional visual effects artist who allegedly contributes to the Deep Voodoo studio (founded by T. Parker and M. Stone). His channel offers a variety of entertaining videos and deepfake production tutorials.
Tero Karras FI
Tero Karras is a researcher at Nvidia Research. His channel, Tero Karras FI, demonstrates the capabilities of style-based GAN architectures.
Dr. Fakenstein (Peter White) is a self-taught AI artist from New Zealand. He creates deepfakes on a computer assembled from spare parts that were originally used for Bitcoin mining. His channel focuses on entertainment.
Birbfakes is a minor YouTube channel that features videos made with face swapping. The channel does not specify which ANN is used for this purpose.
Other Notable Bloggers
Apart from YouTube, deepfakes are also widely used on other social media.
Azusa Gakuyuki is a Japanese male blogger who gained online recognition after posing as a 20-year-old woman while being 50 years old at the time. He orchestrated the hoax with the FaceSwap application.
Detecting Deepfakes & Spoofing Online
Several initiatives, such as C2PA and CAI, have been launched to confront deepfake proliferation. One such initiative is FotoForensics.com, a nonprofit organization dedicated to detecting and exposing fake imagery. It is based on the concept of another free service, errorlevelanalysis.com, which was created in 2010 and has since been discontinued. There, users could submit pictures for evaluation with Error Level Analysis (ELA). Read more at Deepfake Detection Software: Types and Practical Application.