When Did GenAI Appear?

The foundation for GenAI was laid with the introduction of the Markov chain (1906-1913), a statistical model in which new data sequences are generated from the transition patterns observed in existing sample data, as sketched below.
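To make the idea concrete, here is a minimal, hypothetical sketch of first-order Markov-chain text generation: it counts which word follows which in a tiny invented corpus, then samples successors to produce a new sequence. The corpus and all variable names are made up for illustration.

```python
import random
from collections import defaultdict

# Toy corpus standing in for the "current sample data".
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count word-to-word transitions (a first-order Markov chain).
transitions = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    transitions[current_word].append(next_word)

# Generate a new sequence by repeatedly sampling an observed successor.
random.seed(0)
word, generated = "the", ["the"]
for _ in range(6):
    successors = transitions.get(word)
    if not successors:          # dead end: this word never appeared mid-corpus
        break
    word = random.choice(successors)
    generated.append(word)
print(" ".join(generated))
```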
But it wasn’t until the 1950s and 1960s that generative AI took physical form: Rosenblatt’s Perceptron, the ILLIAC I music experiment, and later the mid-1960s chatbot ELIZA.
Preliminary Stage
- Perceptron
The image-classifying perceptron, designed by Frank Rosenblatt in 1957, is widely regarded as the first trainable artificial neural network, and the principles it introduced underpin the generations of neural architectures that followed. It builds on the McCulloch-Pitts model of the artificial neuron published in 1943; the learning rule is sketched below.
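As a rough illustration (not Rosenblatt’s original Mark I hardware), the sketch below trains a single McCulloch-Pitts-style neuron with the perceptron learning rule on a toy, linearly separable problem; the data and all parameter values are invented for the example.

```python
import numpy as np

# Toy, linearly separable data: label 1 only when both inputs are 1 (logical AND).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

weights, bias, learning_rate = np.zeros(2), 0.0, 0.1

# Perceptron learning rule: nudge the weights whenever a prediction is wrong.
for _ in range(20):
    for inputs, target in zip(X, y):
        prediction = 1 if inputs @ weights + bias > 0 else 0   # step activation
        error = target - prediction
        weights += learning_rate * error * inputs
        bias += learning_rate * error

print([1 if x @ weights + bias > 0 else 0 for x in X])   # should reproduce y
```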

- Illiac Suite
One of the first known examples of computer-generated content was produced with the ILLIAC I mainframe computer: the Illiac Suite, a composition for string quartet, created in an experiment run by Lejaren Hiller and Leonard Isaacson. In 1960 the Soviet mathematician Rudolf Zaripov published a paper analyzing how a melody could be constructed algorithmically.
- ELIZA
Known as the “first chatbot in history”, ELIZA was designed in 1964-1967 by the MIT researcher Joseph Weizenbaum. It ran the DOCTOR script, which imitated Carl Rogers’ person-centered psychotherapy approach by reflecting patients’ statements back at them. ELIZA also offered a teaching mode that allowed users to influence its behavior.

- Hopfield Network
Introduced by John Hopfield in 1982, the Hopfield network isn’t a GenAI model per se. However, its recurrent architecture, whose dynamics minimize an energy (Lyapunov) function, lets it emulate associative memory: a stored pattern can be recalled from a partial or noisy cue, as sketched below.
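A minimal sketch of that recall mechanism, assuming the classic Hebbian outer-product storage rule for binary (+1/-1) patterns; the pattern, its size, and the amount of corruption are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Store one binary (+1/-1) pattern with the Hebbian outer-product rule.
pattern = rng.choice([-1, 1], size=16)
W = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(W, 0)                    # no self-connections

def energy(s):
    # Lyapunov/energy function; asynchronous updates never increase it.
    return -0.5 * s @ W @ s

# Corrupt a few bits to simulate a partial or noisy memory cue.
state = pattern.copy()
state[:4] *= -1

# Asynchronous updates: each neuron flips toward the sign of its local field.
for _ in range(5):
    for i in range(len(state)):
        state[i] = 1 if W[i] @ state >= 0 else -1

print("recovered:", bool((state == pattern).all()), " energy:", energy(state))
```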

- Long Short-Term Memory
Long Short-Term Memory (LSTM) is a variant of the Recurrent Neural Network (RNN) that tackles the vanishing gradient issue, a problem in which the gradients that should update the network’s weights during training shrink toward zero, so long-range dependencies are never learned. LSTM is widely used for sentiment analysis, video analysis, language modelling, etc. It was first proposed by Sepp Hochreiter and Jürgen Schmidhuber in a 1995 technical report and published in 1997; a single LSTM step is sketched below.
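This compact sketch of one LSTM step assumes the standard forget/input/output gate equations with randomly initialized, untrained parameters; it shows how the cell state is updated additively, which is what keeps gradients from vanishing over long sequences.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
hidden = 4
# Randomly initialized parameters for one LSTM cell (illustrative, untrained).
W = {gate: rng.normal(scale=0.1, size=(hidden, hidden + 1)) for gate in "fioc"}
b = {gate: np.zeros(hidden) for gate in "fioc"}

def lstm_step(x, h, c):
    z = np.concatenate([h, x])               # previous hidden state + new input
    f = sigmoid(W["f"] @ z + b["f"])          # forget gate
    i = sigmoid(W["i"] @ z + b["i"])          # input gate
    o = sigmoid(W["o"] @ z + b["o"])          # output gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])    # candidate cell update
    c = f * c + i * c_tilde                   # additive cell-state path
    h = o * np.tanh(c)
    return h, c

h, c = np.zeros(hidden), np.zeros(hidden)
for t in range(10):                           # run over a toy 1-D input sequence
    h, c = lstm_step(np.array([np.sin(t)]), h, c)
print(h)
```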
Modern Stage
- Variational Auto-Encoders (VAEs)
Introduced in 2013 by D. P. Kingma and M. Welling, the VAE is a generative model capable of synthesizing new data that resembles the previously sampled input. It first encodes the input into a distribution within a probabilistic latent space, then decodes samples from that distribution to create new data.
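The core mechanism (encode to a distribution, sample via the reparameterization trick, decode) can be sketched without a training loop; the linear “encoder” and “decoder” below use made-up random weights and stand in for trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, latent_dim = 8, 2

# Stand-ins for trained networks: linear encoder and decoder (illustrative only).
enc_mu = rng.normal(size=(latent_dim, input_dim))
enc_logvar = rng.normal(size=(latent_dim, input_dim))
dec = rng.normal(size=(input_dim, latent_dim))

x = rng.normal(size=input_dim)                # one input sample

# Encode: the input maps to a distribution in latent space, not a single point.
mu, logvar = enc_mu @ x, enc_logvar @ x

# Reparameterization trick: z = mu + sigma * eps keeps sampling differentiable.
eps = rng.normal(size=latent_dim)
z = mu + np.exp(0.5 * logvar) * eps

# Decode the latent sample back into data space to obtain a new data point.
x_new = dec @ z
print(x_new)
```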
- Generative Adversarial Networks (GANs)
A GAN is a model consisting of two main elements, a generator and a discriminator, which “compete” with each other to produce the most realistic output. The generator keeps producing new samples, while the discriminator compares them to real-world training data and decides whether they look “real” or “fake”. GANs, developed by Ian Goodfellow et al. in 2014, underpinned the wave of deepfakes that began around 2017.
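The adversarial objective can be shown in a few lines, assuming a toy 1-D data distribution and deliberately simplistic stand-ins for the two networks (a shift for the generator, a thresholded sigmoid for the discriminator); a real GAN would alternate gradient updates of both, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy stand-ins: "real" data is N(3, 1); generator and discriminator are tiny
# parametric functions with made-up parameters, not trained neural networks.
def generator(noise, shift):
    return noise + shift                       # tries to mimic the real data

def discriminator(sample, threshold):
    return sigmoid(sample - threshold)         # probability the sample is "real"

real = rng.normal(loc=3.0, scale=1.0, size=256)
fake = generator(rng.normal(size=256), shift=1.0)

# D wants high scores on real data and low scores on fakes; G wants its fakes
# to score high. Training alternates updates so the two keep "competing".
d_loss = -np.mean(np.log(discriminator(real, 2.0)) +
                  np.log(1.0 - discriminator(fake, 2.0)))
g_loss = -np.mean(np.log(discriminator(fake, 2.0)))
print(f"discriminator loss: {d_loss:.3f}  generator loss: {g_loss:.3f}")
```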
- WaveNet
Emerging in 2016, WaveNet was one of the first models capable of imitating the human voice in a convincing manner. Building on the PixelCNN architecture, it uses dilated causal convolutions to capture long-range temporal dependencies in raw audio.
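Below is a bare-bones illustration of a dilated causal convolution, the building block referred to above: each output sample depends only on past samples, and stacking layers with doubling dilation widens the receptive field exponentially. The filter values and input signal are invented for the example.

```python
import numpy as np

def dilated_causal_conv(signal, kernel, dilation):
    """1-D causal convolution: output[t] mixes signal[t], t - d, t - 2d, ..."""
    out = np.zeros_like(signal)
    for t in range(len(signal)):
        for k, w in enumerate(kernel):
            tap = t - k * dilation
            if tap >= 0:                   # causal: never look into the future
                out[t] += w * signal[tap]
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=32)                    # stand-in for raw audio samples
kernel = np.array([0.5, 0.5])

# Dilations 1, 2, 4, 8 with a 2-tap kernel give a 16-sample receptive field.
h = x
for dilation in (1, 2, 4, 8):
    h = dilated_causal_conv(h, kernel, dilation)
print(h[:5])
```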
- Transformer
The Transformer, first described in 2017 by A. Vaswani et al. in the paper “Attention Is All You Need”, is a self-attention model: every element of the input is weighted against every other element, near or distant, to capture how they interact. It was originally applied to machine translation and paved the way for the GPT models.
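Here is a minimal scaled dot-product self-attention step in plain NumPy, assuming random, untrained projection matrices; it shows how every token is weighted against every other token in a single matrix operation.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8                    # 5 tokens, 8-dimensional embeddings

tokens = rng.normal(size=(seq_len, d_model))           # stand-in token embeddings
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

Q, K, V = tokens @ W_q, tokens @ W_k, tokens @ W_v

# Scaled dot-product attention: every token attends to every other token.
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)         # softmax over positions
output = weights @ V

print(weights.round(2))    # each row: how much one token attends to the others
print(output.shape)        # (5, 8): contextualised token representations
```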
- GPT
The generative pre-trained transformer (GPT), introduced by OpenAI in 2018, combines the Transformer architecture with generative pre-training: the model is first trained on large amounts of unlabelled text in an unsupervised fashion and then fine-tuned for specific tasks.
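The “unsupervised” part simply means the labels come from the text itself: the target for each position is the next token. A tiny sketch of how unlabelled text becomes training pairs (the whitespace “tokenizer” is a deliberate simplification):

```python
# Unlabelled text supervises itself: the target for each token is the next one.
text = "generative pre-training needs no manual labels at all"
tokens = text.split()                 # naive whitespace "tokenizer"

training_pairs = [
    (tokens[: i + 1], tokens[i + 1])  # (context so far, next token to predict)
    for i in range(len(tokens) - 1)
]

for context, target in training_pairs[:3]:
    print(context, "->", target)
```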
- NeRF
A Neural Radiance Field (NeRF), introduced in 2020, is a model capable of synthesizing novel views of complex scenes. It represents a scene as a function that is queried with 5D coordinates (a 3D position plus a 2D viewing direction) and outputs a volume density and view-dependent emitted radiance.
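To make the 5D query concrete, the sketch below replaces NeRF’s trained MLP with a made-up stand-in function and composites a handful of samples along one camera ray; every numeric value here is invented for illustration.

```python
import numpy as np

def radiance_field(position, direction):
    """Stand-in for NeRF's trained MLP: 5D query -> (density, RGB radiance)."""
    x, y, z = position
    theta, phi = direction
    density = np.exp(-(x**2 + y**2 + z**2))              # made-up density falloff
    rgb = 0.5 + 0.5 * np.array([np.sin(theta),           # made-up view-dependent
                                np.cos(phi),             # emitted radiance
                                np.sin(theta + phi)])
    return density, rgb

# Sample points along one camera ray, then alpha-composite them into a pixel.
ts = np.linspace(0.0, 2.0, num=8)
origin, ray_dir = np.array([0.0, 0.0, -1.0]), np.array([0.0, 0.0, 1.0])
view = (0.3, 1.2)                                        # (theta, phi) of the ray

color, transmittance = np.zeros(3), 1.0
for i in range(len(ts) - 1):
    point = origin + ts[i] * ray_dir
    density, rgb = radiance_field(point, view)
    alpha = 1.0 - np.exp(-density * (ts[i + 1] - ts[i])) # volume rendering step
    color += transmittance * alpha * rgb
    transmittance *= 1.0 - alpha
print(color)
```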
- DALL-E
Based on a version of the GPT-3 architecture with 12 billion parameters, DALL-E interprets text and creates images from written user prompts. It was introduced by OpenAI in 2021.
- GitHub Copilot
In 2021 GitHub, together with OpenAI, launched GitHub Copilot, an assistant tool powered by the Codex model. It helps programmers develop their code by suggesting fitting lines, functions, and other elements as they type.
- MusicLM
Introduced by Google in 2023, MusicLM is a model designed to synthesize music from text prompts. Its architecture includes w2v-BERT to generate semantic tokens for audio, MuLan, a joint music-text embedding network that links text descriptions to audio, and other components.
- Sora
Introduced by OpenAI in 2024, Sora is a diffusion-transformer model designed for video synthesis. It draws inspiration from how Large Language Models (LLMs) gain broad capabilities by training on massive amounts of data.