When Did GenAI Appear?

The foundation for GenAI was laid with the introduction of the Markov chain (1906-1913), a statistical model in which new data sequences are generated from the transition patterns observed in existing sample data, as sketched below.
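To make the idea concrete, here is a minimal, hypothetical sketch of first-order Markov-chain text generation: it counts which word follows which in a tiny invented corpus, then samples successors to produce a new sequence. The corpus and all variable names are made up for illustration.

```python
import random
from collections import defaultdict

# Toy corpus standing in for the "current sample data".
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count word-to-word transitions (a first-order Markov chain).
transitions = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    transitions[current_word].append(next_word)

# Generate a new sequence by repeatedly sampling an observed successor.
random.seed(0)
word, generated = "the", ["the"]
for _ in range(6):
    successors = transitions.get(word)
    if not successors:          # dead end: this word never appeared mid-corpus
        break
    word = random.choice(successors)
    generated.append(word)
print(" ".join(generated))
```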
But it wasn’t until the 1950s and 1960s that generative AI took physical form: Rosenblatt’s Perceptron, the ILLIAC I music experiment, and later the mid-1960s chatbot ELIZA.
Preliminary Stage
- Perceptron
The image-classifying perceptron, designed by Frank Rosenblatt in 1957, is widely regarded as the first trainable artificial neural network, and the principles it introduced underpin the generations of neural architectures that followed. It builds on the McCulloch-Pitts model of the artificial neuron published in 1943; the learning rule is sketched below.
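As a rough illustration (not Rosenblatt’s original Mark I hardware), the sketch below trains a single McCulloch-Pitts-style neuron with the perceptron learning rule on a toy, linearly separable problem; the data and all parameter values are invented for the example.

```python
import numpy as np

# Toy, linearly separable data: label 1 only when both inputs are 1 (logical AND).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

weights, bias, learning_rate = np.zeros(2), 0.0, 0.1

# Perceptron learning rule: nudge the weights whenever a prediction is wrong.
for _ in range(20):
    for inputs, target in zip(X, y):
        prediction = 1 if inputs @ weights + bias > 0 else 0   # step activation
        error = target - prediction
        weights += learning_rate * error * inputs
        bias += learning_rate * error

print([1 if x @ weights + bias > 0 else 0 for x in X])   # should reproduce y
```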

- Illiac Suite
One of the first known examples of computer-generated content was produced with the ILLIAC I mainframe computer: the Illiac Suite, a composition for string quartet, created in an experiment run by Lejaren Hiller and Leonard Isaacson. In 1960 the Soviet mathematician Rudolf Zaripov published a paper analyzing how a melody could be constructed algorithmically.
- ELIZA
Known as the “first chatbot in history”, ELIZA was designed in 1964-1967 by the MIT researcher Joseph Weizenbaum. It ran the DOCTOR script, which imitated Carl Rogers’ person-centered psychotherapy approach by reflecting patients’ statements back at them. ELIZA also offered a teaching mode that allowed users to influence its behavior.

- Hopfield Network
Introduced by John Hopfield in 1982, the Hopfield network isn’t a GenAI model per se. However, its recurrent architecture, whose dynamics minimize an energy (Lyapunov) function, lets it emulate associative memory: a stored pattern can be recalled from a partial or noisy cue, as sketched below.
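A minimal sketch of that recall mechanism, assuming the classic Hebbian outer-product storage rule for binary (+1/-1) patterns; the pattern, its size, and the amount of corruption are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Store one binary (+1/-1) pattern with the Hebbian outer-product rule.
pattern = rng.choice([-1, 1], size=16)
W = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(W, 0)                    # no self-connections

def energy(s):
    # Lyapunov/energy function; asynchronous updates never increase it.
    return -0.5 * s @ W @ s

# Corrupt a few bits to simulate a partial or noisy memory cue.
state = pattern.copy()
state[:4] *= -1

# Asynchronous updates: each neuron flips toward the sign of its local field.
for _ in range(5):
    for i in range(len(state)):
        state[i] = 1 if W[i] @ state >= 0 else -1

print("recovered:", bool((state == pattern).all()), " energy:", energy(state))
```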

- Long Short-Term Memory
Long Short-Term Memory (LSTM) is a variant of the Recurrent Neural Network (RNN) that tackles the vanishing gradient issue, a problem in which the gradients that should update the network’s weights during training shrink toward zero, so long-range dependencies are never learned. LSTM is widely used for sentiment analysis, video analysis, language modelling, etc. It was first proposed by Sepp Hochreiter and Jürgen Schmidhuber in a 1995 technical report and published in 1997; a single LSTM step is sketched below.
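This compact sketch of one LSTM step assumes the standard forget/input/output gate equations with randomly initialized, untrained parameters; it shows how the cell state is updated additively, which is what keeps gradients from vanishing over long sequences.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
hidden = 4
# Randomly initialized parameters for one LSTM cell (illustrative, untrained).
W = {gate: rng.normal(scale=0.1, size=(hidden, hidden + 1)) for gate in "fioc"}
b = {gate: np.zeros(hidden) for gate in "fioc"}

def lstm_step(x, h, c):
    z = np.concatenate([h, x])               # previous hidden state + new input
    f = sigmoid(W["f"] @ z + b["f"])          # forget gate
    i = sigmoid(W["i"] @ z + b["i"])          # input gate
    o = sigmoid(W["o"] @ z + b["o"])          # output gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])    # candidate cell update
    c = f * c + i * c_tilde                   # additive cell-state path
    h = o * np.tanh(c)
    return h, c

h, c = np.zeros(hidden), np.zeros(hidden)
for t in range(10):                           # run over a toy 1-D input sequence
    h, c = lstm_step(np.array([np.sin(t)]), h, c)
print(h)
```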
Modern Stage
- Variational Auto-Encoders (VAEs)
Introduced in 2013 by D. P. Kingma and M. Welling, the VAE is a generative model capable of synthesizing new data that resembles the previously sampled input. It first encodes the input into a distribution within a probabilistic latent space, then decodes samples from that distribution to create new data.
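The core mechanism (encode to a distribution, sample via the reparameterization trick, decode) can be sketched without a training loop; the linear “encoder” and “decoder” below use made-up random weights and stand in for trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, latent_dim = 8, 2

# Stand-ins for trained networks: linear encoder and decoder (illustrative only).
enc_mu = rng.normal(size=(latent_dim, input_dim))
enc_logvar = rng.normal(size=(latent_dim, input_dim))
dec = rng.normal(size=(input_dim, latent_dim))

x = rng.normal(size=input_dim)                # one input sample

# Encode: the input maps to a distribution in latent space, not a single point.
mu, logvar = enc_mu @ x, enc_logvar @ x

# Reparameterization trick: z = mu + sigma * eps keeps sampling differentiable.
eps = rng.normal(size=latent_dim)
z = mu + np.exp(0.5 * logvar) * eps

# Decode the latent sample back into data space to obtain a new data point.
x_new = dec @ z
print(x_new)
```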
- Generative Adversarial Networks (GANs)
A GAN is a model consisting of two main elements, a generator and a discriminator, which “compete” with each other to produce the most realistic output. The generator keeps producing new samples, while the discriminator compares them to real-world training data and decides whether they look “real” or “fake”. GANs, developed by Ian Goodfellow et al. in 2014, underpinned the wave of deepfakes that began around 2017.
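The adversarial objective can be shown in a few lines, assuming a toy 1-D data distribution and deliberately simplistic stand-ins for the two networks (a shift for the generator, a thresholded sigmoid for the discriminator); a real GAN would alternate gradient updates of both, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy stand-ins: "real" data is N(3, 1); generator and discriminator are tiny
# parametric functions with made-up parameters, not trained neural networks.
def generator(noise, shift):
    return noise + shift                       # tries to mimic the real data

def discriminator(sample, threshold):
    return sigmoid(sample - threshold)         # probability the sample is "real"

real = rng.normal(loc=3.0, scale=1.0, size=256)
fake = generator(rng.normal(size=256), shift=1.0)

# D wants high scores on real data and low scores on fakes; G wants its fakes
# to score high. Training alternates updates so the two keep "competing".
d_loss = -np.mean(np.log(discriminator(real, 2.0)) +
                  np.log(1.0 - discriminator(fake, 2.0)))
g_loss = -np.mean(np.log(discriminator(fake, 2.0)))
print(f"discriminator loss: {d_loss:.3f}  generator loss: {g_loss:.3f}")
```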
- WaveNet
Emerging in 2016, WaveNet was one of the first models capable of imitating the human voice in a convincing manner. Building on the PixelCNN architecture, it uses dilated causal convolutions to capture long-range temporal dependencies in raw audio.
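Below is a bare-bones illustration of a dilated causal convolution, the building block referred to above: each output sample depends only on past samples, and stacking layers with doubling dilation widens the receptive field exponentially. The filter values and input signal are invented for the example.

```python
import numpy as np

def dilated_causal_conv(signal, kernel, dilation):
    """1-D causal convolution: output[t] mixes signal[t], t - d, t - 2d, ..."""
    out = np.zeros_like(signal)
    for t in range(len(signal)):
        for k, w in enumerate(kernel):
            tap = t - k * dilation
            if tap >= 0:                   # causal: never look into the future
                out[t] += w * signal[tap]
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=32)                    # stand-in for raw audio samples
kernel = np.array([0.5, 0.5])

# Dilations 1, 2, 4, 8 with a 2-tap kernel give a 16-sample receptive field.
h = x
for dilation in (1, 2, 4, 8):
    h = dilated_causal_conv(h, kernel, dilation)
print(h[:5])
```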
- Transformer
The Transformer, first described in 2017 by A. Vaswani et al. in the paper “Attention Is All You Need”, is a self-attention model: every element of the input is weighted against every other element, near or distant, to capture how they interact. It was originally applied to machine translation and paved the way for the GPT models.
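Here is a minimal scaled dot-product self-attention step in plain NumPy, assuming random, untrained projection matrices; it shows how every token is weighted against every other token in a single matrix operation.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8                    # 5 tokens, 8-dimensional embeddings

tokens = rng.normal(size=(seq_len, d_model))           # stand-in token embeddings
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

Q, K, V = tokens @ W_q, tokens @ W_k, tokens @ W_v

# Scaled dot-product attention: every token attends to every other token.
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)         # softmax over positions
output = weights @ V

print(weights.round(2))    # each row: how much one token attends to the others
print(output.shape)        # (5, 8): contextualised token representations
```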
- GPT
The generative pre-trained transformer (GPT), introduced by OpenAI in 2018, combines the Transformer architecture with generative pre-training: the model is first trained on large amounts of unlabelled text in an unsupervised fashion and then fine-tuned for specific tasks.
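The “unsupervised” part simply means the labels come from the text itself: the target for each position is the next token. A tiny sketch of how unlabelled text becomes training pairs (the whitespace “tokenizer” is a deliberate simplification):

```python
# Unlabelled text supervises itself: the target for each token is the next one.
text = "generative pre-training needs no manual labels at all"
tokens = text.split()                 # naive whitespace "tokenizer"

training_pairs = [
    (tokens[: i + 1], tokens[i + 1])  # (context so far, next token to predict)
    for i in range(len(tokens) - 1)
]

for context, target in training_pairs[:3]:
    print(context, "->", target)
```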
- NeRF
A Neural Radiance Field (NeRF), introduced in 2020, is a model capable of synthesizing novel views of complex scenes. It represents a scene as a function that is queried with 5D coordinates (a 3D position plus a 2D viewing direction) and outputs a volume density and view-dependent emitted radiance.
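To make the 5D query concrete, the sketch below replaces NeRF’s trained MLP with a made-up stand-in function and composites a handful of samples along one camera ray; every numeric value here is invented for illustration.

```python
import numpy as np

def radiance_field(position, direction):
    """Stand-in for NeRF's trained MLP: 5D query -> (density, RGB radiance)."""
    x, y, z = position
    theta, phi = direction
    density = np.exp(-(x**2 + y**2 + z**2))              # made-up density falloff
    rgb = 0.5 + 0.5 * np.array([np.sin(theta),           # made-up view-dependent
                                np.cos(phi),             # emitted radiance
                                np.sin(theta + phi)])
    return density, rgb

# Sample points along one camera ray, then alpha-composite them into a pixel.
ts = np.linspace(0.0, 2.0, num=8)
origin, ray_dir = np.array([0.0, 0.0, -1.0]), np.array([0.0, 0.0, 1.0])
view = (0.3, 1.2)                                        # (theta, phi) of the ray

color, transmittance = np.zeros(3), 1.0
for i in range(len(ts) - 1):
    point = origin + ts[i] * ray_dir
    density, rgb = radiance_field(point, view)
    alpha = 1.0 - np.exp(-density * (ts[i + 1] - ts[i])) # volume rendering step
    color += transmittance * alpha * rgb
    transmittance *= 1.0 - alpha
print(color)
```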
- DALL-E
Based on a version of the GPT-3 architecture with 12 billion parameters, DALL-E interprets text and creates images from written user prompts. It was introduced by OpenAI in 2021.
- GitHub Copilot
In 2021 GitHub, together with OpenAI, launched GitHub Copilot, an assistant tool powered by the Codex model. It helps programmers develop their code by suggesting fitting lines, functions, and other elements as they type.
- MusicLM
Introduced by Google in 2023, MusicLM is a model designed to synthesize music from text prompts. Its architecture includes w2v-BERT to generate semantic tokens for audio, MuLan, a joint music-text embedding network that links text descriptions to audio, and other components.
- Sora
Introduced by OpenAI in 2024, Sora is a diffusion-transformer model designed for video synthesis. It draws inspiration from how Large Language Models (LLMs) gain broad capabilities by training on massive amounts of data.