Algorithms for Detecting AI-Generated Text

A wide range of AI-text detectors exists, and some of them show real promise.

What Are the Main Algorithms for Detecting AI-Generated Texts?

Watermarking technique based on Adversarial Watermarking Transformer (AWT)

There are three main approaches to detect synthesized writing:

  • Watermarking. Hidden data is added to the text to signal that it was AI-generated. The modification is invisible to the human eye.
  • Statistical methods. These detectors look for statistical outliers, that is, artifacts that a generative AI model can leave in a text. They focus on the entropy, perplexity, and n-gram frequencies observable in a text.
  • Classifying. Classifiers are trained specifically to distinguish human-written from synthetic content, using datasets that contain numerous samples of both.
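
The three quantities named under statistical methods can be illustrated with a minimal Python sketch. It is not taken from any particular detector: the function names are hypothetical, and the add-one-smoothed unigram model is only a stand-in for the large language model a real detector would use to score a text.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def shannon_entropy(tokens):
    """Entropy (in bits) of the empirical word distribution of a text."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def unigram_perplexity(tokens, reference_tokens):
    """Perplexity of `tokens` under an add-one-smoothed unigram model
    estimated from `reference_tokens` (assumption: a toy substitute
    for a real language model)."""
    ref = Counter(reference_tokens)
    vocab = set(reference_tokens) | set(tokens)
    denom = len(reference_tokens) + len(vocab)
    log_prob = sum(math.log((ref[t] + 1) / denom) for t in tokens)
    return math.exp(-log_prob / len(tokens))
```

A detector built on such features would compare the values measured on a suspect text with those typical of human writing; the thresholds have to be calibrated on real data and are not suggested here.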

Virtually all existing detectors rely on these approaches. 

Is It Necessary to Distinguish AI-Generated and Human-Written Texts?

The ability to accurately detect synthetic writing is widely regarded as crucial. Large Language Models (LLMs) can be used to produce fake news, manipulative political commentary, phishing materials such as emails, or code with embedded viruses and backdoors. In addition, generated texts contribute to academic dishonesty and can lower the overall quality of scientific writing through inaccurate facts, fake citations, and outright plagiarism.

The Main Algorithms for Detecting AI-Generated Text


Algorithms for AI-text detection that are widely discussed in research papers include:

  1. Watermarking

Watermarking is a method of inserting “invisible” signals into the text that only a detector can recognize. Watermarks can rely on metadata, semantics, or steganographic approaches. The proposed methods include using a hash function to generate bit sequences, constructing a specific succession of sentences or paragraphs, converting an image-based watermark into a text string, using adversarial training to encode a secret message, and others.
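
As an illustration of the hash-function idea, here is a toy Python sketch. It is an assumption-laden simplification, not any specific published scheme: a keyed hash assigns one pseudo-random bit to each (previous word, word) pair, the toy generator always picks a word whose bit is 1, and the detector checks whether such "green" words occur more often than the 50% expected by chance. The key `demo-key`, the function names, and the artificial vocabulary are all hypothetical.

```python
import hashlib
import math


def is_green(prev_token, token, key="demo-key"):
    """Hash (key, previous token, token) down to one pseudo-random bit.
    A bit of 1 marks the token as 'green' in this context."""
    data = f"{key}|{prev_token}|{token}".encode("utf-8")
    return hashlib.sha256(data).digest()[0] & 1 == 1


def generate_watermarked(length, vocab, key="demo-key"):
    """Toy generator: always emit the first green word in `vocab`.
    (A real model would only bias its sampling toward green words.)"""
    tokens = ["<s>"]
    for _ in range(length):
        prev = tokens[-1]
        choice = next((w for w in vocab if is_green(prev, w, key)), vocab[0])
        tokens.append(choice)
    return tokens


def green_z_score(tokens, key="demo-key"):
    """z-score of the green-token count against the chance rate of 0.5."""
    if len(tokens) < 2:
        raise ValueError("need at least two tokens")
    n = len(tokens) - 1
    greens = sum(is_green(tokens[i - 1], tokens[i], key)
                 for i in range(1, len(tokens)))
    return (greens - 0.5 * n) / math.sqrt(0.25 * n)
```

A large positive z-score suggests that the text carries the watermark. Without the key, the green and non-green words look alike, which is what makes the signal "invisible" to a reader.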

  2. Statistical Outlier Detection Methods

AI-generated text (left) occupying the negative curvature of the log probability function

A tool dubbed Giant Language Model Test Room (GLTR) detects synthetic writing through statistical outlier analysis. It highlights the words and word sequences that an AI would typically use, giving a human observer insight into the true nature of the text in question. Another technique examines the log-probability function of an LLM: generated text tends to occupy regions of negative curvature of that function.
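
The rank-based highlighting idea can be sketched in Python as follows. This is a simplified illustration rather than GLTR's own code: a tiny bigram model (an assumption) stands in for the large language model, and the function names are hypothetical. For each word, the sketch asks how highly the model ranked it among its predictions for that position.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Map each token to a Counter of the tokens that follow it."""
    model = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        model[prev][nxt] += 1
    return model


def token_ranks(model, tokens):
    """Rank (1 = most likely) of each observed token among the model's
    predictions for its context; None if the model never predicted it."""
    ranks = []
    for prev, actual in zip(tokens, tokens[1:]):
        ordered = [tok for tok, _ in model.get(prev, Counter()).most_common()]
        ranks.append(ordered.index(actual) + 1 if actual in ordered else None)
    return ranks


def top_k_fraction(ranks, k=1):
    """Share of tokens that the model ranked within its top k predictions."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)
```

A text in which nearly every word falls among the model's top predictions is statistically unusual for a human author, and this is the kind of pattern that the highlighting makes visible.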

Word sequences written by an AI are highlighted by GLTR in blue and yellow

  3. Classifier Methods

Hybrid combination featuring the TF-IDF approach

Classifier models are discriminators trained to differentiate synthetic from human writing. Proposed solutions include the controllable text generator dubbed Grover, energy-based models trained for classification, combinations of term frequency-inverse document frequency (TF-IDF) features with deep learning architectures, and others.
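
A minimal Python sketch of the TF-IDF part of such a pipeline is given below. To stay self-contained, it swaps the deep learning architecture for a much simpler nearest-centroid classifier, so it only shows how labeled samples of both kinds turn into a decision rule. All function names are hypothetical, and any training texts passed to it would be supplied by the user.

```python
import math
from collections import Counter


def fit_idf(docs):
    """Smoothed inverse document frequency for every training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc.split()))
    return {term: math.log((1 + n) / (1 + d)) + 1 for term, d in df.items()}


def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector; unseen terms are ignored."""
    tf = Counter(t for t in doc.split() if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def train(samples):
    """samples: list of (text, label). Returns (idf, {label: centroid})."""
    idf = fit_idf([text for text, _ in samples])
    grouped = {}
    for text, label in samples:
        grouped.setdefault(label, []).append(tfidf(text, idf))
    centroids = {}
    for label, vectors in grouped.items():
        total = Counter()
        for v in vectors:
            total.update(v)  # adds the weights term by term
        centroids[label] = {t: s / len(vectors) for t, s in total.items()}
    return idf, centroids


def predict(text, idf, centroids):
    """Return the label whose centroid is most similar to the text."""
    vec = tfidf(text, idf)
    return max(centroids, key=lambda label: dot(vec, centroids[label]))
```

With real data, `train` would receive many human-written and synthetic samples, and the centroid step would be replaced by the neural network mentioned above.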

  4. Retrieval-Based Detection Methods

An example of a paraphrasing attack

Retrieval-based detectors are so called because they search a database of previously generated texts for passages similar to the target text, so that recurring sentences and passages written by an AI can be identified. This approach is reported to withstand paraphrasing attacks.
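
The store-and-lookup idea can be sketched in Python as follows. This is a toy illustration under stated assumptions: the class `GenerationStore`, its methods, and the 0.5 threshold are hypothetical, and plain word overlap (Jaccard similarity) stands in for the semantic similarity measure that a real system would need in order to cope with heavy paraphrasing.

```python
import re


def word_set(text):
    """Lower-cased set of the words in a text."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Overlap of two word sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class GenerationStore:
    """Toy database of texts that a model has produced earlier."""

    def __init__(self):
        self._entries = []

    def add(self, text):
        """Record one generated text."""
        self._entries.append((text, word_set(text)))

    def best_match(self, candidate):
        """Return (score, stored_text) for the most similar stored text,
        or (0.0, None) if nothing overlaps."""
        cand = word_set(candidate)
        best = (0.0, None)
        for text, words in self._entries:
            score = jaccard(cand, words)
            if score > best[0]:
                best = (score, text)
        return best

    def is_known_generation(self, candidate, threshold=0.5):
        """Flag the candidate if it closely matches a stored generation."""
        return self.best_match(candidate)[0] >= threshold
```

In this design, the party that runs the generative model records every output with `add`, and a suspect text is later checked with `is_known_generation`.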

Effectiveness of Algorithms for Detecting AI-Generated Texts

A rather pessimistic view holds that AI detectors will sooner or later be unable to identify synthetic texts, as generative models become sophisticated enough to produce extremely human-like writing.

However, the authors of the retrieval-based approach argue that the quality of a generated text becomes secondary, because their technique focuses on other, more subtle clues capable of exposing generated content no matter how “humanized” it may seem.

Avatar Antispoofing

Editors at Antispoofing Wiki thoroughly review all featured materials before publishing to ensure accuracy and relevance.
