Techniques and Tools for AI-Generated Text Detection

Is It Possible to Detect AI-Generated Text — And Why Is It Necessary?

Opinions vary on whether detection solutions can reliably spot AI-written text. Skeptics argue that, as neural models progress at lightning speed, the difference between human writing and AI-made content will eventually disappear.

However, there are still promising solutions. Proposed methods include watermarks that are undetectable to the reader, a paraphraser and a detector trained against each other in an adversarial setup, and several others.

Detecting AI-produced texts is essential: it can curb the spread of disinformation, mitigate the risks of corporate spoofing and phishing, help keep academic integrity intact, and discourage plagiarism, both scientific and artistic.

Response of a Mark Twain-based LLM model

The Main Generated Text Detection Techniques

Generally speaking, there are four main approaches to text detection:

Supervised Detection

A standard methodology is to train a language model-based detector on datasets that contain both human and AI writing. However, this process is rather costly: it demands time and computational resources, as well as the effort required to collect samples. It is also vulnerable to attacks, namely paraphrasing and data poisoning.
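
The supervised setup can be illustrated with a deliberately tiny sketch. It swaps the language model-based detector for a bag-of-words Naive Bayes classifier, purely to show the train-on-labelled-samples workflow. The class name `NaiveBayesDetector`, the labels "human" and "ai", and the sample sentences are all assumptions made for illustration, and the training set must contain at least one sample of each label.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple


class NaiveBayesDetector:
    """Toy stand-in for a supervised detector (NOT a language model-based one)."""

    LABELS = ("human", "ai")  # hypothetical label names

    def __init__(self) -> None:
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts: Counter = Counter()
        self.vocabulary: set = set()

    def fit(self, samples: Iterable[Tuple[str, str]]) -> "NaiveBayesDetector":
        # Each sample is (text, label); both labels must appear at least once.
        for text, label in samples:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)
        return self

    def _log_score(self, words: List[str], label: str) -> float:
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab_size = len(self.vocabulary)
        prior = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        # Add-one smoothing keeps unseen words from zeroing out the score.
        return prior + sum(
            math.log((counts[w] + 1) / (total + vocab_size)) for w in words
        )

    def predict(self, text: str) -> str:
        words = text.lower().split()
        return max(self.LABELS, key=lambda label: self._log_score(words, label))


# Made-up training samples, for illustration only.
training_samples = [
    ("honestly i dunno lol", "human"),
    ("gonna grab coffee brb", "human"),
    ("as an ai language model i cannot", "ai"),
    ("in conclusion it is important to note", "ai"),
]
detector = NaiveBayesDetector().fit(training_samples)
print(detector.predict("it is important to note"))
```

Even in this toy form the costs named above are visible: the detector is only as good as the labelled samples it was given, and a paraphrase that avoids the words it has learned would slip past it.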

Zero-shot Detection

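This strategy allows for a detector that doesn’t require additional training or data samples. Instead, it looks at the curvature of the model's log probability around a text: synthesized texts tend to sit in regions of negative curvature, so small rewrites of a machine-written passage usually lower its log probability more than the same rewrites lower that of a human-written one. Zero-shot implies that the model can detect writing it has never dealt with before.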
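
A minimal sketch of the curvature idea, under stated assumptions: a real detector would score text with a large language model and perturb it with a paraphrasing model, whereas here both are replaced by toy stand-ins (a smoothed unigram model and a random word swapper). The function names `curvature_score`, `make_unigram_log_prob`, and `make_word_swapper` are hypothetical.

```python
import math
import random
from collections import Counter
from typing import Callable, List


def curvature_score(
    text: str,
    log_prob: Callable[[str], float],
    perturb: Callable[[str], str],
    n_perturbations: int = 20,
) -> float:
    """Log probability of the text minus the mean over perturbed copies.

    A clearly positive score means the text sits near a local maximum of
    the scoring model's log probability, the pattern described above.
    """
    original = log_prob(text)
    perturbed = [log_prob(perturb(text)) for _ in range(n_perturbations)]
    return original - sum(perturbed) / len(perturbed)


def make_unigram_log_prob(corpus: str) -> Callable[[str], float]:
    """Toy scoring model: unigram frequencies with add-one smoothing."""
    counts = Counter(corpus.lower().split())
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for unseen words

    def log_prob(text: str) -> float:
        return sum(
            math.log((counts[w] + 1) / (total + vocab_size))
            for w in text.lower().split()
        )

    return log_prob


def make_word_swapper(vocabulary: List[str], seed: int = 0) -> Callable[[str], str]:
    """Toy perturbation: replace one randomly chosen word."""
    rng = random.Random(seed)

    def perturb(text: str) -> str:
        words = text.split()
        if not words:
            return text
        words[rng.randrange(len(words))] = rng.choice(vocabulary)
        return " ".join(words)

    return perturb


log_prob = make_unigram_log_prob("the cat sat on the mat the cat ate the fish")
perturb = make_word_swapper(["zebra"])
print(curvature_score("the cat the cat", log_prob, perturb))
```

In practice the score is compared against a threshold chosen on held-out data; the toy models here only demonstrate the shape of the computation.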

Retrieval-based Detection

At its core, this is a simple approach that matches the text in question against a vast database of content produced or stored by Large Language Models. However, it is becoming increasingly difficult for this approach to keep up, as the volume of AI-generated text is growing at avalanche speed.
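
The matching step might look roughly like the sketch below. It assumes the database is a plain list of stored generations and uses word-trigram Jaccard overlap as the similarity measure; a production system would more likely rely on a semantic index. The helper names and the 0.5 threshold are hypothetical.

```python
from typing import Iterable, Set, Tuple


def word_ngrams(text: str, n: int = 3) -> Set[Tuple[str, ...]]:
    """Set of lower-cased word n-grams; short texts yield a single shorter tuple."""
    words = text.lower().split()
    if len(words) < n:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(
    candidate: str, stored_generations: Iterable[str], n: int = 3
) -> Tuple[float, str]:
    """Return (similarity, stored text) for the closest stored generation."""
    candidate_ngrams = word_ngrams(candidate, n)
    best = (0.0, "")
    for stored in stored_generations:
        score = jaccard(candidate_ngrams, word_ngrams(stored, n))
        if score > best[0]:
            best = (score, stored)
    return best


def looks_retrieved(
    candidate: str, stored_generations: Iterable[str], threshold: float = 0.5
) -> bool:
    """Flag the candidate if it overlaps heavily with any stored generation."""
    return best_match(candidate, stored_generations)[0] >= threshold


# Made-up database entries, for illustration only.
database = [
    "the quick brown fox jumps over the lazy dog",
    "large language models generate fluent text",
]
print(looks_retrieved("the quick brown fox jumps over the lazy dog", database))
```

The linear scan over `database` also shows why the approach struggles to scale: every new check touches every stored generation unless an index is added.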

Watermarking

As mentioned earlier, watermarks can stay invisible to the human reader. This is possible thanks to a special cryptography-inspired algorithm: a secret key is shared between the generative model and the text detector, and the watermark causes no degradation of the text. At the same time, there’s no guarantee that malicious actors won't be able to remove watermarks one way or another.
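
To make the shared-key idea concrete, here is a sketch of a simpler, well-known stand-in: a keyed "green list" watermark. It is not the cryptographic construction alluded to above, and unlike that construction it does bias word choice. The key decides, for each pair of consecutive tokens, whether the second one counts as "green"; a generator that favors green tokens leaves a statistical trace that only a key holder can measure. All function names, the key, and the vocabulary are hypothetical.

```python
import hashlib
import hmac
import math
from typing import List


def is_green(key: bytes, previous_token: str, token: str,
             green_fraction: float = 0.5) -> bool:
    """Keyed pseudo-random decision: is `token` green after `previous_token`?"""
    message = f"{previous_token}\x00{token}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < green_fraction


def watermark_z_score(tokens: List[str], key: bytes,
                      green_fraction: float = 0.5) -> float:
    """How many standard deviations the green count lies above chance."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    green = sum(is_green(key, prev, tok, green_fraction) for prev, tok in pairs)
    expected = green_fraction * len(pairs)
    std = math.sqrt(len(pairs) * green_fraction * (1 - green_fraction))
    return (green - expected) / std


def toy_watermarked_text(key: bytes, vocabulary: List[str], length: int,
                         start: str = "<s>") -> List[str]:
    """Toy generator: always emit the first green word, if there is one."""
    tokens = [start]
    for _ in range(length):
        previous = tokens[-1]
        choice = next(
            (w for w in vocabulary if is_green(key, previous, w)), vocabulary[0]
        )
        tokens.append(choice)
    return tokens


# Hypothetical key and vocabulary, for illustration only.
secret_key = b"shared-secret"
words = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
         "hotel", "india", "juliet", "kilo", "lima", "mike", "november",
         "oscar", "papa", "quebec", "romeo", "sierra", "tango"]
marked = toy_watermarked_text(secret_key, words, length=50)
print(watermark_z_score(marked, secret_key))
```

A high z-score under the right key signals a watermark. Paraphrasing the text replaces token pairs and pushes the score back toward chance, which is the removal risk noted above.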

Tools and Models for AI-Generated Text Detection

The proposed AI text detection tools include:

GLTR

Giant Language Model Test Room (GLTR) relies on baseline statistical methods that can spot artifacts left by an AI generator. It takes into consideration each word's probability and its absolute rank among the model's predictions. This is a visual tool that helps the viewer directly see which parts of the text were likely authored by a machine.
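
The rank statistic can be sketched as follows. This is a toy illustration rather than GLTR's implementation: a bigram counter stands in for the language model, and the default bucket thresholds of 10, 100, and 1000 are an assumption about how ranks are grouped for display. All names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Sequence


def bucket_for_rank(rank: int, thresholds: Sequence[int] = (10, 100, 1000)) -> str:
    """Map a 1-based rank to the first bucket whose threshold it fits under."""
    for threshold in thresholds:
        if rank <= threshold:
            return f"top-{threshold}"
    return "beyond"


def make_bigram_ranker(corpus: List[str]) -> Callable[[List[str], str], int]:
    """Toy 'model': ranks a token among the words seen after the previous word."""
    followers: Dict[str, Counter] = defaultdict(Counter)
    for previous, following in zip(corpus, corpus[1:]):
        followers[previous][following] += 1
    unseen_rank = len(set(corpus)) + 1  # rank assigned to never-seen continuations

    def rank_of(context: List[str], token: str) -> int:
        ranked = [word for word, _ in followers[context[-1]].most_common()]
        return ranked.index(token) + 1 if token in ranked else unseen_rank

    return rank_of


def rank_histogram(
    tokens: List[str],
    rank_of: Callable[[List[str], str], int],
    thresholds: Sequence[int] = (10, 100, 1000),
) -> Dict[str, int]:
    """Count how many tokens (after the first) fall into each rank bucket."""
    histogram: Counter = Counter()
    for i in range(1, len(tokens)):
        rank = rank_of(tokens[:i], tokens[i])
        histogram[bucket_for_rank(rank, thresholds)] += 1
    return dict(histogram)


rank_of = make_bigram_ranker("a b a b a c".split())
print(rank_histogram("a b a c".split(), rank_of, thresholds=(1, 3)))
```

A text whose tokens pile up in the lowest-rank bucket is one the model finds very predictable, which is the signal a viewer would read as likely machine-written.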

Grover

Grover is a tool based on GPT-2 and trained on the RealNews dataset. It was initially conceived to generate fake news and was then enhanced with a [CLS] token and other additions so that it could work as a detector. As a result, it can identify 90% of its own fake news, unless a bigger generator is used.

RoBERTa-Based Model

Research showed that a discriminative model can surpass a generative one at detection, as in the case of RoBERTa. It is a non-generative model that proves more flexible after fine-tuning thanks to its bidirectional nature.

SeqXGPT

SeqXGPT is built upon convolution and self-attention networks that operate on lists of log probabilities, which makes it possible to check a text at the sentence level. It employs perplexity extraction, feature encoding, and linear classification to capture AI input sentence by sentence.

IDEATE

IDEATE is based on Internal and External Factual Structures. The model incorporates an intricate architecture: hierarchical convolution coupled with mention-level subgraph convolution helps detect internal factual structures, while external factual structures are captured by entity-level subgraph convolution.

Multi-Faceted Approach

This method incorporates a T5 paraphrasing model and an LLM working in tandem. While the former rewrites questions, the latter provides generated answers. The final results are then compared to human answers using cosine similarity, allowing the model to learn the difference between human and AI content.
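
The comparison step can be sketched as below. The text does not say how answers are turned into vectors, so this example assumes plain bag-of-words counts; a real system would more likely use learned embeddings. The function names and sample answers are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Assumed vectorization: lower-cased word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Dot product of the two count vectors divided by the product of their norms."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty answer is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Made-up answers, for illustration only.
human_answer = "i usually just restart the router and hope for the best"
generated_answer = "restarting the router is a common first troubleshooting step"
print(cosine_similarity(bag_of_words(human_answer), bag_of_words(generated_answer)))
```

Scores close to 1 mean the two answers share most of their vocabulary, while scores near 0 mean they barely overlap; the detector learns from how these scores are distributed across many question and answer pairs.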

AI-Generated Code Detection

GPTSniffer is an AI-code detector that consists of three key components: Extractor, Tokenizer, and Classifier. The last component is trained on data taken from GitHub and on code generated with ChatGPT, which makes detection more efficient because training is done on relevant code. Additionally, paired snippet training further increases the accuracy of the solution.

Code example in Java generated by ChatGPT for training

Online Tools for AI-Generated Text Detection

There is a plethora of text detectors available online: Writer.com, Gltr.io, CopyLeaks, PercentHuman, CrossPlag, Draft & Goal, and many others.

To learn about the newest developments in AI text detection, read on in our next article.