
AI Paraphrasers: Methods, Tools, Datasets, Metrics

An AI paraphraser is a tool capable of rewriting an existing text to make it appear unique and authentic.

What Are AI Paraphrasers?

Paraphrasing algorithm and its tasks

An AI paraphraser is a GenAI tool capable of rewriting text with different words, while retaining the original semantic meaning. Paraphrasing is divided into several types:

  • Same Polarity Substitutions. Words are replaced with synonyms.
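  • Opposite Polarity Substitutions. A word is replaced with its antonym, typically combined with a negation so that the meaning is preserved.
  • Converse Substitution. A word is replaced with its relational counterpart (for example, "buy" and "sell"), and the sentence is restructured accordingly.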
  • Inflectional Changes. Number or verb tense inflection is altered.
  • Sentence Modality Changes. A change is made to the expression of perspectives related to the sentence subject.
  • Functional Word Substitution. A functional word — like a demonstrative pronoun — is substituted with another.
  • Structure/Discourse Changes. Referencing context of a phrase is altered.
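
The simplest of these types, same polarity substitution, can be illustrated with a minimal sketch. The synonym table and the function name below are hypothetical and exist only for this example; production paraphrasers rely on far richer lexical resources or on language models, not on a fixed dictionary.

```python
import re

# Hypothetical toy synonym table (an assumption for illustration only).
SYNONYMS = {"big": "large", "quick": "fast", "buy": "purchase"}


def substitute_synonyms(sentence: str, table: dict[str, str]) -> str:
    """Replace every word found in `table` with its synonym."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = table.get(word.lower())
        if replacement is None:
            return word  # no synonym known, keep the original word
        # Keep the capitalization of the first letter.
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", swap, sentence)


print(substitute_synonyms("Big dogs buy quick cars.", SYNONYMS))
# prints: Large dogs purchase fast cars.
```

A lookup like this ignores context and inflection, which is why it covers only the first type in the list.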

Same polarity substitutions examples

Paraphrasing tools are actively used alongside AI text generation, which in turn widens the opportunities for academic dishonesty and plagiarism. According to the University of Alabama, 47% of students use paraphrasing in their essays.

Syntactic feature set

Paraphrasing Methods

Essentially, there are three main paraphrasing techniques:

  • Generation

At its core, paraphrasing is similar to monolingual machine translation. It relies on Multiple Sequence Alignment (MSA), which detects possible paraphrasing patterns with word lattice pairs, that is, groups of words that are the best candidates to retain the initial semantic meaning.
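
To give a feel for how alignment exposes paraphrasing patterns, here is a minimal sketch that aligns just two sentences with Python's standard `difflib.SequenceMatcher`. This is a pairwise simplification, not a full MSA over many sentences, and the function name `align_pair` and the "shared"/"slot" labels are assumptions made for this example.

```python
from difflib import SequenceMatcher


def align_pair(a: str, b: str) -> list:
    """Align two whitespace-tokenized sentences.

    Returns a lattice-like list in which "shared" spans are common to both
    sentences and "slot" spans hold the interchangeable alternatives.
    """
    ta, tb = a.split(), b.split()
    matcher = SequenceMatcher(a=ta, b=tb, autojunk=False)
    lattice = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            lattice.append(("shared", ta[i1:i2]))
        else:
            # replace / insert / delete: the two sides are candidate paraphrases
            lattice.append(("slot", (ta[i1:i2], tb[j1:j2])))
    return lattice


for node in align_pair(
    "the company bought the startup last year",
    "the company acquired the startup last year",
):
    print(node)
```

In this example the only slot pairs "bought" with "acquired", which is the kind of candidate a lattice-based generator could reuse in other sentences.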

  • Identification 

This technique is used to find syntactic, semantic, and symbolic similarities within a text. Additionally, statistical features, such as word vectors and distance measures, grammar string similarity, and other techniques, are used to achieve a better result.
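
As a rough illustration of the statistical side of identification, the sketch below computes two simple signals for a sentence pair: cosine similarity over word-count vectors and a word-level edit distance. Both function names are assumptions for this example, and count vectors capture only lexical overlap; practical systems would typically use learned word or sentence embeddings instead.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def word_edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed over words instead of characters."""
    ta, tb = a.split(), b.split()
    prev = list(range(len(tb) + 1))
    for i, wa in enumerate(ta, start=1):
        curr = [i]
        for j, wb in enumerate(tb, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


s1 = "the cat sat on the mat"
s2 = "the cat rested on the mat"
print(cosine_similarity(s1, s2), word_edit_distance(s1, s2))
```

A high cosine score combined with a small edit distance suggests a near-copy with light rewording, whereas a paraphrase that changes most of the wording would need semantic features to be caught.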

  • Acquisition

This technique involves acquiring and learning lexico-syntactic paraphrases. It is based on the extraction of syntactic translation rules, as done in statistical machine translation.

Synchronous parsing and grammar paraphrasing

The Most Popular and Effective AI Paraphrasing Tools

There is a wide variety of commercial paraphrasers available online, many of them free. They are often used in AI-text detection experiments to make the detection task more challenging.

  1. QuillBot

QuillBot is an online AI paraphrasing tool that can correct grammar and suggest better word choices. According to the project’s author, it is used mostly by non-native English speakers to correct their writing; therefore, the amount of actual rephrasing done by QuillBot is not as high as with some other tools.

  2. Paraphraser.io

Paraphraser is a platform that can summarize texts, check grammar, and rewrite articles. It’s also capable of creative writing as a premium feature. According to its website, it employs “advanced AI algorithms.”

  3. SciSpace

SciSpace positions itself as a rewriting tool for academic works. Its additional features include selectable writing styles, multilingual paraphrasing, and an AI originality detector. The service allows authors to cultivate their individual writing style.

Synchronous paraphrastic derivation in sentence compression

  4. ZeroGPT

ZeroGPT became a subject of controversy when it turned out that it identified synthesized texts as human-written. The platform also provides a rephrasing tool that can work with sentences or even whole passages.

ZeroGPT’s interface and output

Datasets for Text Paraphrasing Task

The collection of paraphrasing datasets is rather modest, as today’s research focuses on detecting texts generated from scratch. Some notable examples include:

  • MSCOCO. This is a Microsoft dataset which originally contains 120,000 pictures with captions for object detection. Each picture, however, comes with five captions written by five different annotators, so the captions can be treated as paraphrases of one another.
  • PPDB. This is a database created specifically for paraphrasing tasks. Apparently, several editions exist, including PPDB 2.0 and PPDB-TLDR.
  • WikiAnswers. An extensive data corpus, WikiAnswers contains different questions that were marked by the WikiAnswers users as the same in essence.
  • Quora Question Pairs (QQP). QQP presents about 400,000 question pairs from Quora, labeled according to whether they are duplicates, that is, different questions with the same semantic meaning.

A sample from the MSCOCO dataset

Other examples include SNLI and ChatGPT Paraphrases.
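
Caption datasets such as MSCOCO become paraphrase data only after the captions of one image are paired up. The sketch below shows that pairing step under stated assumptions: the five captions are invented for illustration (they are not taken from the dataset), and `caption_pairs` is a hypothetical helper name.

```python
from itertools import combinations


def caption_pairs(captions: list[str]) -> list[tuple[str, str]]:
    """Return every unordered pair of captions that describe the same image."""
    return list(combinations(captions, 2))


# Invented captions for a single image (illustrative only).
captions = [
    "A man rides a bike down the street.",
    "A person is cycling along a road.",
    "Someone on a bicycle travels through the city.",
    "A cyclist rides past parked cars.",
    "A man is biking on a city street.",
]

pairs = caption_pairs(captions)
print(len(pairs))  # 5 captions give 10 unordered pairs
```

Captions written for the same image can still emphasize different details, so pairs built this way are usually treated as noisy paraphrases.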

Evaluation Metrics of Paraphrase Generation

The evaluation metrics for paraphrased writing include BLEU, originally designed for machine translation; ROUGE, which initially focused on text summarization; TER (Translation Edit Rate), which serves to assess the quality of machine translation; and METEOR, which does a satisfactory job of checking semantic equivalents in the context of low-resource languages.
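
The overlap idea behind BLEU and ROUGE can be sketched in a few lines. The code below is a deliberately simplified illustration with assumed function names: it computes only clipped unigram precision (the building block of BLEU) and unigram recall (ROUGE-1 recall). Full BLEU additionally combines higher-order n-grams and applies a brevity penalty, so these numbers are not comparable to scores from standard toolkits.

```python
from collections import Counter


def unigram_overlap(candidate: str, reference: str) -> int:
    """Count shared words, clipping each word to its count in the other text."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    return sum((cand & ref).values())  # & keeps the minimum count per word


def unigram_precision(candidate: str, reference: str) -> float:
    """BLEU-style: share of candidate words that also occur in the reference."""
    total = len(candidate.split())
    return unigram_overlap(candidate, reference) / total if total else 0.0


def unigram_recall(candidate: str, reference: str) -> float:
    """ROUGE-1-style: share of reference words recovered by the candidate."""
    total = len(reference.split())
    return unigram_overlap(candidate, reference) / total if total else 0.0


reference = "the cat sat on the mat"
candidate = "the cat is on the mat"
print(unigram_precision(candidate, reference), unigram_recall(candidate, reference))
```

For paraphrase generation, high overlap with a reference is not always desirable, since a good paraphrase may intentionally share few words with the source.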

Of course, with so much text being AI-generated or hybrid (a mix of GenAI and human editing), the question arises as to what constitutes copyright infringement. To learn more about this new dilemma posed by the increasing “authenticity” of GenAI tools, read our next article.


Avatar Antispoofing


Editors at Antispoofing Wiki thoroughly review all featured materials before publishing to ensure accuracy and relevance.
