Can AI-Generated Text Be Detected by Plagiarism Checkers?
Concerns about plagiarism in connection with GenAI have been raised repeatedly, especially since models capable of synthesizing media became publicly available. Currently, AI-written texts aren’t recognized as intellectual property or original writing, as they draw ideas, narratives, and stylistics from other people's works.
While concerns about plagiarism and dishonesty are growing rapidly, especially as a threat to academic integrity, detecting AI-written papers has become an arduous task. In 2023, OpenAI announced that it could no longer detect the writing of its own brainchild, GPT-4, which led to the discontinuation of its AI Text Classifier, previously available online.
Opinion is split on whether synthesized texts will ever be detectable. A pessimistic report from Zhang et al. states that Large Language Models (LLMs) are progressing too fast and will one day be indistinguishable from human authors.
Plagiarism Checkers and AI-Content Detection: The Difference
Plagiarism checkers are tuned to detect improper use of other authors’ writing, especially in scientific publishing. This is achieved with an extensive database of publications, including a cross-reference database, against which the text in question is matched. The check identifies direct plagiarism as well as more subtle cases of paraphrasing.
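The matching step can be illustrated with a minimal sketch. It assumes word n-gram ("shingle") overlap as the similarity measure, which is only one common approach; the source does not say how Turnitin, iThenticate, or any other commercial checker works internally. The function names and the tiny in-memory database are hypothetical.

```python
import re


def shingles(text, n=3):
    """Return the set of lowercased word n-grams (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(submission, source, n=3):
    """Share of the submission's shingles that also occur in the source."""
    sub = shingles(submission, n)
    if not sub:
        return 0.0  # too short to form a single n-gram
    return len(sub & shingles(source, n)) / len(sub)


def best_match(submission, database, n=3):
    """Return (document id, score) of the closest document in the database.

    `database` is a hypothetical dict mapping document ids to full texts.
    """
    if not database:
        raise ValueError("database must contain at least one document")
    scores = {doc_id: containment(submission, text, n)
              for doc_id, text in database.items()}
    doc_id = max(scores, key=scores.get)
    return doc_id, scores[doc_id]


if __name__ == "__main__":
    database = {
        "paper_a": "the quick brown fox jumps over the lazy dog",
        "paper_b": "completely unrelated sentence about tax law",
    }
    print(best_match("the quick brown fox jumps over a sleeping cat", database))
```

Exact n-gram matching of this kind catches direct copying well but degrades quickly under paraphrasing, which is presumably why real checkers add further techniques on top of it.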
AI content detectors serve to expose writing generated with the help of artificial intelligence: ChatGPT, Claude, Jasper, and others. They are trained to spot characteristics inherent to machine writing: specific word choice, unnatural syntactic structures, monotony in narrative, and so on. The two problems can overlap, though: GenAI can also copy someone else’s work without giving a proper reference.
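To make "monotony in narrative" and "specific word choice" more concrete, here is a toy sketch of two surface features such a detector might look at: variation in sentence length and vocabulary diversity. This is an illustration under stated assumptions, not how any named detector works. Real detectors are trained classifiers, and the feature names and the naive sentence splitting below are hypothetical simplifications.

```python
import re
import statistics


def sentence_lengths(text):
    """Split on ., ! and ? (a naive rule) and return word counts per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def monotony_features(text):
    """Compute simple stylometric features for a text.

    length_variation: standard deviation of sentence length divided by the
        mean; values near 0 mean every sentence is about the same length.
    type_token_ratio: distinct words divided by total words; low values
        mean a repetitive vocabulary.
    """
    lengths = sentence_lengths(text)
    words = re.findall(r"[a-z']+", text.lower())
    mean_len = statistics.mean(lengths) if lengths else 0.0
    spread = statistics.pstdev(lengths) if lengths else 0.0
    return {
        "sentence_count": len(lengths),
        "mean_sentence_length": mean_len,
        "length_variation": spread / mean_len if mean_len else 0.0,
        "type_token_ratio": len(set(words)) / len(words) if words else 0.0,
    }


if __name__ == "__main__":
    uniform = "The cat sat down. The dog sat down. The bird sat down."
    varied = "Stop. The old dog wandered slowly across the empty yard at dusk. Why?"
    print(monotony_features(uniform))
    print(monotony_features(varied))
```

Features like these are weak signals on their own. A human can write monotonously and a model can be prompted to vary its style, so they would at most serve as inputs to a trained classifier rather than as a verdict.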
Experiments on Plagiarism Analysis of AI-Generated Texts
In one experiment, a group of GPT-written essays on 50 topics underwent a plagiarism check by Turnitin and iThenticate, along with a generated-content analysis performed by GPT. The results showed plagiarism rates ranging from 0% to 64%.
As for generated-content detection, ChatGPT could identify AI writing instantly. However, it is not a panacea for detecting synthesized content: paraphrasing with techniques such as inflectional changes or function-word substitution can render a GenAI check futile.
Unintentional Plagiarism
Unintentional plagiarism can take place if a publication lacks proper review and supervision, including review by peer specialists. At the same time, it’s hard to say whether an author who employs GenAI plagiarizes the AI’s work or not.
While the academic publishing company Springer Nature strictly rejects the idea of listing an AI as an author, ChatGPT was credited as a co-author at least once, in an article that discusses whether ChatGPT can pass the USMLE medical examination.
AI-Assisted Plagiarism
GenAI has already been accused of copyright infringement after it was noted that image generators like Midjourney sometimes essentially recreate existing pictures based on pop-culture references.
A number of studies conducted at different times, the earliest exploring GPT-2 in 2019, note that GenAI poses a tangible threat of spreading plagiarism in higher education, while also being disruptive to academic integrity. One of the studies found that ChatGPT displays, or at least simulates, critical thinking, judged by coherence, persuasiveness, depth of knowledge, and other parameters. This, in turn, can seriously undermine academic honesty in scientific publications.
ChatGPT as a Tool for Plagiarism
Another study evaluated ChatGPT’s ability to compete with human students. Ten questions from 32 courses taught at New York University Abu Dhabi (NYUAD) were answered by students and by the AI alike. As it turned out, ChatGPT managed to outperform the humans in 12 courses that required strong factual knowledge. This, in turn, signals a growing AI-plagiarism issue that can barely be tackled as of now, since AI writing is fairly easy to disguise either manually or with other deep learning tools.