AI-detection tools fail at universities worldwide

AI detection tools are neither accurate nor reliable. That is the conclusion of a comprehensive study of 14 AI detection tools used by universities to identify AI-generated papers and exams. The study also discusses the implications and drawbacks of using detection tools for AI-generated text in academic settings.

AI detection tools

Recent advances in generative pre-trained transformer large language models have drawn attention to the potential dangers of unfair use of AI-generated content in academic settings, making the need to detect and address this issue more urgent. To this end, the study examined the functionality of detection tools for AI-generated text and evaluated them on accuracy and error analysis. It aimed to determine whether existing tools could reliably differentiate between human-written text and ChatGPT-generated text, and whether machine translation and content obfuscation techniques affected the detection of AI-generated text.

The research covered a total of 12 publicly available tools and two widely used commercial systems, Turnitin and PlagiarismCheck. The findings revealed that the available detection tools were neither accurate nor reliable, and demonstrated a bias towards classifying the output as human-written rather than detecting AI-generated text. Additionally, the study showed that content obfuscation techniques significantly hindered the performance of these tools.
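The bias the study describes can be made concrete with a small evaluation sketch. The snippet below scores a detector's verdicts against known ground truth using a confusion matrix; the documents and verdicts are invented for illustration and do not come from the study, but the pattern (most texts labelled "human", so AI-generated texts slip through) mirrors the reported bias.

```python
# Hypothetical evaluation sketch: scoring a binary AI/human detector
# against known ground truth. All data below is invented for illustration.

def evaluate(ground_truth, verdicts):
    """Count confusion-matrix cells and overall accuracy."""
    cells = {"tp": 0, "fn": 0, "fp": 0, "tn": 0}
    for truth, verdict in zip(ground_truth, verdicts):
        if truth == "ai":
            cells["tp" if verdict == "ai" else "fn"] += 1
        else:
            cells["fp" if verdict == "ai" else "tn"] += 1
    accuracy = (cells["tp"] + cells["tn"]) / len(ground_truth)
    return cells, accuracy

# 5 AI-written and 5 human-written documents; the detector labels
# nearly everything "human", as the study observed.
truth    = ["ai"] * 5 + ["human"] * 5
verdicts = ["ai", "human", "human", "human", "human",
            "human", "human", "human", "human", "human"]

cells, accuracy = evaluate(truth, verdicts)
print(cells)     # {'tp': 1, 'fn': 4, 'fp': 0, 'tn': 5}
print(accuracy)  # 0.6
```

Note how the headline accuracy (0.6) hides the real problem: zero false accusations of human authors, but four of five AI-generated texts go undetected. This is why accuracy alone, without error analysis, overstates how useful such tools are.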

The study provides several noteworthy contributions. Firstly, it summarises current scientific and non-scientific efforts in the field, presenting one of the most comprehensive tests conducted to date. The research methodology employed was rigorous, relying on an original document set and encompassing a broad range of tools. Moreover, the study examines the implications and drawbacks of using detection tools for AI-generated text in academic environments.

The introduction highlights the vital role that higher education institutions play in shaping individuals’ personal and professional ethics, emphasising the importance of maintaining academic integrity. The introduction also addresses the potential threats posed to integrity by unauthorised content generation and acknowledges the recent advancements in AI tools, particularly generative pre-trained transformers like ChatGPT. While the use of AI tools is not automatically unethical, the study highlights the need to educate students about the ethical use of such tools.

Some educational institutions have taken measures to restrict or prohibit the use of ChatGPT, and conferences have explicitly banned AI-generated content in submissions. Consequently, the demand for detection tools for AI-generated text has increased, and numerous free online tools claim to offer such capabilities. However, the study cautions against relying solely on the results provided by these tools, as they are limited in their effectiveness. Some companies that offer detection tools acknowledge these limitations and encourage caution when taking punitive measures based solely on their results.

For further details, consult the research article "Testing of detection tools for AI-generated text".