A new study says students are likely writing millions of papers with AI

Students have submitted more than 22 million papers that may have used generative AI in the past year, new data released by plagiarism detection company Turnitin shows.

A year ago, Turnitin rolled out an AI writing detection tool that was trained on its trove of papers written by students as well as other AI-generated texts. Since then, the detector has reviewed more than 200 million papers, predominantly written by high school and college students. Turnitin found that 11 per cent may contain AI-written language in at least 20 per cent of their content, with 3 per cent of the total papers reviewed getting flagged for having 80 per cent or more AI writing. Turnitin says its detector has a false positive rate of less than 1 per cent when analyzing full documents.
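
As a rough check on these figures, the percentages can be converted into approximate paper counts. The short Python sketch below is illustrative only: the function and constant names are hypothetical, the 200 million total is treated as exact although Turnitin reports "more than" that number, and the false-positive line is an upper bound that assumes every paper was written by a human and takes the company's sub-1 per cent claim at its ceiling.

```python
# Figures reported in the article; names are hypothetical, chosen for illustration.
TOTAL_PAPERS = 200_000_000   # "more than 200 million papers" reviewed (treated as exact)
PCT_FLAGGED_20 = 11          # per cent of papers flagged at the 20 per cent threshold
PCT_FLAGGED_80 = 3           # per cent of papers flagged for 80 per cent or more AI writing
PCT_FALSE_POSITIVE_MAX = 1   # ceiling of Turnitin's claimed false positive rate


def estimate_counts(total: int = TOTAL_PAPERS) -> dict[str, int]:
    """Convert the reported percentages into approximate paper counts."""
    return {
        "flagged_20": total * PCT_FLAGGED_20 // 100,
        "flagged_80": total * PCT_FLAGGED_80 // 100,
        # Assumption: every paper is human-written and the false positive
        # rate sits at its 1 per cent ceiling, so this is an upper bound.
        "false_positive_ceiling": total * PCT_FALSE_POSITIVE_MAX // 100,
    }


if __name__ == "__main__":
    for label, count in estimate_counts().items():
        print(f"{label}: {count:,}")
```

Under these assumptions, 11 per cent works out to roughly 22 million papers, which matches the headline figure, and 3 per cent to roughly 6 million.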

ChatGPT’s launch was met with knee-jerk fears that the English class essay would die. The chatbot can synthesize information and distil it near-instantly, but that doesn’t mean it always gets it right. Generative AI has been known to hallucinate, creating its own facts and citing academic references that don’t exist. Generative AI chatbots have also been caught spitting out biased text on gender and race. Despite those flaws, students have used chatbots for research, for organizing ideas, and as ghostwriters. Traces of chatbots have even been found in peer-reviewed, published academic writing.

Teachers understandably want to hold students accountable for using generative AI without permission or disclosure. However, that requires a reliable way to prove AI was used in a given assignment. Instructors have at times tried to find their own ways of detecting AI in writing, using messy, untested methods to enforce rules and distressing students in the process. Further complicating the issue, some teachers are even using generative AI in their grading processes.

Detecting the use of gen AI is tricky. It’s not as easy as flagging plagiarism, because generated text is still original text. Plus, there’s nuance to how students use gen AI; some may ask chatbots to write their papers for them in large chunks or in full, while others may use the tools as an aid or a brainstorming partner.

Students also aren’t tempted only by ChatGPT and similar large language models. So-called word spinners are another type of AI software that rewrites text and may make it less obvious to a teacher that work was plagiarized or generated by AI. Turnitin’s AI detector has also been updated to detect word spinners, says Annie Chechitelli, the company’s chief product officer. It can also flag work that was rewritten by services like spell checker Grammarly, which now has its own generative AI tool. As familiar software increasingly adds generative AI components, what students can and can’t use becomes more muddled.

Detection tools themselves have a risk of bias. English language learners may be more likely to set them off; a 2023 study found a 61.3 per cent false positive rate when evaluating Test of English as a Foreign Language (TOEFL) exams with seven different AI detectors. The study did not examine Turnitin’s version. The company says it has trained its detector on writing from English language learners as well as native English speakers. A study published in October found that Turnitin was among the most accurate of 16 AI language detectors in a test that had the tool examine undergraduate papers and AI-generated papers.

Schools that use Turnitin had access to the AI detection software for a free pilot period, which ended at the start of this year. Chechitelli says a majority of the service’s clients have opted to purchase the AI detection. But the risks of false positives and bias against English learners have led some universities to ditch the tools for now. Montclair State University in New Jersey announced in November that it would pause the use of Turnitin’s AI detector. Vanderbilt University and Northwestern University did the same last summer.

“This is hard. I understand why people want a tool,” says Emily Isaacs, executive director of the Office of Faculty Excellence at Montclair State. But Isaacs says the university is concerned about potentially biased results from AI detectors, as well as the fact that the tools can’t provide confirmation the way they can with plagiarism. Plus, Montclair State doesn’t want to put a blanket ban on AI, which will have some place in academia. With time and more trust in the tools, the policies could change. “It’s not a forever decision, it’s a new decision,” Isaacs says.

Chechitelli says the Turnitin tool shouldn’t be the only consideration in passing or failing a student. Instead, it’s a chance for teachers to start conversations with students that touch on all of the nuances of using generative AI. “People don’t really know where that line should be,” she says.