Athens

How to Detect AI Writing (And Why It Doesn't Work)

- Moritz Wallawitsch

AI detection is a growth industry built on a broken premise. Schools pay for it. Publishers demand it. Entire companies exist to answer one question: did a human write this?

The answer, increasingly, is that nobody can tell. Not reliably. Not at scale. And the attempt to detect AI writing is causing more harm than the AI writing itself.

How AI Detection Tools Work

Every AI detector works the same way. It measures statistical patterns in text. Specifically, it looks at perplexity and burstiness.

Perplexity measures how predictable the next word is. If you read "The cat sat on the..." the next word is probably "mat." Low perplexity. Predictable. AI text tends to be more predictable because language models choose probable tokens by design.
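Perplexity can be computed against any language model. A toy bigram model makes the idea concrete (the corpus, smoothing, and sentences below are invented purely for illustration):

```python
import math
from collections import Counter

def perplexity(tokens, bigram_prob):
    """Perplexity = exp of the average negative log-probability per token."""
    log_probs = [math.log(bigram_prob(prev, cur))
                 for prev, cur in zip(tokens, tokens[1:])]
    return math.exp(-sum(log_probs) / len(log_probs))

# Toy bigram model "trained" on a tiny corpus.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def bigram_prob(prev, cur, vocab_size=len(unigrams)):
    # Add-one smoothing so unseen bigrams still get nonzero probability.
    return (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)

predictable = "the cat sat on the mat .".split()
surprising = "the mat sat on the dog .".split()
print(perplexity(predictable, bigram_prob))  # lower: the model saw these bigrams
print(perplexity(surprising, bigram_prob))   # higher: unseen transitions
```

A detector applies the same idea with a large language model as the scorer: text the model finds easy to predict scores low, and low scores get labeled "AI."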

Burstiness measures variation in sentence structure. Humans write with uneven rhythm. Short sentences. Then longer, more complex ones that wind through multiple clauses before arriving at a point. AI text tends to be more uniform. Sentences cluster around similar lengths and structures.
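There is no single standard formula for burstiness; one common proxy is the spread of sentence lengths relative to their average (the sample texts are invented for illustration):

```python
import re
import statistics

def burstiness(text):
    """Coefficient of variation of sentence lengths, in words.
    Higher = more uneven rhythm; detectors read low values as AI-like."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.stdev(lengths) / statistics.mean(lengths)

human = ("Short. Then a much longer sentence that winds through several "
         "clauses before arriving anywhere. Done.")
uniform = ("This sentence has exactly seven words here. "
           "That sentence also has seven words total. "
           "Every sentence here has seven words too.")
print(burstiness(human) > burstiness(uniform))  # True: human rhythm varies more
```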

Turnitin, GPTZero, Originality.ai, and the rest all measure these two properties in various combinations. Some add additional signals. But the core approach is the same: statistical pattern matching on text features.

This approach has a fundamental problem that no amount of engineering can fix.

The False Positive Problem

Turnitin claims a false positive rate of less than 1%. This number appears in their marketing materials, their sales pitches to universities, and their published white papers.

The Washington Post tested this claim. They ran human-written text through AI detectors and found a false positive rate of roughly 50%. Half the time, human writing got flagged as AI-generated. Not 1%. Fifty percent.

How can these numbers be so far apart? Turnitin tests on curated datasets under controlled conditions. The real world is messier. Real students write on diverse topics, in diverse styles, with diverse levels of English proficiency. The controlled lab conditions that produce a 1% false positive rate do not hold up when millions of students submit millions of papers on thousands of topics.
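The gap is easier to feel as arithmetic. A back-of-envelope sketch (the one-million figure is hypothetical; the two rates are the ones discussed above):

```python
# What each false positive rate means at scale.
# The submission count is hypothetical; the rates are the claimed vs. observed figures.
human_papers = 1_000_000  # papers written entirely by humans

for label, fpr in [("claimed 1%", 0.01), ("observed ~50%", 0.50)]:
    falsely_flagged = int(human_papers * fpr)
    print(f"{label}: {falsely_flagged:,} human-written papers flagged as AI")
```

Ten thousand false accusations versus half a million. Same tool, same students, two very different realities.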

The gap between claimed and actual performance is not small. It is the difference between a useful tool and a coin flip. Universities are making academic integrity decisions based on technology that performs little better than chance.

Who Gets Flagged

The false positives are not random. They cluster around specific groups of students.

ESL and ELL students get flagged 2 to 5 times more often than native English speakers. One study found that up to 32% of essays by non-native English writers were misclassified as AI-generated. The reason is straightforward. Non-native speakers often write with simpler vocabulary, more predictable sentence structures, and fewer of the idiosyncratic flourishes that detectors interpret as "human." They write correct, clear, grammatical English. And the detector reads that as AI.

Neurodivergent students also get flagged at higher rates. Students with autism, dyslexia, and ADHD often develop distinctive writing patterns that differ from neurotypical norms. Some write with unusual consistency. Others rely heavily on learned structures and templates. These patterns trigger the same statistical signals that detectors associate with AI output.

Think about what this means in practice. The students most likely to be falsely accused of cheating are the ones who already face the most barriers in academic settings. International students navigating a second language. Neurodivergent students who have worked hardest to develop their writing skills. These are the students least equipped to fight a false accusation and most harmed by one.

How Turnitin Hides the Problem

Turnitin knows about the false positive problem. Their solution is not to fix the detection. It is to suppress low-confidence results.

If Turnitin's detector returns a score between 1% and 19%, the system suppresses it. The instructor sees 0%. The actual detection result gets hidden because Turnitin determined that scores in this range are too unreliable to show.
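The rule is simple enough to state in code. This is a sketch of the reported behavior, not Turnitin's actual implementation:

```python
def displayed_score(raw_percent):
    """Reported suppression rule: raw scores from 1-19% are shown as 0%."""
    return 0 if 1 <= raw_percent <= 19 else raw_percent

raw_scores = [0, 5, 12, 19, 20, 45, 80]
print([displayed_score(s) for s in raw_scores])  # [0, 0, 0, 0, 20, 45, 80]
```

An instructor looking at the output cannot distinguish "the detector found nothing" from "the detector found something it does not trust."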

This is a remarkable admission. They are saying that their detector produces results they do not trust in a significant percentage of cases. Rather than show instructors the uncertain results and let them decide, Turnitin quietly zeros them out. The 1% false positive claim gets easier to maintain when you throw away the results that would disprove it.

Vanderbilt University disabled Turnitin's AI detection feature entirely. Curtin University in Australia did the same. These are not small institutions. They evaluated the technology, looked at the evidence, and concluded it was not reliable enough to use.

Students Are Writing Worse on Purpose

A 9th grade English teacher posted on Reddit about a pattern emerging in their classroom. Students were deliberately writing worse to avoid AI detection flags. They added spelling errors on purpose. They removed transitions between paragraphs. They used choppy, disconnected sentences instead of flowing prose.

One student wrote a paragraph in front of the teacher, by hand, on paper. The teacher watched the entire process, typed the paragraph up, and ran it through the detector. It came back 42% AI-generated. A paragraph written by hand, observed start to finish, with zero AI involvement.

The teacher described students as "scared away from trying to improve their writing." Good writing and AI-flagged writing have converged on the same characteristics. Clear structure. Proper grammar. Logical flow. Varied vocabulary used appropriately. These are the things English teachers spend years teaching students to do. They are also the things AI detectors flag.

This is the most perverse outcome imaginable. An educational tool designed to promote academic integrity is teaching students that good writing is suspicious. Students are learning that the safest strategy is mediocrity. Write badly enough and nobody will accuse you of cheating.

This is not a few paranoid students overreacting. It is a rational response to an irrational system. When writing well gets you flagged and writing badly keeps you safe, students will write badly. The incentive structure is clear. And it is backwards.

Why Detection Fundamentally Cannot Work

The AI detection problem is not an engineering problem waiting for a better algorithm. It is a mathematical impossibility.

AI detectors measure statistical properties of text. Good human writing and good AI writing share the same statistical properties. Both are clear. Both are coherent. Both use proper grammar and logical structure. Both employ varied vocabulary and smooth transitions.

As AI models improve, their output becomes more human-like. That is literally the goal of AI research. Every improvement in language models makes the text harder to distinguish from human writing. The detector is chasing a target that moves away from it with every model update.

Consider the endpoint. A perfect AI writer produces text indistinguishable from perfect human writing. At that point, detection is logically impossible. You cannot distinguish two things that are identical. We are not at that endpoint yet, but we are close enough that the statistical signals detectors rely on are already unreliable.

Watermarking has been proposed as an alternative. AI companies could embed invisible statistical patterns in their output that detectors could identify. But watermarking requires cooperation from every AI provider. It can be defeated by paraphrasing. And it does not help with the fundamental problem: you still cannot detect AI-assisted writing where a human wrote the draft and AI helped edit it.
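One watermarking scheme proposed in the research literature works by seeding a "green list" of tokens from the previous token, biasing generation toward it, and having the detector count how many tokens land on their green list. A toy sketch, with an invented ten-word vocabulary and a generator that always takes the bias:

```python
import hashlib
import random

def green_list(prev_token, vocab, fraction=0.5):
    """Deterministically partition the vocabulary based on the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = sorted(vocab)
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])

def green_fraction(tokens, vocab):
    """Detector side: what fraction of tokens fall on their green list?
    Unwatermarked text should hover near the 0.5 expected by chance."""
    hits = sum(cur in green_list(prev, vocab)
               for prev, cur in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)

vocab = ["the", "a", "cat", "dog", "sat", "ran", "on", "mat", "rug", "fast"]
rng = random.Random(0)

# "Generator" that always picks a green token: the watermark bias, exaggerated.
tokens = ["the"]
for _ in range(40):
    tokens.append(rng.choice(sorted(green_list(tokens[-1], vocab))))
print(green_fraction(tokens, vocab))  # 1.0 by construction

# Unwatermarked text picks tokens uniformly; its green fraction stays near chance.
plain = ["the"] + [rng.choice(vocab) for _ in range(40)]
print(green_fraction(plain, vocab))
```

The weakness is visible in the mechanics: swap a handful of tokens for synonyms outside their green lists and the score slides back toward chance, which is exactly what paraphrasing does.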

The better the writing, the more likely it is to be flagged. This is not a bug in AI detection. It is the design. The detectors are measuring quality signals and interpreting them as AI signals. For students using AI writing tools ethically, this creates an impossible situation.

The Institutional Response

Universities are starting to recognize the problem. Curtin University disabled AI detection and shifted to assessment redesign. Instead of trying to catch AI use after the fact, they redesigned assignments to make AI generation less useful. Oral examinations. Process portfolios. In-class writing components. Assignments tied to personal experience that AI cannot fabricate.

This is a better approach. It addresses the actual concern - are students learning? - rather than the proxy question of whether they used a specific tool. A student who uses AI to edit their draft and can then discuss that draft intelligently in an oral exam has demonstrated learning. A student who generated an essay with ChatGPT and cannot discuss it has not. The oral exam catches the second student. The AI detector catches neither reliably.

Vanderbilt, after disabling Turnitin's AI detection, published guidance encouraging faculty to focus on the learning process rather than policing the tools. They recommended more emphasis on drafts, revisions, and iterative feedback. This approach has the added benefit of being better pedagogy regardless of AI.

The schools that figure this out first will produce better-educated graduates. The schools that double down on detection will produce graduates who learned to write badly on purpose.

What This Means for Writers Outside Academia

AI detection is not just an academic problem. Publishers, content platforms, and employers are also using detection tools. Freelance writers report having work rejected by AI detectors despite writing every word themselves. The same false positive problems that affect students affect professional writers.

Some writers have started using AI humanizer tools to make their AI-assisted text pass detection. This is an arms race with no winner. The humanizer makes text less detectable. The detector improves. The humanizer adapts. The detector adapts. Meanwhile, the actual quality of the writing gets worse with each round of obfuscation.

The professional writing world is arriving at the same conclusion as academia: detection does not work and the attempt to make it work creates perverse incentives. The question is not whether someone used AI. The question is whether the output is good and whether the person can stand behind it.

The Real Solution

If detection does not work, what does? The answer is not better detection. It is better tools.

The problem with AI writing is not that AI is involved. It is that AI replaces the human instead of assisting them. When someone pastes a prompt into ChatGPT and submits the output as their own, the human did not write it. When someone writes a draft and then uses AI to improve their grammar, tighten their sentences, and suggest better word choices, the human did write it. The AI was an editor, not an author.

Research supports this distinction. As discussed in our analysis of AI and writing quality, writers who draft first and use AI to edit show higher brain activity than writers working without AI at all. The act of evaluating AI suggestions, accepting some and rejecting others, engages the brain more than writing alone. The order matters. Write first, then use AI.

This is the model that tools like Athens are built around. You write in the editor. You select text you want to improve. The AI suggests edits and shows you exactly what changed in a diff view. You accept or reject each change. The output is genuinely yours. You wrote it. The AI helped you make it better.

This workflow does not need detection because there is nothing to detect. The human is the author. The AI is a tool. The same way spell-check does not make your writing "spell-check-generated," an AI editor does not make your writing AI-generated. The ideas, structure, arguments, and voice are yours. The AI helped you express them more clearly.
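The accept-or-reject diff pattern is easy to picture with Python's standard difflib. Athens' actual implementation is not public; this is only an illustration of the workflow, with invented sample text:

```python
import difflib

draft = [
    "The experiment was ran over three weeks.",
    "Results where inconclusive but promising.",
]
suggestion = [
    "The experiment ran over three weeks.",
    "Results were inconclusive but promising.",
]

# Show each proposed edit so the writer can accept or reject it line by line.
for line in difflib.unified_diff(draft, suggestion,
                                 fromfile="your draft",
                                 tofile="AI suggestion",
                                 lineterm=""):
    print(line)
```

Every `-` line is the writer's original wording and every `+` line is a suggestion awaiting a decision. Nothing enters the document without the author approving it.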

What Should Change

Schools should stop using AI detection tools for disciplinary decisions. The false positive rates are too high, the bias against ESL and neurodivergent students is too clear, and the perverse incentives are too damaging.

Instead, schools should redesign assessments to measure actual learning. Oral examinations, process portfolios, and in-class writing components are all more reliable than algorithmic detection. They also produce better educational outcomes.

Schools should teach students to use AI as an editing tool, not a writing tool. This means teaching the write-first-then-edit workflow. It means showing students how to evaluate AI suggestions critically. It means treating AI literacy as a core skill rather than a threat to be policed.

Publishers and employers should evaluate writing on its quality, not its provenance. If a writer produces clear, accurate, well-structured work, the tools they used to get there are secondary. Professional writers have always used editors, proofreaders, and revision tools. AI is the latest in a long line of writing aids.

The Bottom Line

AI detection does not work. It flags good writing as suspicious. It disproportionately harms ESL students, neurodivergent students, and careful writers. It pushes people to write worse on purpose. And it will only get less accurate as AI models improve.

The solution is not better detection. It is better tools and better processes. Tools where humans write and AI edits. Processes that measure learning rather than policing tool use.

If you are a student worried about false flags, or a writer tired of the detection arms race, the answer is the same: use a tool that keeps you in control of the writing. Athens is built for exactly this. You write. AI edits. The output is yours. No detection needed.