A Russian open-source developer’s attempt to conceal AI-generated writing with his own 'humanising' tool reveals the limits of phrase-level rewriting against document-level detection systems, highlighting stylistic patterns unique to Russian-language text.
A Russian open-source developer says his own tool for "humanising" AI-written text failed on a Habr article about AI-style writing, with the platform's automated moderation rejecting the post on 27 April as likely machine-generated. The episode has highlighted a larger problem for Russian-language publishing: tools that smooth out individual phrases can still miss the document-level patterns that automated classifiers are trained to spot.
The author said he had submitted a piece about Russian AI writing patterns after first running a draft through his own humaniser, a skill for Claude Code and OpenCode called humanizer-ru. He later rewrote one awkward repetition he had spotted himself, then noted the change in a postscript. Days later, Habr's auto-moderator responded that the publication could not pass moderation because most of the text was highly likely to have been created with a generative AI model.
That irony became the starting point for a broader audit of what the tool actually does and does not do. The developer says humanizer-ru was built to catch Russian bureaucratic prose, heavy nominalisations, repetitive genitive chains, overuse of the formal copula "является" ("is"), stale corporate phrasing and literal calques from English. He says only a small minority of rules from an earlier English-language humaniser transferred cleanly to Russian, forcing most of the logic to be rewritten for Russian morphology and style.
The article also argues that Russian AI markers differ in important ways from English ones. It points to administrative phrasing built around nouns rather than verbs, long genitive chains, repetitive use of "является" as a crutch, bureaucratic pronouns such as "данный" ("the given") and "указанный" ("the aforementioned"), and phrase-level translations of English idioms that sound unnatural in Russian. In the author's view, these are not merely stylistic quirks but statistical signals that can expose machine-generated text.
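To make those markers concrete, checks of this kind can be approximated with simple pattern rules. The sketch below is a minimal illustration in Python, assuming crude regex heuristics; it does not reproduce humanizer-ru's actual rule set, which the source does not publish.

```python
import re

# Illustrative heuristics only; humanizer-ru's real rules are not reproduced here.
BUREAUCRATIC_MARKERS = {
    "является (copula 'is')": r"\bявляется\b",
    "данный ('the given')": r"\bданн(?:ый|ая|ое|ые|ого|ой|ых|ым|ыми|ую)\b",
    "указанный ('the aforementioned')": r"\bуказанн(?:ый|ая|ое|ые|ого|ой|ых|ым|ыми|ую)\b",
}

# Crude genitive-chain heuristic: three or more consecutive words with
# common genitive-looking endings, e.g. "повышения эффективности использования".
GENITIVE_CHAIN = re.compile(r"(?:\b\w+(?:ия|ии|ого|его|ции|ости)\b[\s,]+){3,}")

def marker_rates(text: str) -> dict[str, float]:
    """Return rough hits per 1,000 words for each marker class."""
    words = max(len(text.split()), 1)
    rates = {
        name: 1000 * len(re.findall(pattern, text, re.IGNORECASE)) / words
        for name, pattern in BUREAUCRATIC_MARKERS.items()
    }
    rates["genitive chains"] = 1000 * len(GENITIVE_CHAIN.findall(text)) / words
    return rates
```

Real morphology-aware rules would be far more involved, but even frequency counts of this sort hint at why such patterns register as statistical signals rather than one-off quirks.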
After the ban, he says he analysed his own rejected post using five simple metrics that any classifier could measure. The piece, he says, scored as machine-like on measures including a highly regular listicle structure, heavy use of em dashes, a large share of quoted AI-style examples, and paragraph lengths that barely varied. In other words, even after phrase-level cleanup, the document still looked like a template-heavy AI essay rather than a human draft with irregular rhythm and digressions.
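The author does not publish his exact formulas, but document-shape signals of this sort are straightforward to compute. The sketch below shows four hypothetical measures matching his description: paragraph-length uniformity, em-dash density, the share of quoted material and listicle regularity; the fifth metric is not specified in the source.

```python
import re
import statistics

def document_shape(text: str) -> dict[str, float]:
    """Hypothetical document-level signals; not the author's exact metrics."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    lengths = [len(p.split()) for p in paragraphs] or [0]
    words = max(sum(lengths), 1)
    quoted = re.findall(r'[«"“]([^»"”]+)[»"”]', text)
    return {
        # Coefficient of variation of paragraph length: low values mean
        # suspiciously uniform paragraph sizes.
        "paragraph_length_cv": statistics.pstdev(lengths)
                               / max(statistics.mean(lengths), 1),
        # Em dashes (U+2014) per 1,000 words.
        "em_dash_rate": 1000 * text.count("\u2014") / words,
        # Share of all words that sit inside quotation marks.
        "quoted_share": sum(len(q.split()) for q in quoted) / words,
        # Fraction of paragraphs that open like a numbered or bulleted item.
        "listicle_share": sum(bool(re.match(r"\s*(?:\d+[.)]|[-*•])\s", p))
                              for p in paragraphs) / max(len(paragraphs), 1),
    }
```

On this account, a post can pass every phrase-level check and still score as uniform on measures like these, which is exactly the gap the author says his rejected article fell into.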
That distinction matters because, as the author frames it, most humaniser tools work at sentence level while moderation systems judge the shape of the whole document. Local substitutions can make phrases sound more natural, but they do little if the article still has repeating section structures, uniformly sized paragraphs and a predictable information cadence. The piece argues that some subjects, especially explainers about AI patterns themselves, are structurally difficult to disguise because they must include examples of the very language a detector is looking for.
He says the episode led to a second version of the tool, v0.2, built around three layers: phrase-level rewriting, a document-level audit mode and a stored "voice passport" designed to preserve a writer's individual style across sessions. The new release also adds genre presets for places such as Habr, Telegram, email, vc.ru, LinkedIn and technical documentation. The developer presents that as a shift away from producing a generic "human" voice towards helping a tool adapt to a specific author and publication context.
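The article gives no implementation detail for v0.2, so the following is a hypothetical skeleton only: stand-in data shapes for a voice passport and genre presets, with stub functions marking where each of the three described layers would run.

```python
from dataclasses import dataclass

# Hypothetical shapes and a skeletal pipeline; v0.2's real formats are not public.

@dataclass
class VoicePassport:
    """Per-author style profile, stored and reused across sessions."""
    author: str
    avg_sentence_len: float           # target rhythm, in words
    favourite_connectives: list[str]  # turns of phrase the author actually uses
    banned_phrases: list[str]         # clichés this author never uses

@dataclass
class GenrePreset:
    """Per-platform constraints (Habr, Telegram, email, vc.ru, ...)."""
    name: str
    max_paragraph_words: int
    allow_emoji: bool

PRESETS = {
    "habr": GenrePreset("habr", 120, False),
    "telegram": GenrePreset("telegram", 60, True),
}

def rewrite_phrases(text: str) -> str:
    return text  # layer 1 stub: local phrase-level substitutions would go here

def audit_document(text: str, preset: GenrePreset) -> list[str]:
    # Layer 2 stub: whole-document shape checks against the preset.
    return ["paragraph_too_long"
            for p in text.split("\n\n")
            if len(p.split()) > preset.max_paragraph_words]

def apply_voice(text: str, passport: VoicePassport) -> str:
    # Layer 3 stub: strip phrases the stored profile says the author avoids.
    for phrase in passport.banned_phrases:
        text = text.replace(phrase, "")
    return text

def humanise(text: str, passport: VoicePassport, preset: GenrePreset) -> str:
    """Run the three layers in order."""
    text = rewrite_phrases(text)
    _flags = audit_document(text, preset)  # would feed a revision loop
    return apply_voice(text, passport)
```

Separating the audit from the rewrite mirrors the lesson of the ban: local substitutions and whole-document shape are different problems and need different passes.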
The broader backdrop is a Russian-language AI ecosystem that is becoming more contested. The RuATD 2022 shared task on detecting machine-generated Russian text showed that Russian automatic text detection is an active research field, while other studies have found that people are often poor at recognising AI-generated self-presentations in professional and social contexts. At the same time, Russian regulators have been moving towards tighter control of foreign AI systems, and reporting in 2025 and 2026 has described both Habr's own hard line on AI-assisted posts and broader efforts to shape what kinds of machine-generated language are acceptable online.
Source: Noah Wire Services