Compulsive qualifying before every claim: "It's important to note that", "Generally speaking", "While this may vary", "It's worth noting". A direct artifact of RLHF safety training: models are rewarded for caution, producing text that hedges even when the claim is uncontroversial.
The Pattern
Ask a model whether exercise is good for you and it won't just say yes. It will say "Exercise is generally considered to be beneficial for most people, though individual circumstances may vary and it's important to consult with a healthcare professional." Five hedges on a claim no one disputes.
RLHF safety training did this. Models get penalized for confident claims that turn out wrong, so they learn to pre-qualify everything, even the obvious stuff. "It's important to note that" and "It's worth noting" now appear at such inflated rates that they've become detection signals on their own.
Strip the hedges from a typical AI paragraph and you lose 20-30% of the word count. You lose zero information.
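That claim is easy to check mechanically. A rough sketch in Python: delete a handful of hedge phrases and compare word counts. The HEDGES list is an illustrative sample, not a complete inventory.

```python
import re

# Illustrative hedge phrases; a real list would be longer and corpus-derived.
HEDGES = [
    r"it'?s important to note that\s*",
    r"it'?s worth noting that\s*",
    r"generally speaking,?\s*",
    r"generally considered to be\s*",
    r"though individual circumstances may vary\W*",
]

def strip_hedges(text: str) -> str:
    """Delete each hedge phrase; whatever survives is the actual claim."""
    for pattern in HEDGES:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return text.strip()

before = ("It's important to note that exercise is generally considered to be "
          "beneficial for most people, though individual circumstances may vary.")
after = strip_hedges(before)
saved = 1 - len(after.split()) / len(before.split())
print(after)
print(f"{saved:.0%} of the words were hedging")
```

On the exercise answer from above, the hedges account for well over half the words, more than the 20-30% typical of a whole paragraph, because that sentence is nearly all hedge.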
Wikipedia's "Signs of AI writing" guide flags hedging language. tropes.fyi catalogs it under its own entry: "It's Worth Noting." The Augmented Educator put it on their list of ten telltale AI tells and traced it directly back to safety training.
The worst case: Vanderbilt's DEI office sent a ChatGPT-written condolence email after the Michigan State shooting in 2023. The model hedged its way through expressions of grief. It read like a legal disclaimer wearing a sympathy card as a hat.
The Research
The Augmented Educator's "10 Telltale Signs of AI-Generated Text" spells out the mechanism. During RLHF, models get penalized for overconfident claims that could be wrong or politically charged. So they learn to preface everything with qualifiers. The training never teaches them which claims actually need hedging and which ones don't. "The sky is blue" gets the same cautious treatment as a contested medical claim.
Wikipedia's "Signs of AI writing" guide calls out "It's worth noting" and its cousins as vocabulary tells. They show up at elevated rates in AI-generated article submissions, enough that editors now treat them as a first-pass filter.
tropes.fyi gave it its own entry: "It's Worth Noting." Their framing is blunt. The hedge adds zero information. It's filler, and it's a signal.
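The "first-pass filter" editors describe can be sketched in a few lines: count hedge phrases per 1,000 words and flag anything above a threshold. The phrase list and the threshold of 5 here are illustrative assumptions, not calibrated values.

```python
# Toy first-pass filter: hedge phrases per 1,000 words, flag above threshold.
HEDGE_PHRASES = [
    "it is worth noting",
    "it's worth noting",
    "it is important to note",
    "it's important to note",
    "it is generally considered",
    "generally speaking",
]

def hedge_rate(text: str) -> float:
    """Hedge phrases per 1,000 words."""
    lowered = text.lower()
    hits = sum(lowered.count(phrase) for phrase in HEDGE_PHRASES)
    return 1000 * hits / max(len(text.split()), 1)

def looks_hedged(text: str, threshold: float = 5.0) -> bool:
    return hedge_rate(text) >= threshold

sample = ("It's worth noting that exercise is generally considered "
          "beneficial. Generally speaking, results may vary.")
print(f"{hedge_rate(sample):.1f} hedges per 1,000 words")
print("flag for review" if looks_hedged(sample) else "looks clean")
```

Crude, but that's the point: the phrases are formulaic enough that plain substring counting catches them.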
The PNAS study on LLM writing found that instruction tuning increases qualifying language compared to both base models and human writing. The "helpful, harmless, honest" training objective rewards hedging as a safety mechanism. It works for safety. It's terrible for prose.
Caught in the Wild
Vanderbilt's DEI office used ChatGPT to write a condolence email about the Michigan State shooting. The hedging and generic qualifiers were bad enough. Then people noticed the attribution line at the bottom: "Paraphrase from OpenAI's ChatGPT." One commenter nailed it: "sick and twisted irony to making a computer write your message about community and togetherness."
AI-generated academic text mimics scholarly caution, but badly. Real hedging is specific: "these results should be interpreted in light of the sample size limitation." AI hedging is generic: "it's important to note that results may vary." One is information. The other is noise.
Wikipedia editors flag hedging language as a routine AI tell. AI-generated articles shove in "It is generally considered" and "It is worth noting that" where the style guide demands direct factual statements. The hedges aren't just bad writing; they violate Wikipedia's own voice.