ChatGPT uses "delve" 20x more than human writers. The word spiked 6,697% in PubMed abstracts after ChatGPT's launch. It is the best-documented AI vocabulary tell we have.
The Pattern
AI models reach for certain words far more than humans do. "Delve" is the poster child, but it has company: "utilize," "leverage," "harness," "robust," "comprehensive," "furthermore," and dozens more.
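To make the idea concrete, here is a minimal sketch of a tell-word counter. The watch list is just the seven words named above, and the function name and the per-1,000-tokens scale are assumptions for illustration, not a validated detector. It matches exact word forms only, so "delves" and "delving" are not counted.

```python
import re
from collections import Counter

# Hypothetical watch list: only the words named in this article.
TELL_WORDS = {
    "delve", "utilize", "leverage", "harness",
    "robust", "comprehensive", "furthermore",
}

def tell_word_rate(text: str, per: int = 1000) -> tuple[float, Counter]:
    """Return (tell-word hits per `per` tokens, per-word hit counts)."""
    # Crude tokenizer: lowercase runs of ASCII letters.
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = Counter(t for t in tokens if t in TELL_WORDS)
    if not tokens:
        return 0.0, hits
    return per * sum(hits.values()) / len(tokens), hits

rate, hits = tell_word_rate("We delve into a robust and comprehensive framework.")
print(rate, dict(hits))
```

A high rate is a hint at best. Plenty of human writers use these words, so a count like this only means something when compared against a baseline for the same genre.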
Why "delve"? OpenAI outsourced its RLHF training to workers in Kenya and Nigeria, where the word is standard business English. Those annotators rated outputs containing "delve" as high quality. The model internalized the preference. A regional dialect quirk got laundered into a global language model's idea of professionalism.
FSU researchers Tom Juzek and Zina Ward traced the connection in a December 2024 arXiv paper. They tracked 21 "focal words" across millions of PubMed abstracts and pinpointed the spikes to ChatGPT's November 2022 release date. Not a quirk -- a case study in how annotator demographics warp model behavior.
Examples
The Research
Juzek and Ward (December 2024) analyzed millions of PubMed abstracts and found 21 "focal words" that spiked sharply after ChatGPT's November 2022 release. "Delve" had the most dramatic increase. The spike doesn't just correlate with AI use -- it correlates with who trained the AI. Nigerian English uses "delve" at far higher rates than American or British English, and OpenAI's annotation workforce was heavily Nigerian.
Kobak et al. (Science Advances, 2025) went bigger -- 15 million PubMed abstracts, 379 "excess style words." At least 13.5% of abstracts published in 2024 showed signs of AI processing. The contaminated vocabulary goes well beyond "delve": "underscores," "aligns," "realm," "showcasing," "facilitates."
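The before/after comparison behind both studies can be sketched roughly as follows. This is a simplified illustration, not either paper's actual method: the `(year, text)` input format, the function names, and the flat mean baseline are all assumptions made here for brevity.

```python
import re
from collections import defaultdict

def yearly_doc_frequency(abstracts, word):
    """Fraction of abstracts per year that contain `word` at least once.

    `abstracts` is an iterable of (year, text) pairs (hypothetical format).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, text in abstracts:
        totals[year] += 1
        if word in re.findall(r"[a-z]+", text.lower()):
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in totals}

def excess(freq_by_year, baseline_years, target_year):
    """Compare the target year against the mean of the baseline years.

    Returns (gap, ratio) where gap = p - q and ratio = p / q,
    with p the target-year frequency and q the baseline mean.
    """
    q = sum(freq_by_year[y] for y in baseline_years) / len(baseline_years)
    p = freq_by_year[target_year]
    ratio = p / q if q > 0 else float("inf")
    return p - q, ratio
```

Reporting both the gap and the ratio is a deliberate choice: a rare word can show a huge ratio from a handful of abstracts, while a common word can show a large gap with a modest ratio.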
RLHF works by having human annotators rate model outputs, then optimizing for higher-rated text. If your annotators share a linguistic background, their dialect preferences get amplified into the model's voice. The model has no idea it's absorbing Nigerian English conventions. It just knows "delve" gets rewarded.
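A toy calculation shows how a modest annotator preference can turn into a large shift in output. This sketch assumes a Bradley-Terry model for pairwise ratings and the standard closed form for KL-regularized reward optimization, where the new policy is proportional to the reference policy times exp(reward / beta). Every number below is hypothetical, and the setup reduces the model to a choice between two phrasings; it does not describe OpenAI's actual pipeline.

```python
import math

def bradley_terry_reward_gap(win_rate: float) -> float:
    """Reward difference implied by a pairwise win rate (Bradley-Terry)."""
    return math.log(win_rate / (1.0 - win_rate))

def tilted_probability(base_prob: float, reward_gap: float, beta: float) -> float:
    """Probability of the preferred phrasing after reward optimization.

    Two-option case of pi_new(y) ~ pi_ref(y) * exp(r(y) / beta),
    with the other option's reward fixed at 0.
    """
    boosted = base_prob * math.exp(reward_gap / beta)
    return boosted / (boosted + (1.0 - base_prob))

# Hypothetical inputs: the base model picks "delve" 5% of the time,
# annotators prefer the "delve" variant in 70% of comparisons,
# and beta (the KL penalty strength) is 0.5.
gap = bradley_terry_reward_gap(0.70)
print(round(tilted_probability(0.05, gap, beta=0.5), 4))
```

With these made-up inputs the "delve" phrasing goes from 5% to roughly 22% of outputs. A smaller beta, meaning a weaker pull back toward the reference model, amplifies the shift further.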
Caught in the Wild
Thousands of published scientific abstracts now contain "delve" and its cohort. Not isolated incidents -- the FSU paper documented field-wide vocabulary distribution shifts starting in Q1 2023, visible across entire journals.
Kobak et al. study →
An Elsevier paper in Surfaces and Interfaces shipped with the literal ChatGPT response prefix as its opening line. Unedited AI text, straight through peer review. Still published. Still unretracted.
Technology Networks →
Wikipedia editors watched "delve" colonize thousands of new and edited articles. The WikiProject AI Cleanup team, founded in late 2023, now tracks vocabulary tells as a front-line detection signal. Their findings fed directly into Wikipedia's March 2026 ban on AI-generated article content, which passed 44-2.
TechCrunch →
Sources