Delve and Vocab Tics

ChatGPT uses "delve" 20x more than human writers. The word spiked 6,697% in PubMed abstracts after ChatGPT's launch. The most data-backed AI vocabulary tell we have.

AI models reach for certain words far more than humans do. "Delve" is the poster child, but it has company: "utilize," "leverage," "harness," "robust," "comprehensive," "furthermore," and dozens more.

Why "delve"? OpenAI outsourced its RLHF training to workers in Kenya and Nigeria, where the word is standard business English. Those annotators rated outputs containing "delve" as high quality. The model internalized the preference. A regional dialect quirk got laundered into a global language model's idea of professionalism.

FSU researchers Tom Juzek and Zina Ward traced the connection in a December 2024 arXiv paper. They tracked 21 "focal words" across millions of PubMed abstracts and dated the spikes to ChatGPT's November 2022 release. Not a quirk -- a case study in how annotator demographics warp model behavior.

Academic abstract: This study delves into the impacts of maintaining mean arterial blood pressure on patient outcomes in a robust clinical framework.
Blog post: Let's delve into the intricacies of modern software architecture and explore how organizations can harness these powerful paradigms.
LinkedIn post: I wanted to delve deeper into this topic. After leveraging AI tools and utilizing comprehensive frameworks, here's what I discovered about robust leadership strategies.
Stack of tells: Furthermore, it is crucial to delve into the multifaceted nature of this pivotal development, which underscores the intricate interplay between innovation and meticulous execution.
6,697%: increase in "delve" in PubMed abstracts, 2020-2024
20x: ChatGPT-3.5's use of "delve" relative to the human baseline (19.46 vs. 0.98 occurrences per million words)
13.5%: share of 2024 PubMed abstracts estimated to be AI-processed (Kobak et al.)
379: excess style words identified across 15M PubMed abstracts
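The arithmetic behind these figures is easy to misread, so here is a small sketch of it. It assumes "opm" means occurrences per million words; the function names and the example corpus size are hypothetical, chosen only to illustrate the normalization.

```python
def occurrences_per_million(count: int, total_words: int) -> float:
    """Normalize a raw count to occurrences per million words (opm)."""
    return count / total_words * 1_000_000


def percent_increase_to_multiplier(percent: float) -> float:
    """Convert a percent increase into a plain multiplier of the baseline."""
    return 1 + percent / 100


# Rates quoted in this article.
chatgpt_opm = 19.46
human_opm = 0.98
ratio = chatgpt_opm / human_opm  # a little under 20, hence "20x"

# A 6,697% increase means roughly 68 times the baseline, not 6,697 times.
multiplier = percent_increase_to_multiplier(6697)

# Hypothetical corpus: 1,946 hits in 100 million words gives 19.46 opm.
example_opm = occurrences_per_million(1_946, 100_000_000)

print(f"ratio: {ratio:.1f}x, multiplier: {multiplier:.2f}x, example: {example_opm:.2f} opm")
```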

The FSU "Delve" Paper

Juzek and Ward (December 2024) analyzed millions of PubMed abstracts and found 21 "focal words" that spiked sharply after ChatGPT's November 2022 release. "Delve" had the most dramatic increase. The spike doesn't just correlate with AI use -- it correlates with who trained the AI. Nigerian English uses "delve" at far higher rates than American or British English, and OpenAI's annotation workforce was heavily Nigerian.

The Kobak Scale Study

Kobak et al. (Science Advances, 2025) went bigger -- 15 million PubMed abstracts, 379 "excess style words." By 2024, at least 13.5% of all PubMed abstracts showed signs of AI processing. The contaminated vocabulary goes well beyond "delve": "underscores," "aligns," "realm," "showcasing," "facilitates."
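The general idea of an "excess" word can be sketched as follows: project a word's pre-ChatGPT usage trend forward, then compare the projection with what was actually observed. This is a simplified illustration and does not reproduce Kobak et al.'s exact procedure. The linear extrapolation, the `min_ratio` threshold, and every frequency in `HYPOTHETICAL_FREQS` are assumptions made up for the example.

```python
def expected_frequency(freq_2021: float, freq_2022: float, years_ahead: int = 2) -> float:
    """Project a pre-ChatGPT trend forward (assumed linear, never decreasing)."""
    trend = max(freq_2022 - freq_2021, 0.0)
    return freq_2022 + years_ahead * trend


def excess_usage(observed: float, expected: float) -> tuple[float, float]:
    """Return the (gap, ratio) between observed and expected frequency."""
    gap = observed - expected
    ratio = observed / expected if expected > 0 else float("inf")
    return gap, ratio


# Made-up fractions of abstracts containing each word: (2021, 2022, 2024).
HYPOTHETICAL_FREQS = {
    "delve": (0.0002, 0.0002, 0.0050),
    "realm": (0.0010, 0.0011, 0.0030),
    "patient": (0.3000, 0.3010, 0.3020),
}


def find_excess_words(freqs: dict, min_ratio: float = 2.0) -> list[str]:
    """Flag words whose observed 2024 usage far exceeds the projected trend."""
    flagged = []
    for word, (f2021, f2022, f2024) in freqs.items():
        _, ratio = excess_usage(f2024, expected_frequency(f2021, f2022))
        if ratio >= min_ratio:
            flagged.append(word)
    return flagged


print(find_excess_words(HYPOTHETICAL_FREQS))
```

With these invented numbers, the style words are flagged and the content word "patient" is not, because its usage stays on trend.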

Why RLHF Creates Vocabulary Tells

RLHF works by having human annotators rate model outputs, then optimizing for higher-rated text. If your annotators share a linguistic background, their dialect preferences get amplified into the model's voice. The model has no idea it's absorbing Nigerian English conventions. It just knows "delve" gets rewarded.
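The amplification loop described above can be illustrated with a toy model. This is a minimal sketch, not how production RLHF works: real pipelines train a separate reward model and optimize with algorithms such as PPO under a KL penalty, all of which is omitted here. The sketch treats word choice as a three-armed bandit with a softmax policy and applies the exact expected policy-gradient update. The starting probabilities and approval rates are hypothetical.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Convert logits to probabilities, shifted by the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def amplify(initial_probs, approval_rates, steps=2000, learning_rate=1.0):
    """Run gradient ascent on expected approval for a softmax word-choice policy."""
    logits = [math.log(p) for p in initial_probs]
    for _ in range(steps):
        probs = softmax(logits)
        expected_reward = sum(p * r for p, r in zip(probs, approval_rates))
        # Exact gradient of expected reward with respect to each logit:
        # d J / d logit_i = p_i * (r_i - J)
        logits = [
            logit + learning_rate * p * (r - expected_reward)
            for logit, p, r in zip(logits, probs, approval_rates)
        ]
    return softmax(logits)


WORDS = ["delve", "explore", "examine"]
INITIAL_PROBS = [0.05, 0.50, 0.45]   # hypothetical: "delve" starts out rare
APPROVAL_RATES = [0.60, 0.50, 0.50]  # hypothetical: raters slightly favor "delve"

final_probs = amplify(INITIAL_PROBS, APPROVAL_RATES)
for word, before, after in zip(WORDS, INITIAL_PROBS, final_probs):
    print(f"{word}: {before:.2f} -> {after:.2f}")
```

The point of the toy is that a modest rater preference, applied consistently over many updates, is enough to turn a rare word into the policy's dominant choice.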

PubMed Abstracts (2023-2024)

Thousands of published scientific abstracts now contain "delve" and its cohort. Not isolated incidents -- the FSU paper documented field-wide vocabulary distribution shifts starting in Q1 2023, visible across entire journals.

Source: Kobak et al. study

"Certainly, here is a possible introduction" Paper

An Elsevier paper in Surfaces and Interfaces shipped with the literal ChatGPT response prefix as the opening line of its introduction. Unedited AI text, straight through peer review. Still published. Still unretracted.

Source: Technology Networks

Wikipedia Vocabulary Shifts

Wikipedia editors watched "delve" colonize thousands of new and edited articles. The WikiProject AI Cleanup team, founded in late 2023, now tracks vocabulary tells as a front-line detection signal. Their findings fed directly into Wikipedia's March 2026 ban on AI-generated article content, which passed 44-2.

Source: TechCrunch