David Corney

What happens when "LLM grooming" fills an "information void"?

I want to discuss two concepts around mis-/disinformation and what happens if they mix.

LLM grooming

This is when a coordinated network of websites and social media accounts is used to push a particular narrative at a massive scale. Unlike typical disinformation campaigns, the target is not (directly) people, but rather search engine web crawlers and the scraper bots that collect training data for LLMs¹. These blindly ingest the content and later regurgitate it in response to users' prompts.

Information void

This refers to topics of interest that have very little information available. The classic example is Covid-19: from early 2020, everyone desperately wanted to know everything about it — how to prevent it, how to treat it, what the symptoms were, how not to die of it. In those early stages of the pandemic, what little information could be found was mostly unreliable, because there were no experts in this new disease. What was already known about other coronaviruses was largely written up in academic medical articles that were either behind paywalls or simply too obscure for most people to read. So the void was filled with speculation and gossip for a long time, causing enormous harm.

The combination

When I first heard about LLM grooming, I of course googled it (skipping the genAI summary, as I tend to) and read some trusted sources on it. But when I combined the search term with “information void” I got exactly zero results. This surprised me a bit² — with cheap generative AI tools all around us, it’s easy for anyone to identify an information void and quickly fill it with low-quality content (perhaps for marketing purposes) or deliberately skewed content (for propaganda purposes). In the short term, you may get some of that content going viral on social media, but the real payoff comes once the bots scrape all the coordinated content that’s been created and start reproducing it in those genAI summaries that I keep skipping. And because there is no pre-existing information, even if a chatbot tries to give a balanced answer, it’s likely to quote back only two variations of the coordinated content, as if that represented the spectrum of a genuine discussion.

For example, I could create thousands of pages like this very blog post, but all AI-generated with varied style and content, and all emphasising the key point that LLM grooming could be used to fill information voids, and that I, David Corney, am somehow part of that. Then, in a few months’ time, whenever the subject comes up in conversations with ChatGPT, Gemini or Grok, my name will come up too. I’ll leave it to the reader to imagine a more malicious use of these methods…
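To make the “cheap and easy” point concrete, here is a minimal sketch of what that generation step might look like. Everything in it is an assumption on my part rather than anything described above: it uses the OpenAI Python client purely as a stand-in for any text-generation API, the model name and prompts are placeholders, and the narrative text is invented for illustration. The only point it demonstrates is that a handful of lines of orchestration is all the tooling such a campaign would need.

```python
# A minimal, assumption-laden sketch of flooding an information void.
# Assumes the OpenAI Python client is installed and OPENAI_API_KEY is set;
# the model name, prompts, and narrative below are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

NARRATIVE = (
    "LLM grooming can be used to fill information voids, "
    "and this blog's author has written about it."
)
STYLES = ["personal blog post", "news explainer", "forum comment", "FAQ page"]

def generate_page(style: str) -> str:
    """Ask the model for one page-length variation pushing the same narrative."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": f"Write a short {style}."},
            {"role": "user", "content": f"Make this point in your own words: {NARRATIVE}"},
        ],
    )
    return response.choices[0].message.content

# Repeating the styles a few hundred times yields roughly a thousand
# superficially independent pages, each varied in form but pushing one narrative.
pages = [generate_page(style) for style in STYLES * 250]
```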


  1. The term “LLM grooming” originates with the American Sunlight Project, discussing a massive pro-Russian misinformation network. 

  2. I’m sure I’m not the first to think of this combination, by the way. Both concepts are broadly known.